When emitting the ICMD bundle, wait on the bottom half (bit 3 of the
GR_STATUS register) instead of upper half (bit 2) to make sure methods
are effectively emitted.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Kicking channels is part of their deactivation process. Maxwell chips
are particularly sensitive to this, and can start fetching the previous
pushbuffer of a recycled channel if this is not done.
While we are at it, improve the channel preemption code to only wait for
bit 20 of 0x002634 to turn to 0, as it is the bit indicating a
preempt is pending.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Increase clock timeout for SYS, FPB and GPC in order to avoid operation
failure at high gpcclk rate.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The lack of IOMMU API support can make nouveau_platform_probe_iommu()
fail to compile because struct iommu_ops is then empty. Fix this by
skipping IOMMU probe in that case - lack of IOMMU on platform devices
is sub-optimal, but is not an error.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The memory allocated for a nouveau_cli object in nouveau_cli_create() is
never freed. Free the memory in nouveau_cli_destroy() to plug this leak.
kmemleak recorded this after running a couple of nouveau test programs.
Note that kmemleak points at drm_open_helper() because for some reason
it thinks that skipping the first two stack frames is a good idea.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Use kvfree() instead of open-coding it.
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On the MacBook Pro, power of the gpu is cut by a gmux chip. Sometimes
the gpu gets stuck in powersaving mode and refuses to wake up
("Refused to change power state, currently in D3"). Inserting a
delay between setting the gpu to D3hot and cutting the power seems
to help (most of the time). This issue and its (partial) remediation
by the patch was observed with an Nvidia GT650M (NVE7 / GK107).
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
One more drm-misch pull for 4.1 with mostly simple stuff and boring
refactoring. Even the cursor fix from Matt is just to make a really anal
igt happy.
* tag 'topic/drm-misc-2015-04-15' of git://anongit.freedesktop.org/drm-intel:
drm: fix trivial typo mistake
drm: Make integer overflow checking cover universal cursor updates (v2)
drm: make crtc/encoder/connector/plane helper_private a const pointer
drm/armada: constify struct drm_encoder_helper_funcs pointer
drm/radeon: constify more struct drm_*_helper funcs pointers
drm/edid: add #defines for ELD versions
drm/atomic: Add for_each_{connector,crtc,plane}_in_state helper macros
drm: Use kref_put_mutex in drm_gem_object_unreference_unlocked
drm/drm: constify all struct drm_*_helper funcs pointers
drm/qxl: constify all struct drm_*_helper funcs pointers
drm/nouveau: constify all struct drm_*_helper funcs pointers
drm/radeon: constify all struct drm_*_helper funcs pointers
drm/gma500: constify all struct drm_*_helper funcs pointers
drm/mgag200: constify all struct drm_*_helper funcs pointers
drm/exynos: constify all struct drm_*_helper funcs pointers
drm: Fix some typos
nvbios_extend() returns 1 to indicate "extended the array" and 0 to
indicate the array is already big enough. This is used by the core
shadowing code to prevent re-fetching chunks of the image that have
already been shadowed.
The ACPI fetching code may possibly need to extend this further due
to requiring fetches to happen in 4KiB chunks.
Under certain circumstances (that happen if the total image size is
a multiple of 4KiB), the memory allocated to store the shadow will
already be big enough, causing the ACPI code's nvbios_extend() call
to return 0, which is misinterpreted as a failure.
The fix is simple, accept >= 0 as a successful condition here. The
core will have already made sure that we're not re-fetching data we
already have.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89047
v2 (Ben Skeggs):
- dropped hunk which would cause unnecessary re-fetching
- more descriptive explanation
Signed-off-by: Jan Vesely <jano.vesely@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Uncertain whether the GPC pack change is due to a newer driver version,
or a legitimate difference from GM204. My GM204 has broken vram, so
can't currently try a newer binary driver on it to confirm.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Under certain circumstances the trapped address will contain subc 7,
which GK104 GR doesn't have anymore.
Notice this case to avoid causing additional priv ring faults.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
No idea if "3" is a constant or derived from something else, but the
value is unchanged in the limited traces of gm107/gm204 I have here.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Make static a few functions and structures that should be.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
A "return 0" found its way in the middle of the error path of
nouveau_platform_probe(), remove it as it will make the kernel crash if
we try to unload the module afterwards.
While we are at it, also remove the IOMMU domain if it has been created,
as we should.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nvkm_mm_fini() was not called when exiting the driver, resulting in a
memory leak. Fix this.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
On some of these chipsets, reading NV_PGRAPH_GPC_GPM_PD_PES_TPC_ID_MASK
can trigger a PRI fault and return an error code instead of a TPC mask,
unless PGOB has been disabled first.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Before we moved gk110's implementation of this to pmu, the functions were
identical. This commit just switches GK208 to use the new (more complete)
implementation of the power-up sequence.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Turns out the PTHERM part of this dance is bracketed by the same PMU
fiddling that occurs on GK104/6, let's assume it's also PGOB.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
If a memory allocation fails when using the DMA allocator,
gk20a_instobj_dtor_dma() will be called on the failed instmem object.
At this time, node->handle might not be NULL despite the call to
dma_alloc_attrs() having failed. node->cpuaddr is the right member to
check for such a failure, so use it instead.
Reported-by: Vince Hsu <vinceh@nvidia.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
User-space use mappable BOs notably for fences, and expects that a
value update by the GPU will be immediatly visible through the
user-space mapping.
ARM has a property that may prevent this from happening though: memory
can be mapped multiple times only if the different mappings share the
same caching properties. However all the lowmem memory is already
identity-mapped into the kernel with cache enabled, so when user-space
requests an uncached mapping, we actually get an "undefined caching
policy" one and this has strange side-effects described on Freedesktop
bug 86690.
To prevent this from happening, allow user-space to explicitly specify
which objects should be coherent, and create such objects with the
TTM_PL_FLAG_UNCACHED flag. This will make TTM allocate memory using the
DMA API, which will fix the identify mapping and allow us to safely map
the objects to user-space uncached.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Let GK20A's instmem take advantage of the IOMMU if it is present. Having
an IOMMU means that instmem is no longer allocated using the DMA API,
but instead obtained through page_alloc and made contiguous to the GPU
by IOMMU mappings.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Tegra SoCs have an IOMMU that can be used to present non-contiguous
physical memory as contiguous to the GPU and maximize the use of large
pages in the GPU MMU, leading to performance gains. This patch adds
support for probing such a IOMMU if present and make its properties
available in the nouveau_platform_gpu structure so subsystems can take
advantage of it.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
instmem for GK20A is allocated using dma_alloc_coherent(), which
provides us with a coherent CPU mapping that we never use because
instmem objects are accessed through PRAMIN. Switch to
dma_alloc_attrs() which gives us the option to dismiss that CPU mapping
and free up some CPU virtual space.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Now that Nouveau can operate even when there is no RAM device, remove
the dummy one used by GK20A.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
GK20A does not have dedicated RAM, thus having a RAM device for it does
not make sense. Move the contiguous physical memory allocation to
instmem.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Having a RAM device does not make sense for chips like GK20A which have
no dedicated video memory. The dummy RAM device that we used so far
works as a temporary band-aid, but in the longer term it is desirable
for the driver to be able to work without any kind of VRAM.
This patch adds a few conditionals in places where a RAM device was
assumed to be present and allows some more objects to be allocated from
the TT domain, allowing Nouveau to handle GPUs for which
pfb->ram == NULL.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>