Allocate a page for use as a status page by a group of timelines, as we
only need a dword of storage for each (rounded up to the cacheline for
safety) we can pack multiple timelines into the same page. Each timeline
will then be able to track its own HW seqno.
v2: Reuse the common per-engine HWSP for the solitary ringbuffer
timeline, so that we do not have to emit (using per-gen specialised
vfuncs) the breadcrumb into the distinct timeline HWSP and instead can
keep on using the common MI_STORE_DWORD_INDEX. However, to maintain the
sleight-of-hand for the global/per-context seqno switchover, we will
store both temporarily (and so use a custom offset for the shared timeline
HWSP until the switch over).
v3: Keep things simple and allocate a page for each timeline, page
sharing comes next.
v4: I was caught repeating the same MI_STORE_DWORD_IMM over and over
again in selftests.
v5: And caught red handed copying create timeline + check.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-3-chris@chris-wilson.co.uk
Currently, the list of timelines is serialised by the struct_mutex, but
to alleviate difficulties with using that mutex in future, move the
list management under its own dedicated mutex.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-5-chris@chris-wilson.co.uk
Currently we only allocate an object and vma if we are using a GGTT
virtual HWSP, and a plain struct page for a physical HWSP. For
convenience later on with global timelines, it will be useful to always
have the status page being tracked by a struct i915_vma. Make it so.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-4-chris@chris-wilson.co.uk
Remove the struct_mutex requirement for looking up the vma for an
object.
v2: Highlight how the race for duplicate vma creation is resolved on
reacquiring the lock with a short comment.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-3-chris@chris-wilson.co.uk
A starting point to counter the pervasive struct_mutex. For the goal of
avoiding (or at least blocking under them!) global locks during user
request submission, a simple but important step is being able to manage
each clients GTT separately. For which, we want to replace using the
struct_mutex as the guard for all things GTT/VM and switch instead to a
specific mutex inside i915_address_space.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-2-chris@chris-wilson.co.uk
Our goal is to remove struct_mutex and replace it with fine grained
locking. One of the thorny issues is our eviction logic for reclaiming
space for an execbuffer (or GTT mmaping, among a few other examples).
While eviction itself is easy to move under a per-VM mutex, performing
the activity tracking is less agreeable. One solution is not to do any
MRU tracking and do a simple coarse evaluation during eviction of
active/inactive, with a loose temporal ordering of last
insertion/evaluation. That keeps all the locking constrained to when we
are manipulating the VM itself, neatly avoiding the tricky handling of
possible recursive locking during execbuf and elsewhere.
Note that discarding the MRU (currently implemented as a pair of lists,
to avoid scanning the active list for a NONBLOCKING search) is unlikely
to impact upon our efficiency to reclaim VM space (where we think a LRU
model is best) as our current strategy is to use random idle replacement
first before doing a search, and over time the use of softpinned 48b
per-ppGTT is growing (thereby eliminating any need to perform any eviction
searches, in theory at least) with the remaining users being found on
much older devices (gen2-gen6).
v2: Changelog and commentary rewritten to elaborate on the duality of a
single list being both an inactive and active list.
v3: Consolidate bool parameters into a single set of flags; don't
comment on the duality of a single variable being a multiplicity of
bits.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk
Always perform the requested reset, even if we believe the engine is
idle. Presumably there was a reason the caller wanted the reset, and in
the near future we lose the easy tracking for whether the engine is
idle.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-5-chris@chris-wilson.co.uk
Trim the struct_mutex hold and exclude the call to i915_gem_set_wedged()
as a reminder that it must be callable without struct_mutex held.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-4-chris@chris-wilson.co.uk
Now that the submission backends are controlled via their own spinlocks,
with a wave of a magic wand we can lift the struct_mutex requirement
around GPU reset. That is we allow the submission frontend (userspace)
to keep on submitting while we process the GPU reset as we can suspend
the backend independently.
The major change is around the backoff/handoff strategy for performing
the reset. With no mutex deadlock, we no longer have to coordinate with
any waiter, and just perform the reset immediately.
Testcase: igt/gem_mmap_gtt/hang # regresses
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-3-chris@chris-wilson.co.uk
Instead of tediously and fragilely counting up the number of dwords
required to emit the breadcrumb to seal a request, fake a request and
measure it automatically once during engine setup.
The downside is that this requires a fair amount of mocking to create a
proper breadcrumb. Still, should be less error prone in future as the
breadcrumb size fluctuates!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190125100520.20163-1-chris@chris-wilson.co.uk
Prior to adding a third instance of intel_context_init() and extending
the information stored therewithin, refactor out the common assignments.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190121222117.23305-8-chris@chris-wilson.co.uk
Before adding yet another copy of struct live_test and its handler,
refactor the existing code into a common framework for live selftests.
For many live selftests, we want to know if the GPU hung or otherwise
misbehaved during the execution of the test (beyond any infraction in
the behaviour under test), live_test provides this by comparing the
GPU state before and after, alerting if it unexpectedly changed (e.g.
the reset counter changed). It also ensures that the GPU is idle before
and after the test, so that residual code running on the GPU is flushed
before testing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190121222117.23305-5-chris@chris-wilson.co.uk
Some tests (e.g. igt_vma_pin1) presume that we have a completely clean
GGTT so that it can probe boundaries without fear that something is
already allocated there. However, the mock device is starting to get
complicated and following similar rules to the live device, i.e. we
can't guarantee that i915->ggtt remains clean, so create a temporary
address_space equivalent to the mock ggtt for the purpose.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190121222117.23305-7-chris@chris-wilson.co.uk
During review of commit 71fc448c1a ("drm/i915/selftests: Make evict
tolerant of foreign objects"), Matthew mentioned it would be better if
we explicitly tracked the objects we created. We have an obj->st_link
hook for this purpose, so add the corresponding list of objects and
reduce our loops to only consider our own list.
References: 71fc448c1a ("drm/i915/selftests: Make evict tolerant of foreign objects")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190121222117.23305-6-chris@chris-wilson.co.uk
The evict selftests presumed that all objects in use had been allocated
by itself. This is a dubious claim and so instead of asserting complete
control over the object lists, take (temporary) ownership of them
instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190118113632.7056-1-chris@chris-wilson.co.uk
Since commit d4ccceb055 ("drm/i915/icl: Ringbuffer interrupt handling")
we have required a mechanism to avoid touching the interrupt hardware
for breadcrumbs, superseding our mock interface for selftests.
The residual problem (ideas welcome) is in probing the mock ring
registers for ring_is_idle. Hmm, maybe we should just install
mock handlers for i915->uncore.mmio__write and friends? Only problem
being is that we would to truly mock some expected reads. :(
References: d4ccceb055 ("drm/i915/icl: Ringbuffer interrupt handling")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190118112225.13780-1-chris@chris-wilson.co.uk
Since we have the ppgtt we want to test, we can ask it directly if it is
suitable for the hugepage test we intend to undertake.
v2: Not everyone has full-ppgtt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190117230512.4789-1-chris@chris-wilson.co.uk
Make i915_gem_set_wedged() and i915_gem_unset_wedged() behaviour more
consistent if called concurrently, and only do the wedging and reporting
once, curtailing any possible race where we start unwedging in the middle
of a wedge.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114210408.4561-2-chris@chris-wilson.co.uk
We have two classes of VM, global GTT and per-process GTT. In order to
allow ourselves the freedom to mix both along call chains, distinguish
the two classes with regards to their mutex and lockdep maps.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114215956.32266-1-chris@chris-wilson.co.uk
Frequently, we use intel_runtime_pm_get/_put around a small block.
Formalise that usage by providing a macro to define such a block with an
automatic closure to scope the intel_runtime_pm wakeref to that block,
i.e. macro abuse smelling of python.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-15-chris@chris-wilson.co.uk
Track the temporary wakerefs used within the selftests so that leaks are
clear.
v2: Add a couple of coarse annotations for mock selftests as we now
loudly warn about the errors.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-14-chris@chris-wilson.co.uk
The majority of runtime-pm operations are bounded and scoped within a
function; these are easy to verify that the wakeref are handled
correctly. We can employ the compiler to help us, and reduce the number
of wakerefs tracked when debugging, by passing around cookies provided
by the various rpm_get functions to their rpm_put counterpart. This
makes the pairing explicit, and given the required wakeref cookie the
compiler can verify that we pass an initialised value to the rpm_put
(quite handy for double checking error paths).
For regular builds, the compiler should be able to eliminate the unused
local variables and the program growth should be minimal. Fwiw, it came
out as a net improvement as gcc was able to refactor rpm_get and
rpm_get_if_in_use together,
v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual
mark up for smaller more targeted patches.
v3: Mention the cookie in Returns
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk
Everytime we take a wakeref, record the stack trace of where it was
taken; clearing the set if we ever drop back to no owners. For debugging
a rpm leak, we can look at all the current wakerefs and check if they
have a matching rpm_put.
v2: Use skip=0 for unwinding the stack as it appears our noinline
function doesn't appear on the stack (nor does save_stack_trace itself!)
v3: Allow rpm->debug_count to disappear between inspections and so
avoid calling krealloc(0) as that may return a ZERO_PTR not NULL! (Mika)
v4: Show who last acquire/released the runtime pm
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Tested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-1-chris@chris-wilson.co.uk
By using the wa lists inside the live driver structures, we won't
catch issues where those are incorrectly setup or corrupted.
To cover this gap, update the workaround framework to allow saving the
wa lists to independent structures and use them in the selftests.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190110013232.8972-1-daniele.ceraolospurio@intel.com
[tursulin: Fixup checkpatch whitespace complaint in memset.]
UAPI Changes:
Cross-subsystem Changes:
- Turn dma-buf fence sequence numbers into 64 bit numbers
Core Changes:
- Move to a common helper for the DP MST hotplug for radeon, i915 and
amdgpu
- i2c improvements for drm_dp_mst
- Removal of drm_syncobj_cb
- Introduction of an helper to create and attach the TV margin properties
Driver Changes:
- Improve cache flushes for v3d
- Reflection support for vc4
- HDMI overscan support for vc4
- Add implicit fencing support for rockchip and sun4i
- Switch to generic fbdev emulation for virtio
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXDOTqAAKCRDj7w1vZxhR
xZ8QAQD4j8m9Ea3bzY5Rr8BYUx1k+Cjj6Y6abZmot2rSvdyOHwD+JzJFIFAPZjdd
uOKhLnDlubaaoa6OGPDQShjl9p3gyQE=
=WQGO
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2019-01-07-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.1:
UAPI Changes:
Cross-subsystem Changes:
- Turn dma-buf fence sequence numbers into 64 bit numbers
Core Changes:
- Move to a common helper for the DP MST hotplug for radeon, i915 and
amdgpu
- i2c improvements for drm_dp_mst
- Removal of drm_syncobj_cb
- Introduction of an helper to create and attach the TV margin properties
Driver Changes:
- Improve cache flushes for v3d
- Reflection support for vc4
- HDMI overscan support for vc4
- Add implicit fencing support for rockchip and sun4i
- Switch to generic fbdev emulation for virtio
Signed-off-by: Dave Airlie <airlied@redhat.com>
[airlied: applied amdgpu merge fixup]
From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190107180333.amklwycudbsub3s5@flea
Generally catch up with 5.0-rc1, and specifically get the changes:
96d4f267e4 ("Remove 'type' argument from access_ok() function")
0b2c8f8b6b ("i915: fix missing user_access_end() in page fault exception case")
594cc251fd ("make 'user_access_begin()' do 'access_ok()'")
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
When we first introduced the reset to sanitize the GPU on taking over
from the BIOS and before returning control to third parties (the BIOS!),
we restricted it to only systems utilizing HW contexts as we were
uncertain of how stable our reset mechanism truly was. We now have
reasonable coverage across all machines that expose a GPU reset method,
and so we should be safe to sanitize the GPU state everywhere.
v2: We _have_ to skip the reset if it would clobber the display.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190103112104.19561-1-chris@chris-wilson.co.uk
With kasan on a slow machine, it can take an age to check all the
partial mappings in a single iteration, so break it up with a
cond_resched) to avoid RCU stall reports.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190102114431.23022-1-chris@chris-wilson.co.uk
Now that we perform the request flushing inline with emitting the
breadcrumb, we can remove the now redundant manual flush. And we can
also remove the infrastructure that remained only for its purpose.
v2: emit_breadcrumb_sz is in dwords, but rq->reserved_space is in bytes
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181228171641.16531-1-chris@chris-wilson.co.uk
totalram_pages and totalhigh_pages are made static inline function.
Main motivation was that managed_page_count_lock handling was complicating
things. It was discussed in length here,
https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seemes
better to remove the lock and convert variables to atomic, with preventing
poteintial store-to-read tearing as a bonus.
[akpm@linux-foundation.org: coding style fixes]
Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.org
Signed-off-by: Arun KS <arunks@codeaurora.org>
Suggested-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We currently require that our per-engine reset can be called from any
context, even hardirq, and in the future wish to perform the device
reset without holding struct_mutex (which requires some lockless
shenanigans that demand the lowlevel intel_reset_gpu() be able to be
used in atomic context). Test that we meet the current requirements by
calling i915_reset_engine() from under various atomic contexts.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181213091522.2926-2-chris@chris-wilson.co.uk
After declaring a terminally wedged device, we allow ourselves to
recover on the next GPU reset (manually triggered), or resume. Check
that resetting a wedged device does work.
v2: Add rpm (taken explicitly in the subtest in case we remove the outer
wakeref) and early warning to i915_reset() for missed wakerefs
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181213091522.2926-1-chris@chris-wilson.co.uk
Many errs of the form:
drivers/gpu/drm/i915/selftests/intel_hangcheck.c: In function ‘__igt_reset_evict_vma’:
./include/linux/kern_levels.h:5:18: error: format ‘%x’ expects argument of type ‘unsigned int’, but argum
Fixes: b312d8ca3a ("dma-buf: make fence sequence numbers 64 bit v2")
Cc: Christian König <christian.koenig@amd.com>
Cc: Chunming Zhou <david1.zhou@amd.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181207123428.16257-1-mika.kuoppala@linux.intel.com
The mmio readback for verify_gt_engine_wa() also needs a runtime-pm
wakeref, so effectively do the entirety of both engine workarounds
tests. As such simplify the rpm behaviour here by acquiring the wakeref
for the whole of each subtest. It would be still useful to later verify
the registers retain their magic values across rpm suspend/resume.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181206180713.6827-1-chris@chris-wilson.co.uk
Impose a restraint that we have all vma pinned for a request prior to
its allocation. This is to simplify request construction, and should
facilitate unravelling the lock interdependencies later.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181204141522.13640-3-chris@chris-wilson.co.uk
Instead of having a separate list of white-listed registers we can
trivially move this to the common workarounds framework.
This brings us one step closer to the goal of driving all workaround
classes using the same code.
v2:
* Use GEM_DEBUG_WARN_ON for the sanity check. (Chris Wilson)
v3:
* API rename. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203125014.3219-6-tvrtko.ursulin@linux.intel.com
Two simple selftests which test that both GT and engine workarounds are
not lost after either a full GPU reset, or after the per-engine ones.
(Including checks that one engine reset is not affecting workarounds not
belonging to itself.)
v2:
* Rebase for series refactoring.
* Add spinner for actual engine reset!
* Add idle reset test as well. (Chris Wilson)
* Share existing global_reset_lock. (Chris Wilson)
v3:
* intel_engine_verify_workarounds can be static.
* API rename. (Chris Wilson)
* Move global reset lock out of the loop. (Chris Wilson)
v4:
* Add missing rpm puts. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181203125014.3219-5-tvrtko.ursulin@linux.intel.com
The test was missing some magic ingredients to actually trigger the
resets.
In case of the full reset we need the I915_RESET_HANDOFF flag set, and in
case of engine reset we need a busy request.
Thanks to Chris for helping with reset magic.
v2:
* Grab RPM ref over reset.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181130095211.23849-1-tvrtko.ursulin@linux.intel.com
When using softpin it's not enough to just pad the vma size, we also
need to ensure the vma offset is at the start of the pt boundary, if we
plan to utilize 64K pages. Therefore to improve test coverage we should
use both aligned and unaligned gtt offsets in igt_write_huge.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181029203734.21936-1-matthew.auld@intel.com
The vm of two contexts are supposed to be independent, such that a stray
write by one cannot be detected by another. Normally the GTT is filled
explicitly by userspace, but the space in between objects is filled with
a scratch page -- and that scratch page should not be able to form an
inter-context backchannel.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181029172925.10159-1-chris@chris-wilson.co.uk
Use the live_test struct to record the reset count before and compare it
at the end of the test to assert that no mystery hang occurred during the
test.
v2: Check per-engine resets as well
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181012122404.10874-1-chris@chris-wilson.co.uk
commit e346a991f4 ("drm/i915/guc: drop negative doorbell alloc
selftest") removed the negative case from the selftest and left no
code between the goto from the positive case of the test and the label
itself, so we can get rid of it.
Reported-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181022230427.5616-5-daniele.ceraolospurio@intel.com
The test requires driver tweaks to avoid causing error messages
on intentionally-triggered errors and to stop accessing non
existing register. However, this is a pure GuC FW interface test
and should be covered by FW validation, so it isn't really worth
tweaking the driver for it and we're better off dropping it instead.
Testing the driver running out of doorbells is already covered by
igt_guc_doorbells
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181018004610.22895-1-daniele.ceraolospurio@intel.com
For mmap-exhaustion, we deliberately put the system under a large amount
of pressure to ensure that we are able to reap mmap-offsets from dead
objects. If background activity does that reaping for us, that defeats
the purpose of the test and in some cases will fail our sanity checks
(because of the fake activity we use to prevent the idle worker).
Fixes: 932cac10c8 ("drm/i915/selftests: Prevent background reaping of acti
ve objects")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181011103748.18387-1-chris@chris-wilson.co.uk
GuC stores some data in there, which might be stale after a reset.
We already reset the WQ head and tail, but more things are being moved
to the descriptor with the interface updates. Instead of trying to track
them one by one, always memset and init the descriptors from scratch
after GuC is loaded.
The code is also reorganized so that the above operations and the
doorbell creation are grouped as "client enabling"
v2: add proc_desc_fini for symmetry (Daniele), remove unneeded var init,
add guc_is_alive() (Michal)
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20181002215430.15049-1-daniele.ceraolospurio@intel.com
In the next few patches, we will want to give a small priority boost to
some requests/queues but not so much that we perturb the user controlled
order. As such we will shift the user priority bits higher leaving
ourselves a few low priority bits for our internal bumping.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181001123204.23982-3-chris@chris-wilson.co.uk
Now that we are confident in providing full-ppgtt where supported,
remove the ability to override the context isolation.
v2: Remove faked aliasing-ppgtt for testing as it no longer is accepted.
v3: s/USES/HAS/ to match usage and reject attempts to load the module on
old GVT-g setups that do not provide support for full-ppgtt.
v4: Insulate ABI ppGTT values from our internal enum (later plans
involve moving ppGTT depth out of the enum, thus potentially breaking
ABI unless we document the current values).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180926201222.5643-1-chris@chris-wilson.co.uk
Very light stress test to bombard the submission backends with a large
stream with requests of randomly assigned priorities. Preemption will be
occasionally requested, but unlikely to ever succeed! (Although we may
build a long queue of requests and so may trigger an attempt to inject a
preempt context, as we emit no batch, the arbitration window is limited
to between requests inside the ringbuffer. The likelihood of actually
causing a preemption event is therefore very small. A later variant
should try to improve the likelihood of preemption events!)
v2: Include a second pattern with more frequent preemption
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180925083205.2229-1-chris@chris-wilson.co.uk
If we copy all the contents of the sg across and not just the page link,
we can then also put it to work in fake_get_huge_pages and beyond.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180920142707.19659-1-matthew.auld@intel.com
We need to exercise the HW and submission paths for switching contexts
rapidly to check that features such as execlists' wa_tail are adequate.
Plus it's an interesting baseline latency metric.
v2: Check the initial request for allocation errors
v3: Use finite waits for more robust handling of broken code
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180920105809.1872-1-chris@chris-wilson.co.uk
Merge tag 'gvt-next-2018-09-04'
drm-intel-next-2018-09-06-1:
UAPI Changes:
- GGTT coherency GETPARAM: GGTT has turned out to be non-coherent for some
platforms, which we've failed to communicate to userspace so far. SNA was
modified to do extra flushing on non-coherent GGTT access, while Mesa will
mitigate by always requiring WC mapping (which is non-coherent anyway).
- Neuter Resource Streamer uAPI: There never really were users for the feature,
so neuter it while keeping the interface bits for compatibility. This is a
long due item from past.
Cross-subsystem Changes:
- Backmerge of branch drm-next-4.19 for DP_DPCD_REV_14 changes
Core Changes:
- None
Driver Changes:
- A load of Icelake (ICL) enabling patches (Paulo, Manasi)
- Enabled full PPGTT for IVB,VLV and HSW (Chris)
- Bugzilla #107113: Distribute DDB based on display resolutions (Mahesh)
- Bugzillas #100023,#107476,#94921: Support limited range DP displays (Jani)
- Bugzilla #107503: Increase LSPCON timeout (Fredrik)
- Avoid boosting GPU due to an occasional stall in interactive workloads (Chris)
- Apply GGTT coherency W/A only for affected systems instead of all (Chris)
- Fix for infinite link training loop for faulty USB-C MST hubs (Nathan)
- Keep KMS functional on Gen4 and earlier when GPU is wedged (Chris)
- Stop holding ppGTT reference from closed VMAs (Chris)
- Clear error registers after error capture (Lionel)
- Various Icelake fixes (Anusha, Jyoti, Ville, Tvrtko)
- Add missing Coffeelake (CFL) PCI IDs (Rodrigo)
- Flush execlists tasklet directly from reset-finish (Chris)
- Fix LPE audio runtime PM (Chris)
- Fix detection of out of range surface positions (GLK/CNL) (Ville)
- Remove wait-for-idle for PSR2 (Dhinakaran)
- Power down existing display hardware resources when display is disabled (Chris)
- Don't allow runtime power management if RC6 doesn't exist (Chris)
- Add debugging checks for runtime power management paths (Imre)
- Increase symmetry in display power init/fini paths (Imre)
- Isolate GVT specific macros from i915_reg.h (Lucas)
- Increase symmetry in power management enable/disable paths (Chris)
- Increase IP disable timeout to 100 ms to avoid DRM_ERROR (Imre)
- Fix memory leak from HDMI HDCP write function (Brian, Rodrigo)
- Reject Y/Yf tiling on interlaced modes (Ville)
- Use a cached mapping for the physical HWS on older gens (Chris)
- Force slow path of writing relocations to buffer if unable to write to userspace (Chris)
- Do a full device reset after being wedged (Chris)
- Keep forcewake counts over reset (in case of debugfs user) (Imre, Chris)
- Avoid false-positive errors from power wells during init (Imre)
- Reset engines forcibly in exchange of declaring whole device wedged (Mika)
- Reduce context HW ID lifetime in preparation for Icelake (Chris)
- Attempt to recover from module load failures (Chris)
- Keep select interrupts over a reset to avoid missing/losing them (Chris)
- GuC submission backend improvements (Jakub)
- Terminate context images with BB_END (Chris, Lionel)
- Make GCC evaluate GGTT view struct size assertions again (Ville)
- Add selftest to exercise suspend/hibernate code-paths for GEM (Chris)
- Use a full emulation of a user ppgtt context in selftests (Chris)
- Exercise resetting in the middle of a wait-on-fence in selftests (Chris)
- Fix coherency issues on selftests for Baytrail (Chris)
- Various other GEM fixes / self-test updates (Chris, Matt)
- GuC doorbell self-tests (Daniele)
- PSR mode control through debugfs for IGTs (Maarten)
- Degrade expected WM latency errors to DRM_DEBUG_KMS (Chris)
- Cope with errors better in MST link training (Dhinakaran)
- Fix WARN on KBL external displays (Azhar)
- Power well code cleanups (Imre)
- Fixes to PSR debugging (Dhinakaran)
- Make forcewake errors louder for easier catching in CI (WARNs) (Chris)
- Fortify tiling code against programmer errors (Chris)
- Bunch of fixes for CI exposed corner cases (multiple authors, mostly Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180907105446.GA22860@jlahtine-desk.ger.corp.intel.com
Future gen reduce the number of bits we will have available to
differentiate between contexts, so reduce the lifetime of the ID
assignment from that of the context to its current active cycle (i.e.
only while it is pinned for use by the HW, will it have a constant ID).
This means that instead of a max of 2k allocated contexts (worst case
before fun with bit twiddling), we instead have a limit of 2k in flight
contexts (minus a few that have been pinned by the kernel or by perf).
To reduce the number of contexts id we require, we allocate a context id
on first and mark it as pinned for as long as the GEM context itself is,
that is we keep it pinned it while active on each engine. If we exhaust
our context id space, then we try to reclaim an id from an idle context.
In the extreme case where all context ids are pinned by active contexts,
we force the system to idle in order to recover ids.
We cannot reduce the scope of an HW-ID to an engine (allowing the same
gem_context to have different ids on each engine) as in the future we
will need to preassign an id before we know which engine the
context is being executed on.
v2: Improved commentary (Tvrtko) [I tried at least]
References: https://bugs.freedesktop.org/show_bug.cgi?id=107788
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180904153117.3907-1-chris@chris-wilson.co.uk
So far we have been relying on vm->file pointer being NULL to declare
something GGTT.
This has the unfortunate consequence that the default kernel context is
also declared GGTT and interferes with the following patch which wants to
instantiate VMA's and execute requests against the kernel context.
Change the is_ggtt test to use an explicit flag in struct address_space to
solve this issue.
Note that the bit used is free since there is an alignment hole in the
struct.
v2:
* Mark mock ggtt.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180831143643.12366-1-tvrtko.ursulin@linux.intel.com
Although we cannot do a full system-level test of suspend/hibernate from
deep with the kernel selftests, we can exercise the GEM subsystem in
isolation and simulate the external effects (such as losing stolen
contents and trashing the register state).
v2: Don't forget to hold rpm
v3: Suspend the GTT mappings, and more rpm!
References: https://bugs.freedesktop.org/show_bug.cgi?id=96526
References: 5ab57c7020 ("drm/i915: Flush logical context image out to memory upon suspend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jakub Bartmiński <jakub.bartminski@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Jakub Bartmiński <jakub.bartminski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180830134806.21939-1-chris@chris-wilson.co.uk
We currently verify that all doorbells can be registered with GuC and
HW but don't check that all works as expected after a db ring.
Do a nop ring of all doorbells to make sure we haven't misprogrammed
any WQ or stage descriptor data. This will also help validating
upcoming changes in the db programming flow.
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Acked-by: Katarzyna Dec <katarzyna.dec@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180827223614.22789-1-daniele.ceraolospurio@intel.com
dma_fence_default_wait is the default now, same for the trivial
enable_signaling implementation.
v2: Also remove the relase hook, dma_fence_free is the default.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20180704092909.6599-2-daniel.vetter@ffwll.ch
The call to i915_gem_unpark() checks that we hold a rpm wakeref before
taking a long term wakeref for i915->gt.awake. We should therefore make
sure we do hold the wakeref when directly calling unpark to disable
the retire worker.
Fixes: 932cac10c8 ("drm/i915/selftests: Prevent background reaping of active objects")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180809063449.4474-1-chris@chris-wilson.co.uk
(cherry picked from commit 7b5ee80a5d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On suspend, we cancel the automatic forcewake and clear all other sources
of forcewake so the machine can sleep before we do suspend. However, we
expose the forcewake to userspace (only via debugfs, but nevertheless we
do) and want to restore that upon resume or else our accounting will be
off and we may not acquire the forcewake before we use it. So record
which domains we cleared on suspend and reacquire them early on resume.
v2: Hold the spinlock to appease our sanitychecks
v3: s/fw_domains_user/fw_domains_saved/ to convey intent more clearly
Reported-by: Imre Deak <imre.deak@linux.intel.com>
Fixes: b847305080 ("drm/i915: Fix forcewake active domain tracking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Imre Deak <imre.deak@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180808210842.3555-1-chris@chris-wilson.co.uk
(cherry picked from commit d60996ab43)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The call to i915_gem_unpark() checks that we hold a rpm wakeref before
taking a long term wakeref for i915->gt.awake. We should therefore make
sure we do hold the wakeref when directly calling unpark to disable
the retire worker.
Fixes: 932cac10c8 ("drm/i915/selftests: Prevent background reaping of active objects")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180809063449.4474-1-chris@chris-wilson.co.uk
On suspend, we cancel the automatic forcewake and clear all other sources
of forcewake so the machine can sleep before we do suspend. However, we
expose the forcewake to userspace (only via debugfs, but nevertheless we
do) and want to restore that upon resume or else our accounting will be
off and we may not acquire the forcewake before we use it. So record
which domains we cleared on suspend and reacquire them early on resume.
v2: Hold the spinlock to appease our sanitychecks
v3: s/fw_domains_user/fw_domains_saved/ to convey intent more clearly
Reported-by: Imre Deak <imre.deak@linux.intel.com>
Fixes: b847305080 ("drm/i915: Fix forcewake active domain tracking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Imre Deak <imre.deak@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180808210842.3555-1-chris@chris-wilson.co.uk
Experience teaches us over and over again that coherency on Baytrail
requires the odd heavy hammer, and in particular clflush alone is not
enough to guarrantee that writes from the CPU are picked up by the CS.
Do as we do elsewhere and ensure we have an unconditional
i915_gem_chipset_flush() after writing to memory and submitting a batch
to HW.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107499
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180806144604.8346-1-chris@chris-wilson.co.uk
We occasionally see that the clflush prior to a read of GPU data is
returning stale data, reminiscent of much earlier bugs fixed by adding a
second clflush for serialisation. As drm_clflush_virt_range() already
supplies the workaround, use it rather than open code the clflush
instruction.
References: 396f5d62d1 ("drm: Restore double clflush on the last partial cacheline")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180730075351.15569-3-chris@chris-wilson.co.uk
On older HW, gen2/3, fence registers are used for detiling GPU commands
and as such changing those registers requires serialisation with the
requests on the GPU. Anything running on the GPU is subject to a hang,
and so we must be able to recover cleanly in the middle of a stuck wait
on a fence register.
We can simulate using the fence on the GPU simply by marking the fence
as active on the request for this vma, the interface being common to all
gen, thus broadening the test.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180719194746.19111-2-chris@chris-wilson.co.uk
To test eviction from a ppgtt, we just want a ppgtt i.e. something other
than the Global GTT which is shared and used by the kernel for HW
features like fencing and scanout. However, we also need it to pass
!i915_is_ggtt() and the simplest way is to emulate a full user context
rather than the internal kernel context that is used for the GGTT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180719194746.19111-1-chris@chris-wilson.co.uk
i915_gem_tile_height() asserts that the object is tiled, but inside the
error printer for the selftest we computed the row size regardless of
tiling, tripping over the assert.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180726104759.8684-1-chris@chris-wilson.co.uk
power well support and begin of DSI support addition. Also there were many improvements
on execlists and interrupts for minimal latency on command submission; and many fixes
on selftests, mostly caught by our CI.
General driver:
- Clean-up on aux irq (Lucas)
- Mark expected switch fall-through for dealing with static analysis tools (Gustavo)
Gem:
- Different fixes for GuC (Chris, Anusha, Michal)
- Avoid self-relocation BIAS if no relocation (Chris)
- Improve debugging cases in on EINVAL return and vma allocation (Chris)
- Fixes and improvements on context destroying and freeing (Chris)
- Wait for engines to idle before retiring (Chris)
- Many improvements on execlists and interrupts for minimal latency on command submission (Chris)
- Many fixes in selftests, specially on cases highlighted on CI (Chris)
- Other fixes and improvements around GGTT (Chris)
- Prevent background reaping of active objects (Chris)
Display:
- Parallel modeset cleanup to fix driver reset (Chris)
- Get AUX power domain for DP main link (Imre)
- Clean-up on PSR unused func pointers (Rodrigo)
- Many PSR/PSR2 fixes and improvements (DK, Jose, Tarun)
- Add a PSR1 live status (Vathsala)
- Replace old drm_*_{un/reference} with put,get functions (Thomas)
- FBC fixes (Maarten)
- Abstract and document the usage of picking macros (Jani)
- Remove unnecessary check for unsupported modifiers for NV12. (DK)
- Interrupt fixes for display (Ville)
- Clean up on sdvo code (Ville)
- Clean up on current DSI code (Jani)
- Remove support for legacy debugfs crc interface (Maarten)
- Simplify get_encoder_power_domains (Imre)
Icelake:
- MG PLL fixes (Imre)
- Add hw workaround for alpha blending (Vandita)
- Add power well support (Imre)
- Add Interrupt Support (Anusha)
- Start to add support for DSI on Ice Lake (Madhav)
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbQ+ShAAoJEPpiX2QO6xPKas0H/igf9RFubtkMK7gHTef4FM+d
Bg+Qaq+O1vXlS/gidimL4NsVp1FxkejuCab0IffbTMvvjY0mv5NUA3kiIreAB0QZ
XO2hXr4fjjOINAQrdv5wiVMOqRjDws+fPgFFgZ8s5h1aJbofO27fjY/1MNtHwcA0
8VgtABpk+D3mkWvI8VTL0jCjYk2KocEvqUciz/Y7SQcPGV1iYFXqgBt5PR//rSvP
DU3u4R3KJGLDFbQwbe3uz2GxMfodAI6ijrqFeiizNSVqZORdTwnWlzKi6b6Cj9gl
SuleZacHPfv/+Ia7jmbmBqJEqi2GiAs948ne8QWL5/hsB9MMFO/UzwX/wYLNrP4=
=w6zC
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-2018-07-09' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Higlights here goes to many PSR fixes and improvements; to the Ice lake work with
power well support and begin of DSI support addition. Also there were many improvements
on execlists and interrupts for minimal latency on command submission; and many fixes
on selftests, mostly caught by our CI.
General driver:
- Clean-up on aux irq (Lucas)
- Mark expected switch fall-through for dealing with static analysis tools (Gustavo)
Gem:
- Different fixes for GuC (Chris, Anusha, Michal)
- Avoid self-relocation BIAS if no relocation (Chris)
- Improve debugging cases in on EINVAL return and vma allocation (Chris)
- Fixes and improvements on context destroying and freeing (Chris)
- Wait for engines to idle before retiring (Chris)
- Many improvements on execlists and interrupts for minimal latency on command submission (Chris)
- Many fixes in selftests, specially on cases highlighted on CI (Chris)
- Other fixes and improvements around GGTT (Chris)
- Prevent background reaping of active objects (Chris)
Display:
- Parallel modeset cleanup to fix driver reset (Chris)
- Get AUX power domain for DP main link (Imre)
- Clean-up on PSR unused func pointers (Rodrigo)
- Many PSR/PSR2 fixes and improvements (DK, Jose, Tarun)
- Add a PSR1 live status (Vathsala)
- Replace old drm_*_{un/reference} with put,get functions (Thomas)
- FBC fixes (Maarten)
- Abstract and document the usage of picking macros (Jani)
- Remove unnecessary check for unsupported modifiers for NV12. (DK)
- Interrupt fixes for display (Ville)
- Clean up on sdvo code (Ville)
- Clean up on current DSI code (Jani)
- Remove support for legacy debugfs crc interface (Maarten)
- Simplify get_encoder_power_domains (Imre)
Icelake:
- MG PLL fixes (Imre)
- Add hw workaround for alpha blending (Vandita)
- Add power well support (Imre)
- Add Interrupt Support (Anusha)
- Start to add support for DSI on Ice Lake (Madhav)
Signed-off-by: Dave Airlie <airlied@redhat.com>
# gpg: Signature made Tue 10 Jul 2018 08:41:37 AM AEST
# gpg: using RSA key FA625F640EEB13CA
# gpg: Good signature from "Rodrigo Vivi <rodrigo.vivi@intel.com>"
# gpg: aka "Rodrigo Vivi <rodrigo.vivi@gmail.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6D20 7068 EEDD 6509 1C2C E2A3 FA62 5F64 0EEB 13CA
Link: https://patchwork.freedesktop.org/patch/msgid/20180710234349.GA16562@intel.com
In the huge pages tests, we may have lots of objects being trapped on
the freelist as we hold the struct_mutex allowing the free worker no
opportunity to recover the backing store. We also have stricter
requirements and the desire for large contiguous pages, further
increasing the allocation pressure. To reduce the chance of running out
of memory, we could either drop the mutex and flush the free worker, or
we could release the backing store directly. We do the latter in this
patch for simplicity.
References: https://bugs.freedesktop.org/show_bug.cgi?id=107254
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180717082334.18774-1-chris@chris-wilson.co.uk
We must be able to reset the GPU while we are waiting on it to perform
an eviction (unbinding an active vma). So attach a spinning request to a
target vma and try and it evict it from a thread to see if that blocks
indefinitely.
v2: Add a wait for the thread to start just in case that takes more than
10ms...
v3: complete() not completion_done() to signal the completion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180716134009.13143-1-chris@chris-wilson.co.uk
Inject a failure into preemption completion to pretend as if the HW
didn't successfully handle preemption and we are forced to do a reset in
the middle.
v2: Wait for preemption, to force testing with the missed preemption.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180716132154.12539-1-chris@chris-wilson.co.uk
If the user has created a read-only object, they should not be allowed
to circumvent the write protection by using a GGTT mmapping. Deny it.
Also most machines do not support read-only GGTT PTEs, so again we have
to reject attempted writes. Fortunately, this is known a priori, so we
can at least reject in the call to create the mmap (with a sanity check
in the fault handler).
v2: Check the vma->vm_flags during mmap() to allow readonly access.
v3: Remove VM_MAYWRITE to curtail mprotect()
Testcase: igt/gem_userptr_blits/readonly_mmap*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> #v1
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-4-chris@chris-wilson.co.uk
Hook up the flags to allow read-only ppGTT mappings for gen8+
v2: Include a selftest to check that writes to a readonly PTE are
dropped
v3: Don't duplicate cpu_check() as we can just reuse it, and even worse
don't wholesale copy the theory-of-operation comment from igt_ctx_exec
without changing it to explain the intention behind the new test!
v4: Joonas really likes magic mystery values
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-2-chris@chris-wilson.co.uk
Since:
0d4b78b3d2 ("drm/i915/guc: Assert we have the doorbell before setting it up")
We have asserts in GuC doorbell related functions, which is a good thing.
Unfortunately, we were using those to check whether GuC FW is refusing
to allocate invalid doorbell - which makes the test fail.
Well, it would make the test WARN, except we fumbled cleanup ordering
and eat the BUG_ON instead.
Let's keep the asserts and use the internal implementation in the test.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107186
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712112013.3253-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Since live_workarounds poke around the w/a registers and checks to see
if they survive across a reset, we are prone to fouling the machine and
leaving it in a non-recoverable state. Wrap the probe inside a timeout
to abort the test if the reset fails.
v2: Include GEM_TRACE on declaring wedged.
v3: Add a few includes to make the header look standalone.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107188
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180711122952.18448-1-chris@chris-wilson.co.uk
In our swizzling selftests, we cannot predict the physical address of
the target page (at least not simply!) and so skip bit17 swizzles.
However, there are two bit17 swizzle modes and we only skipped one, with
the second being observed on the lab gdg causing the test to fail,
as soon as we hit a page with bit17 set in its address.
Testcase: igt/drv_selftest/live_objects #gdg
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709194915.5789-1-chris@chris-wilson.co.uk
Be pessimistic and presume that we actually allocate every page we
exercise via the mock_gtt (e.g. for gvt). In which case we have to keep
our working set under the available physical memory to prevent oom.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180710080424.7821-1-chris@chris-wilson.co.uk
In igt_flush_test() we install a background timer in order to ensure
that the wait completes within a certain time. We can now tell the wait
that it has to complete within a timeout, and so no longer need the
background timer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709122044.7028-3-chris@chris-wilson.co.uk
Usually we have no idea about the upper bound we need to wait to catch
up with userspace when idling the device, but in a few situations we
know the system was idle beforehand and can provide a short timeout in
order to very quickly catch a failure, long before hangcheck kicks in.
In the following patches, we will use the timeout to curtain two overly
long waits, where we know we can expect the GPU to complete within a
reasonable time or declare it broken.
In particular, with a broken GPU we expect it to fail during the initial
GPU setup where do a couple of context switches to record the defaults.
This is a task that takes a few milliseconds even on the slowest of
devices, but we may have to wait 60s for hangcheck to give in and
declare the machine inoperable. In this a case where any gpu hang is
unacceptable, both from a timeliness and practical standpoint.
The other improvement is that in selftests, we do not need to arm an
independent timer to inject a wedge, as we can just limit the timeout on
the wait directly.
v2: Include the timeout parameter in the trace.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709122044.7028-1-chris@chris-wilson.co.uk
In the next patch, we will want a third distinct class of timeline that
may overlap with the current pair of client and engine timeline classes.
Rather than use the ad hoc markup of SINGLE_DEPTH_NESTING, initialise
the different timeline classes with an explicit subclass.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706210710.16251-1-chris@chris-wilson.co.uk
Inside the mock GEM device, we try to grab the runtime pm for the fake
device to prevent it from ever suspending. However, if CONFIG_PM is not
set, trying to obtain the wakref returns an error which we WARN about.
Suppress the expected warning.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706205947.11209-1-chris@chris-wilson.co.uk
clflush is an unserialised instruction and the IA manual strongly advises
you to serialise it with a mb. To be cautious, apply one before and one
after, so that it is serialised with both writes and reads without
worrying too much about the required direction.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706174926.4712-1-chris@chris-wilson.co.uk
Handling such a late error in request construction is tricky, but to
accommodate future patches which may allocate here, we potentially could
err. To handle the error after already adjusting global state to track
the new request, we must finish and submit the request. But we don't
want to use the request as not everything is being tracked by it, so we
opt to cancel the commands inside the request.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-3-chris@chris-wilson.co.uk
Currently all callers are responsible for adding the vma to the active
timeline and then exporting its fence. Combine the two operations into
i915_vma_move_to_active() to move all the extra handling from the
callers to the single site.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-1-chris@chris-wilson.co.uk
Replace the magic bit with the proper symbolic name for instructing
MI_STORE_DWORD_IMM to use a virtual address (on gen3) or the global GTT
address (still virtual!) on gen4+.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706142323.25699-1-chris@chris-wilson.co.uk
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Limit the GTT size we try and allocate to ensure that it fits within RAM
and does not trigger the oomkiller indiscriminately.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706125338.24432-1-chris@chris-wilson.co.uk
If the GPU is irrecoverably wedged, we can not execute any requests
making testing execlists (request execution) pointless. Skip!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706114510.18467-1-chris@chris-wilson.co.uk
If the HW (or driver) doesn't support logical contexts, don't pretend we
gain anything from trying to execute GPU commands with them. At best it
reports -ENODEV, which is an unhelpful failure that we should just skip.
v2: Be more specific and check the driver/engine caps for logical (HW)
context support.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706101923.28548-1-chris@chris-wilson.co.uk
If the GPU is terminally wedged we cannot submit any requests into a
context, completely unfulfilling our purpose of doing so. As this
expectedly fails, skip over the test.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706065332.15214-9-chris@chris-wilson.co.uk
We test the GPU handling of huge pages by submitting requests that write
into a huge page, but if the GPU is irrecoverably wedged we cannot
submit any requests. As the test expectedly fails, skip over it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706065332.15214-8-chris@chris-wilson.co.uk
If the GPU is irrecoverably wedged, we cannot submit any requests and so
cannot make the GTT busy in order to test evicting active objects. As
this expectedly fails, skip over the test.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706065332.15214-7-chris@chris-wilson.co.uk
If the GPU is irrecoverably wedged, we cannot submit any request and
therefore cannot query the register state of the context (which is done
using the GPU command stream). So skip over the test as it expectedly
fails.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706065332.15214-6-chris@chris-wilson.co.uk
As we keep VMA around until the object is destroyed, when testing
partial tiling we instantiate many, many VMA (as the object is huge
allowing for many different partial regions). We test elsewhere our
handling of populating large objects with a full set of VMA and checking
we can retrieve them afterwards, but in this test we incur the cost of
flushing all VMA after every GTT write, dramatically slowing down the
test.
References: https://bugs.freedesktop.org/show_bug.cgi?id=107130
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706065332.15214-2-chris@chris-wilson.co.uk
If the GPU is irrecoverably wedged on startup, it means that it failed
on initialisation and we have already tried to reset it but failed. We
can ignore all further testing, as it is already dead. Failing early,
prevents us from slowly failing in our endeavours later and timing out.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705150214.28316-1-chris@chris-wilson.co.uk
i915_gem_detect_bit_6_swizzle() tries to hide unknown swizzling from
userspace (and ourselves) leaving us with the only clue inside
i915->quirks & QUIRK_PIN_SWIZZLED_PAGES. If we see this bit set, it
means that we really have no clue as to what the swizzle pattern is
being used in any one page and so cannot compute what the reference
value should be in our tiling selftests. We have to skip the test.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107133
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705171523.18462-1-chris@chris-wilson.co.uk
Currently, the wc-stash used for providing flushed WC pages ready for
constructing the page directories is assumed to be protected by the
struct_mutex. However, we want to remove this global lock and so must
install a replacement global lock for accessing the global wc-stash (the
per-vm stash continues to be guarded by the vm).
We need to push ahead on this patch due to an oversight in hastily
removing the struct_mutex guard around the igt_ppgtt_alloc selftest. No
matter, it will prove very useful (i.e. will be required) in the near
future.
v2: Restore the onstack stash so that we can drop the vm->mutex in
future across the allocation.
v3: Restore the lost pagevec_init of the onstack allocation, and repaint
function names.
v4: Reorder init so that we don't try and use i915_address_space before
it is ininitialised.
Fixes: 1f6f00238a ("drm/i915/selftests: Drop struct_mutex around lowlevel pggtt allocation")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180704185518.4193-1-chris@chris-wilson.co.uk
For a ppgtt that we are constructing, there is no struct_mutex
dependence so skip it. In the process, also ping the scheduler
frequently to try and avoid the NMI watchdog.
v2: gen6 requires struct_mutex to clean up (currently)
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=107094
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180703135331.12265-1-chris@chris-wilson.co.uk
live_gtt is a very slow test to run, simply because it tries to allocate
and use as much as the 48b address space as possibly can and in the
process will try to own all of the system memory. This leads to resource
exhaustion and CPU starvation; the latter impacts us when the NMI
watchdog declares a task hung due to a mutex contention with ourselves.
This we can prevent by releasing the struct_mutex and forcing our
i915/rcu workers to run, and in particular flushing the freed object
worker that is the cause for concern.
References: https://bugs.freedesktop.org/show_bug.cgi?id=107094
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180703101829.7360-1-chris@chris-wilson.co.uk
make_obj_busy() makes a dummy busy object, but didn't attach the fence
to the reservation object, so it would not have registered as busy. For
completeness, attach the dummy request as the exclusive fence and mark
the object as written (in i915_vma_move_to_active)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180629133717.11761-2-chris@chris-wilson.co.uk
We correctly attach the exclusive fetch for the scratch object when
emitting a request that writes into it, but for completeness we should
also declared the write to i915_vma_move_to_active()
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180629133717.11761-1-chris@chris-wilson.co.uk
This patch unifies the naming of DRM functions for reference counting
of struct drm_device. The resulting code is more aligned with the rest
of the Linux kernel interfaces.
Signed-off-by: Thomas Zimmermann <tdz@users.sourceforge.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180618110154.30462-6-tdz@users.sourceforge.net
on all platforms back to HSW. As well many other fix and improvements,
Including:
- Use GEM suspend when aborting initialization (Chris)
- Change i915_gem_fault to return vm_fault_t (Chris)
- Expand VMA to Non gem object entities (Chris)
- Improve logs for load failure, but quite logging on fault injection to avoid noise on CI (Chris)
- Other page directory handling fixes and improvements for gen6 (Chris)
- Other gtt clean-up removing redundancies and unused checks (Chris)
- Reorder aliasing ppgtt fini (Chris)
- Refactor of unsetting obg->mm.pages (Chris)
- Apply batch location restrictions before pinning (Chris)
- Ringbuffer fixes for context restore (Chris)
- Execlist fixes on freeing error pointer on allocation error (Chris)
- Make closing request flush mandatory (Chris)
- Move GEM sanitize from resume_early to resume (Chris)
- Improve debug dumps (Chris)
- Silent compiler for selftest (Chris)
- Other execlists changes to improve hangcheck and reset.
- Many gtt page directory fixes and improvements (Chris)
- Reorg context workarounds (Chris)
- Avoid ERR_PTR dereference on selftest (Chris)
Other GEM related work:
- Stop trying to reset GPU if reset failed (Mika)
- Add HW workaround for KBL to fix GPU reset (Mika)
- Fix context ban and hang accounting for client (Mika)
- Fixes on OA perf (Michel, Jani)
- Refactor on GuC log mechanisms (Piotr)
- Enable provoking vertex fix on Gen9 system (Kenneth)
More ICL patches for Display enabling:
- ICL - 10-bit support for HDMI (RK)
- ICL - Start adding TBT PLL (Paulo)
- ICL - DDI HDMK level selection (Manasi)
- ICL - GMBUS GPIO pin mapping fix (Mahesh)
- ICL - Adding DP_AUX_E support (James)
- ICL - Display interrupts handling (DK)
Other display fixes and improvements:
- Fix sprite destination color keying on SKL+ (Ville)
- Fixes and improvements on PCH detection, specially for non PCH systems (Jani)
- Document PCH_NOP (Lucas)
- Allow DBLSCAN user modes with eDP/LVDS/DSI (Ville)
- Opregion and ACPI cleanup and organization (Jani)
- Kill delays when activation psr (Rodrigo)
- ...and a consequent fix of the psr activation flow (DK)
- Fix HDMI infoframe setting (Imre)
- Fix Display interrupts and modes on old gens (Ville)
- Start switching to kernel unsigned int types (Jani)
- Introduction to Amber Lake and Whiskey Lake platforms (Jose)
- Audio clock fixes for HBR3 (RK)
- Standardize i915_reg.h definitions according to our doc and checkpatch (Paulo)
- Remove unused timespec_to_jiffies_timeout function (Arnd)
- Increase the scope of PSR wake fix for other VBTs out there (Vathsala)
- Improve debug msgs with prop name/id (Ville)
- Other clean up on unecessary cursor size defines (Ville)
- Enforce max hdisplay/hblank_start limits on HSW/BDW (Ville)
- Make ELD pointers constant (Jani)
- Fix for PSR VBT parse (Colin)
- Add warn about unsupported CDCLK rates (Imre)
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbKsMqAAoJEPpiX2QO6xPKI64H/0dHkMxw7/D83eODTJteDFBN
h3tdBnLFlPfeG3ZWDeSs04/dM4e9YacMN7v53j1ia4eW/F1ms0TLcegcuPqYafTW
H8fhwGB2B5gmr5hLfh5joQkxvaucQMFdg95fWRqir93VrKvVJAJEYNcaiGniejDf
qqiZue6DgAzli0zjAprfbQsnJ17TyRtnxm8lLIcFcHPoayHBzAUBZQEP6cA5qe/Y
/2ahGfkYOVVWY08DHaioDBOLUEUbxCC1AvMlv9VbtKmyPoQjTIW/1iTq0RRxDoGb
BwfDvigSiFAmpYEfVENB0qUd9e/0WhMboSnMrfzEcF2yUn4xoJx5nbmkRFkr1jI=
=mfO6
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-2018-06-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Chris is doing many reworks that allow us to get full-ppgtt supported
on all platforms back to HSW. As well many other fix and improvements,
Including:
- Use GEM suspend when aborting initialization (Chris)
- Change i915_gem_fault to return vm_fault_t (Chris)
- Expand VMA to Non gem object entities (Chris)
- Improve logs for load failure, but quite logging on fault injection to avoid noise on CI (Chris)
- Other page directory handling fixes and improvements for gen6 (Chris)
- Other gtt clean-up removing redundancies and unused checks (Chris)
- Reorder aliasing ppgtt fini (Chris)
- Refactor of unsetting obg->mm.pages (Chris)
- Apply batch location restrictions before pinning (Chris)
- Ringbuffer fixes for context restore (Chris)
- Execlist fixes on freeing error pointer on allocation error (Chris)
- Make closing request flush mandatory (Chris)
- Move GEM sanitize from resume_early to resume (Chris)
- Improve debug dumps (Chris)
- Silent compiler for selftest (Chris)
- Other execlists changes to improve hangcheck and reset.
- Many gtt page directory fixes and improvements (Chris)
- Reorg context workarounds (Chris)
- Avoid ERR_PTR dereference on selftest (Chris)
Other GEM related work:
- Stop trying to reset GPU if reset failed (Mika)
- Add HW workaround for KBL to fix GPU reset (Mika)
- Fix context ban and hang accounting for client (Mika)
- Fixes on OA perf (Michel, Jani)
- Refactor on GuC log mechanisms (Piotr)
- Enable provoking vertex fix on Gen9 system (Kenneth)
More ICL patches for Display enabling:
- ICL - 10-bit support for HDMI (RK)
- ICL - Start adding TBT PLL (Paulo)
- ICL - DDI HDMK level selection (Manasi)
- ICL - GMBUS GPIO pin mapping fix (Mahesh)
- ICL - Adding DP_AUX_E support (James)
- ICL - Display interrupts handling (DK)
Other display fixes and improvements:
- Fix sprite destination color keying on SKL+ (Ville)
- Fixes and improvements on PCH detection, specially for non PCH systems (Jani)
- Document PCH_NOP (Lucas)
- Allow DBLSCAN user modes with eDP/LVDS/DSI (Ville)
- Opregion and ACPI cleanup and organization (Jani)
- Kill delays when activation psr (Rodrigo)
- ...and a consequent fix of the psr activation flow (DK)
- Fix HDMI infoframe setting (Imre)
- Fix Display interrupts and modes on old gens (Ville)
- Start switching to kernel unsigned int types (Jani)
- Introduction to Amber Lake and Whiskey Lake platforms (Jose)
- Audio clock fixes for HBR3 (RK)
- Standardize i915_reg.h definitions according to our doc and checkpatch (Paulo)
- Remove unused timespec_to_jiffies_timeout function (Arnd)
- Increase the scope of PSR wake fix for other VBTs out there (Vathsala)
- Improve debug msgs with prop name/id (Ville)
- Other clean up on unecessary cursor size defines (Ville)
- Enforce max hdisplay/hblank_start limits on HSW/BDW (Ville)
- Make ELD pointers constant (Jani)
- Fix for PSR VBT parse (Colin)
- Add warn about unsupported CDCLK rates (Imre)
Signed-off-by: Dave Airlie <airlied@redhat.com>
# gpg: Signature made Thu 21 Jun 2018 07:12:10 AM AEST
# gpg: using RSA key FA625F640EEB13CA
# gpg: Good signature from "Rodrigo Vivi <rodrigo.vivi@intel.com>"
# gpg: aka "Rodrigo Vivi <rodrigo.vivi@gmail.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6D20 7068 EEDD 6509 1C2C E2A3 FA62 5F64 0EEB 13CA
Link: https://patchwork.freedesktop.org/patch/msgid/20180625165622.GA21761@intel.com
- Ice Lake's workarounds (Oscar and Yunwei)
- Ice Lake interrupt registers fixes (Oscar)
- Context switch timeline fixes and improvements (Chris)
- Spelling fixes (Colin)
- GPU reset fixes and improvements (Chris)
- Including fixes on execlist and preemption for a proper GPU reset (Chris)
- Clean-up the port pipe select bits (Ville)
- Other execlist improvements (Chris)
- Remove unused enable_cmd_parser parameter (Chris)
- Fix order of enabling pipe/transcoder/planes on HSW+ to avoid hang on ICL (Paulo)
- Simplification and changes on intel_context (Chris)
- Disable LVDS on Radiant P845 (Ondrej)
- Improve HSW/BDW voltage swing handling (Ville)
- Cleanup and renames on few parts of intel_dp code to make code clear and less confusing (Ville)
- Move acpi lid notification code for fixing LVDS (Chris)
- Speed up GPU idle detection (Chris)
- Make intel_engine_dump irqsafe (Chris)
- Fix GVT crash (Zhenyu)
- Move GEM BO inside drm_framebuffer and use intel_fb_obj everywhere (Chris)
- Revert edp's alternate fixed mode (Jani)
- Protect tainted function pointer lookup (Chris)
- And subsequent unsigned long size fix (Chris)
- Allow page directory allocation to fail (Chris)
- VBT's edp and lvds fix and clean-up (Ville)
- Many other reorganizations and cleanups on DDI and DP code, as well on scaler and planes (Ville)
- Selftest pin the mock kernel context (Chris)
- Many PSR Fixes, clean-up and improvements (Dhinakaran)
- PSR VBT fix (Vathsala)
- Fix i915_scheduler and intel_context declaration (Tvrtko)
- Improve PCH underruns detection on ILK-IVB (Ville)
- Few s/drm_priv/i915 (Chris, Michal)
- Notify opregion of the sanitized encoder state (Maarten)
- Guc's event handling improvements and fixes on initialization failures (Michal)
- Many gtt fixes and improvements (Chris)
- Fixes and improvements for Suspend and Freeze safely (Chris)
- i915_gem init and fini cleanup and fixes (Michal)
- Remove obsolete switch_mm for gen8+ (Chris)
- hw and context id fixes for GuC (Lionel)
- Add new vGPU cap info bit VGT_CAPS_HUGE_GTT (Changbin)
- Make context pin/unpin symmetric (Chris)
- vma: Move the bind_count vs pin_count assertion to a helper (Chris)
- Use available SZ_1M instead of 1 << 20 (Chris)
- Trace and PMU fixes and improvements (Tvrtko)
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbGFxbAAoJEPpiX2QO6xPK618H/i+VkEGB+Qdr3h3bwhwVSWB1
TzHZKFSDxznm3rDGU9argGc/nk0af4Kbq1+jnG9FYou2bmW7+wRu9RwIiX4Dggmy
FJUHTZDm4lkP3KVlTGL9IbmS9/P6Opxdw9Hyn3WwpfDK2lg9KrRy3NwBtsxaLF6w
ZM8hrabsnv0p9RRbNNqb9PJmDJCyoCeyvKgQPeHxHrwiV3VLsqerbuWRAHAQ90Vz
/7hPvl6EcujpQR0xeaHt2+dFP2FTVVbVwyFyU4JMc5iPEDdQGOwPmxZCK8c7Khil
Uoy1iUtoE5YKrcutEfFhUDigkYIB4N6WSAVrWPxEaHYQzx3XyewZtKIxDHVHpMI=
=bbkZ
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-2018-06-06' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
- Ice Lake's display enabling patches (Jose, Mahesh, Dhinakaran, Paulo, Manasi, Anusha, Arkadiusz)
- Ice Lake's workarounds (Oscar and Yunwei)
- Ice Lake interrupt registers fixes (Oscar)
- Context switch timeline fixes and improvements (Chris)
- Spelling fixes (Colin)
- GPU reset fixes and improvements (Chris)
- Including fixes on execlist and preemption for a proper GPU reset (Chris)
- Clean-up the port pipe select bits (Ville)
- Other execlist improvements (Chris)
- Remove unused enable_cmd_parser parameter (Chris)
- Fix order of enabling pipe/transcoder/planes on HSW+ to avoid hang on ICL (Paulo)
- Simplification and changes on intel_context (Chris)
- Disable LVDS on Radiant P845 (Ondrej)
- Improve HSW/BDW voltage swing handling (Ville)
- Cleanup and renames on few parts of intel_dp code to make code clear and less confusing (Ville)
- Move acpi lid notification code for fixing LVDS (Chris)
- Speed up GPU idle detection (Chris)
- Make intel_engine_dump irqsafe (Chris)
- Fix GVT crash (Zhenyu)
- Move GEM BO inside drm_framebuffer and use intel_fb_obj everywhere (Chris)
- Revert edp's alternate fixed mode (Jani)
- Protect tainted function pointer lookup (Chris)
- And subsequent unsigned long size fix (Chris)
- Allow page directory allocation to fail (Chris)
- VBT's edp and lvds fix and clean-up (Ville)
- Many other reorganizations and cleanups on DDI and DP code, as well on scaler and planes (Ville)
- Selftest pin the mock kernel context (Chris)
- Many PSR Fixes, clean-up and improvements (Dhinakaran)
- PSR VBT fix (Vathsala)
- Fix i915_scheduler and intel_context declaration (Tvrtko)
- Improve PCH underruns detection on ILK-IVB (Ville)
- Few s/drm_priv/i915 (Chris, Michal)
- Notify opregion of the sanitized encoder state (Maarten)
- Guc's event handling improvements and fixes on initialization failures (Michal)
- Many gtt fixes and improvements (Chris)
- Fixes and improvements for Suspend and Freeze safely (Chris)
- i915_gem init and fini cleanup and fixes (Michal)
- Remove obsolete switch_mm for gen8+ (Chris)
- hw and context id fixes for GuC (Lionel)
- Add new vGPU cap info bit VGT_CAPS_HUGE_GTT (Changbin)
- Make context pin/unpin symmetric (Chris)
- vma: Move the bind_count vs pin_count assertion to a helper (Chris)
- Use available SZ_1M instead of 1 << 20 (Chris)
- Trace and PMU fixes and improvements (Tvrtko)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180611162737.GA2378@intel.com
Fix i915's CI build after the removal of the dmabuf->kmap interface that
left the mock routines intact.
In file included from drivers/gpu/drm/i915/i915_gem_dmabuf.c:335:0:
drivers/gpu/drm/i915/selftests/mock_dmabuf.c:104:13: error: ‘mock_dmabuf_kunmap_atomic’ defined but not used [-Werror=unused-function]
static void mock_dmabuf_kunmap_atomic(struct dma_buf *dma_buf, unsigned long page_num, void *addr)
drivers/gpu/drm/i915/selftests/mock_dmabuf.c:97:14: error: ‘mock_dmabuf_kmap_atomic’ defined but not used [-Werror=unused-function]
static void *mock_dmabuf_kmap_atomic(struct dma_buf *dma_buf, unsigned long page_num)
Fixes: f664a52695 ("dma-buf: remove kmap_atomic interface")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180620162152.1158-1-chris@chris-wilson.co.uk
Reviewed-by: Christian König <christian.koenig@amd.com>
We got a few conflicts in drm_atomic.c after merging the DRM writeback support,
now we need a backmerge to unlock develop development on drm-misc-next.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Neither used nor correctly implemented anywhere. Just completely remove
the interface.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/226645/
Along the early error path for igt_switch_to_kernel_context we may try
to dereference an invalid error pointer. Instead, return early rather
than dump the GEM trace since we haven't yet emitted anything of
interest.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 09a4c02e58 ("drm/i915: Look for an active kernel context before switching")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180620112441.13085-1-chris@chris-wilson.co.uk
With an old (4.7.3 on 32bit) gcc, it emits a warning for
In file included from drivers/gpu/drm/i915/i915_request.c:1425:0:
drivers/gpu/drm/i915/selftests/i915_request.c: In function ‘live_nop_request’:
drivers/gpu/drm/i915/selftests/i915_request.c:380:21: error: ‘request’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
Silence it by just setting it to NULL on initialisation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180614124923.18071-1-chris@chris-wilson.co.uk
For symmetry, simplicity and ensuring the request is always truly idle
upon its completion, always emit the closing flush prior to emitting the
request breadcrumb. Previously, we would only emit the flush if we had
started a user batch, but this just leaves all the other paths open to
speculation (do they affect the GPU caches or not?) With mm switching, a
key requirement is that the GPU is flushed and invalidated before hand,
so for absolute safety, we want that closing flush be mandatory.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180612105135.4459-1-chris@chris-wilson.co.uk
The legacy gen6 ppgtt needs a little more hand holding than gen8+, and
so requires a larger structure. As I intend to make this slightly more
complicated in the future, separate the gen6 from the core gen8 hw
struct by subclassing. This patch moves the gen6 only features out to
gen6_hw_ppgtt and pipes the new type everywhere that needs it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180612081815.3585-1-chris@chris-wilson.co.uk
In the next patch, we will subclass the gen6 hw_ppgtt. In order, for the
two different generations of hw ppgtt stucts to be of different size,
push the allocation down to the constructor.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180607163040.9781-1-chris@chris-wilson.co.uk
To allow for future non-object backed vma, we need to be able to
specialise the callbacks for binding, et al, the vma. For example,
instead of calling vma->vm->bind_vma(), we now call
vma->ops->bind_vma(). This gives us the opportunity to later override the
operation for a custom vma.
v2: flip order of unbind/bind
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180607154047.9171-2-chris@chris-wilson.co.uk
In the near future, I want to subclass gen6_hw_ppgtt as it contains a
few specialised members and I wish to add more. To avoid the ugliness of
using ppgtt->base.base, rename the i915_hw_ppgtt base member
(i915_address_space) as vm, which is our common shorthand for an
i915_address_space local.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk
We were not very carefully checking to see if an older request on the
engine was an earlier switch-to-kernel-context before deciding to emit a
new switch. The end result would be that we could get into a permanent
loop of trying to emit a new request to perform the switch simply to
flush the existing switch.
What we need is a means of tracking the completion of each timeline
versus the kernel context, that is to detect if a more recent request
has been submitted that would result in a switch away from the kernel
context. To realise this, we need only to look in our syncmap on the
kernel context and check that we have synchronized against all active
rings.
v2: Since all ringbuffer clients currently share the same timeline, we do
have to use the gem_context to distinguish clients.
As a bonus, include all the tracing used to debug the death inside
suspend.
v3: Test, test, test. Construct a selftest to exercise and assert the
expected behaviour that multiple switch-to-contexts do not emit
redundant requests.
Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
Fixes: a89d1f921c ("drm/i915: Split i915_gem_timeline into individual timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180524081135.15278-1-chris@chris-wilson.co.uk
When testing reset, we wait for 1s on the main thread for the hang to
start. Meanwhile, we continue submitting requests on all the background
threads, and we may have more threads than cores and so potentially
starve the waiter from being woken within the timeout. As the hang
timeout and the active timeouts are the same, it is hard to distinguish
which caused the timeout. Bump the active thread timeouts to 5s,
compared to the 1s timeout for the hang, so that we preferentially
report the hang timing out, while hopefully ensuring that we do at least
wake up the hang thread first before declaring the background active
timeout.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180517142442.16979-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
As all backends implement the same pin_count mechanism and do a
dec-and-test as their first step, pull that into the common
intel_context_unpin(). This also pulls into the caller, eliminating the
indirect call in the usual steady state case. The intel_context_pin()
side is a little more complicated as it combines the lookup/alloc as
well as pinning the state, and so is left for a later date.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-4-chris@chris-wilson.co.uk
To ease the frequent and ugly pointer dance of
&request->gem_context->engine[request->engine->id] during request
submission, store that pointer as request->hw_context. One major
advantage that we will exploit later is that this decouples the logical
context state from the engine itself.
v2: Set mock_context->ops so we don't crash and burn in selftests.
Cleanups from Tvrtko.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-3-chris@chris-wilson.co.uk
In the next patch, we want to store the intel_context pointer inside
i915_request, as it is frequently access via a convoluted dance when
submitting the request to hw. Having two context pointers inside
i915_request leads to confusion so first rename the existing
i915_gem_context pointer to i915_request.gem_context.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180517212633.24934-1-chris@chris-wilson.co.uk
We write all 4K page entries, even when using 64K pages. In order to
verify that the HW isn't cheating by using the 4K PTE instead of the 64K
PTE, we want to remove all the surplus entries. If the HW skipped the
64K PTE, it will read/write into the scratch page instead - which we
detect as missing results during selftests.
v2: much improved commentary (Chris)
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Changbin Du <changbin.du@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180511095140.25590-1-matthew.auld@intel.com
In igt_flush_test() we try to switch back to the kernel context, but we
are only able to do so when we are called with struct_mutex held.
More of my CI fallout from lockdep being temporarily suppressed :(
Fixes: 4cdf65ce8c ("drm/i915/selftests: Return to kernel context after each test")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180509065926.19207-1-chris@chris-wilson.co.uk
Calling mock_engine() calls i915_timeline_init() and that requires
struct_mutex to be held as it adds itself to the global list of
timelines. This error was introduced by commit a89d1f921c ("drm/i915:
Split i915_gem_timeline into individual timelines") but the issue was
masked in CI by the earlier lockdep spam.
Fixes: a89d1f921c ("drm/i915: Split i915_gem_timeline into individual timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180508211056.17151-1-chris@chris-wilson.co.uk
As we flush each test and wait for idle before the next, also switch
back to the kernel context. This helps limit the amount of collateral
damage a test may cause by resetting to the default state each time (and
also helps clean up temporaries used by the test).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180508115312.12628-1-chris@chris-wilson.co.uk
igt_ctx_exec() expects that we retire all active requests/objects before
completing, so that when we clean up the files afterwards they are ready
to be freed. Before we do so, it is then prudent to ensure that we have
indeed retired the GPU activity, raising an error if it fails. If we do
not, we run the risk of triggering an assertion when freeing the object:
__i915_gem_free_objects:4793 GEM_BUG_ON(i915_gem_object_is_active(obj))
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180505091014.26126-2-chris@chris-wilson.co.uk
When userspace is passing around swapbuffers using DRI, we frequently
have to open and close the same object in the foreign address space.
This shows itself as the same object being rebound at roughly 30fps
(with a second object also being rebound at 30fps), which involves us
having to rewrite the page tables and maintain the drm_mm range manager
every time.
However, since the object still exists and it is only the local handle
that disappears, if we are lazy and do not unbind the VMA immediately
when the local user closes the object but defer it until the GPU is
idle, then we can reuse the same VMA binding. We still have to be
careful to mark the handle and lookup tables as closed to maintain the
uABI, just allowing the underlying VMA to be resurrected if the user is
able to access the same object from the same context again.
If the object itself is destroyed (neither userspace keeping a handle to
it), the VMA will be reaped immediately as usual.
In the future, this will be even more useful as instantiating a new VMA
for use on the GPU will become heavier. A nuisance indeed, so nip it in
the bud.
v2: s/__i915_vma_final_close/i915_vma_destroy/ etc.
v3: Leave a hint as to why we deferred the unbind on close.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180503195115.22309-1-chris@chris-wilson.co.uk
We need to move to a more flexible timeline that doesn't assume one
fence context per engine, and so allow for a single timeline to be used
across a combination of engines. This means that preallocating a fence
context per engine is now a hindrance, and so we want to introduce the
singular timeline. From the code perspective, this has the notable
advantage of clearing up a lot of mirky semantics and some clumsy
pointer chasing.
By splitting the timeline up into a single entity rather than an array
of per-engine timelines, we can realise the goal of the previous patch
of tracking the timeline alongside the ring.
v2: Tweak wait_for_idle to stop the compiling thinking that ret may be
uninitialised.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-2-chris@chris-wilson.co.uk
In the future, we want to move a request between engines. To achieve
this, we first realise that we have two timelines in effect here. The
first runs through the GTT is required for ordering vma access, which is
tracked currently by engine. The second is implied by sequential
execution of commands inside the ringbuffer. This timeline is one that
maps to userspace's expectations when submitting requests (i.e. given the
same context, batch A is executed before batch B). As the rings's
timelines map to userspace and the GTT timeline an implementation
detail, move the timeline from the GTT into the ring itself (per-context
in logical-ring-contexts/execlists, or a global per-engine timeline for
the shared ringbuffers in legacy submission.
The two timelines are still assumed to be equivalent at the moment (no
migrating requests between engines yet) and so we can simply move from
one to the other without adding extra ordering.
v2: Reinforce that one isn't allowed to mix the engine execution
timeline with the client timeline from userspace (on the ring).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180502163839.3248-1-chris@chris-wilson.co.uk
The old wait_on_atomic_t used a custom callback to perform the
schedule(), which used my return semantics of reporting an error code on
timeout. wait_var_event_timeout() uses the schedule() return semantics
of reporting the remaining jiffies (1 if it timed out with 0 jiffies
remaining!) and 0 on failure. This semantic mismatch lead to us falsely
claiming a time out occurred.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106085
Fixes: d224985a5e ("sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180417170638.20550-1-chris@chris-wilson.co.uk
We don't need to track every ring for its lifetime as they are managed
by the contexts/engines. What we do want to track are the live rings so
that we can sporadically clean up requests if userspace falls behind. We
can simply restrict the gt->rings list to being only gt->live_rings.
v2: s/live/active/ for consistency with gt.active_requests
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180430131503.5375-4-chris@chris-wilson.co.uk
In the next patch, rings are the central timeline as requests may jump
between engines. Therefore in the future as we retire in order along the
engine timeline, we may retire out-of-order within a ring (as the ring now
occurs along multiple engines), leading to much hilarity in miscomputing
the position of ring->head.
As an added bonus, retiring along the ring reduces the penalty of having
one execlists client do cleanup for another (old legacy submission
shares a ring between all clients). The downside is that slow and
irregular (off the critical path) process of cleaning up stale requests
after userspace becomes a modicum less efficient.
In the long run, it will become apparent that the ordered
ring->request_list matches the ring->timeline, a fun challenge for the
future will be unifying the two lists to avoid duplication!
v2: We need both engine-order and ring-order processing to maintain our
knowledge of where individual rings have completed upto as well as
knowing what was last executing on any engine. And finally by decoupling
retiring the contexts on the engine and the timelines along the rings,
we do have to keep a reference to the context on each request
(previously it was guaranteed by the context being pinned).
v3: Not just a reference to the context, but we need to keep it pinned
as we manipulate the rings; i.e. we need a pin for both the manipulation
of the engine state during its retirements, and a separate pin for the
manipulation of the ring state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180430131503.5375-3-chris@chris-wilson.co.uk
Make life easier in upcoming patches by moving the context_pin and
context_unpin vfuncs into inline helpers.
v2: Fixup mock_engine to mark the context as pinned on use.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180430131503.5375-2-chris@chris-wilson.co.uk
Even though we weren't injecting guilty requests to be reset, we could
still fall over the issue of resetting the same request too fast -- where
the GPU refuses to start again. (Although it is interesting to note that
reloading the driver is sufficient, suggesting that we could recover if
we delayed the setup after reset?) Continue to paper over the problem by
adding a small delay by waiting for the engine to idle between tests,
and ensure that the engines are idle before starting the idle tests.
v2: Replace single instance of 50 with a magic macro.
References: 028666793a ("drm/i915/selftests: Avoid repeatedly harming the same innocent context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180411120346.27618-1-chris@chris-wilson.co.uk
There is a potential execution path in which variable err is
returned without being properly initialized previously.
Fix this by initializing variable err to 0.
Addresses-Coverity-ID: 1468362 ("Uninitialized scalar variable")
Fixes: f4ecfbfc32 ("drm/i915: Check whitelist registers across resets")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180424131545.GA4053@embeddedor.com
Today we only want to pass along the priority to engine->schedule(), but
in the future we want to have much more control over the various aspects
of the GPU during a context's execution, for example controlling the
frequency allowed. As we need an ever growing number of parameters for
scheduling, move those into a struct for convenience.
v2: Move the anonymous struct into its own function for legibility and
ye olde gcc.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180418184052.7129-3-chris@chris-wilson.co.uk
Silence smatch over:
drivers/gpu/drm/i915/selftests/intel_workarounds.c:58 read_nonprivs() error: 'cs' dereferencing possible ERR_PTR()
by handling a potential (but unlikely) failure of intel_ring_begin.
Fixes: f4ecfbfc32 ("drm/i915: Check whitelist registers across resets")
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1523915821-30624-1-git-send-email-oscar.mateo@intel.com
Add a selftest to ensure that we restore the whitelisted registers after
rewrite the registers everytime they might be scrubbed, e.g. module
load, reset and resume. For the other volatile workaround registers, we
export their presence via debugfs and check in igt/gem_workarounds.
However, we don't export the whitelist and rather than do so, let's test
them directly in the kernel.
The test we use is to read the registers back from the CS (this helps us
be sure that the registers will be valid for MI_LRI etc). In order to
generate the expected list, we split intel_whitelist_workarounds_emit
into two phases, the first to build the list and the second to apply.
Inside the test, we only build the list and then check that list against
the hw.
v2: Filter out pre-gen8 as they do not have RING_NONPRIV.
v3: Drop unused engine parameter, no plans to use it now or future.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180414122754.569-1-chris@chris-wilson.co.uk
Currently, we rely on inspecting the hangcheck state from within the
i915_reset() routines to determine which engines were guilty of the
hang. This is problematic for cases where we want to run
i915_handle_error() and call i915_reset() independently of hangcheck.
Instead of relying on the indirect parameter passing, turn it into an
explicit parameter providing the set of stalled engines which then are
treated as guilty until proven innocent.
While we are removing the implicit stalled parameter, also make the
reason into an explicit parameter to i915_reset(). We still need a
back-channel for i915_handle_error() to hand over the task to the locked
waiter, but let's keep that its own channel rather than incriminate
another.
This leaves stalled/seqno as being private to hangcheck, with no more
nefarious snooping by reset, be it whole-device or per-engine. \o/
The only real issue now is that this makes it crystal clear that we
don't actually do any testing of hangcheck per se in
drv_selftest/live_hangcheck, merely of resets!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180406220354.18911-2-chris@chris-wilson.co.uk
If we are resetting just one engine, we know it has stalled. So we can
pass the stalled parameter directly to i915_gem_reset_engine(), which
alleviates the necessity to poke at the generic engine->hangcheck.stalled
magic variable, leaving that under control of hangcheck as its name
implies. Other than simplifying by removing the indirect parameter along
this path, this allows us to introduce new reset mechanisms that run
independently of hangcheck.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180406220354.18911-1-chris@chris-wilson.co.uk
Tvrtko mentioned that wait_for_hang() was confusing as it does not
actually wait for the aforementioned hang, just until the request is
running and we are *ready* to inject a hang. A quick
s/wait_for_hang/wait_until_running/ removes that confusion without
having to rethink the naming scheme, immediately at least.
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180406100950.19033-1-chris@chris-wilson.co.uk
We don't handle resetting the kernel context very well, or presumably any
context executing its breadcrumb commands in the ring as opposed to the
batchbuffer and flush. If we trigger a device reset twice in quick
succession while the kernel context is executing, we may end up skipping
the breadcrumb. This is really only a problem for the selftest as
normally there is a large interlude between resets (hangcheck), or we
focus on resetting just one engine and so avoid repeatedly resetting
innocents.
Something to try would be a preempt-to-idle to quiesce the engine before
reset, so that innocent contexts would be spared the reset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
CC: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180330131801.18327-1-chris@chris-wilson.co.uk
Before adding a new feature to execlists submission, we should endeavour
to cover the baseline behaviour with selftests. So start the ball
rolling.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
CC: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180404093329.5383-1-chris@chris-wilson.co.uk
Pull wait_var_event updates from Ingo Molnar:
"This introduces the new wait_var_event() API, which is a more flexible
waiting primitive than wait_on_atomic_t().
All wait_on_atomic_t() users are migrated over to the new API and
wait_on_atomic_t() is removed. The migration fixes one bug and should
result in no functional changes for the other usecases"
* 'sched-wait-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/wait: Improve __var_waitqueue() code generation
sched/wait: Remove the wait_on_atomic_t() API
sched/wait, arch/mips: Fix and convert wait_on_atomic_t() usage to the new wait_var_event() API
sched/wait, fs/ocfs2: Convert wait_on_atomic_t() usage to the new wait_var_event() API
sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API
sched/wait, fs/fscache: Convert wait_on_atomic_t() usage to the new wait_var_event() API
sched/wait, fs/btrfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API
sched/wait, fs/afs: Convert wait_on_atomic_t() usage to the new wait_var_event() API
sched/wait, drivers/media: Convert wait_on_atomic_t() usage to the new wait_var_event() API
sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API
sched/wait: Introduce wait_var_event()
Watch what happens if we try to reset with a queue of requests with
varying priorities -- that may need reordering or preemption across the
reset.
v2: Tweak priorities to avoid starving the hanging thread.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180322073533.5313-2-chris@chris-wilson.co.uk
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
If we fail to reset the GPU in a timely fashion, dump the GEM trace so
that we can see what operations were in flight when the GPU got stuck.
v2: There's more than one timeout that deserves tracing!
v3: Silence checkpatch by not even using a product at all!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180322074908.10838-1-chris@chris-wilson.co.uk
Not all callers want the GPU error to handled in the same way, so expose
a control parameter. In the first instance, some callers do not want the
heavyweight error capture so add a bit to request the state to be
captured and saved.
v2: Pass msg down to i915_reset/i915_reset_engine so that we include the
reason for the reset in the dev_notice(), superseding the earlier option
to not print that notice.
v3: Stash the reason inside the i915->gpu_error to handover to the direct
reset from the blocking waiter.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180320100449.1360-2-chris@chris-wilson.co.uk
The old wait_on_atomic_t() is going to get removed, use the more
flexible wait_var_event() API instead.
Unlike wake_up_atomic_t(), wake_up_var() will issue the wakeup
even if the variable is not 0.
No change in functionality.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Check that the entries are in reverse gen order and that all entries
with gen > 0 have an mmio base set.
v2: loop forward, simplify logic, use i915_subtests (Chris)
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180314182653.26981-2-daniele.ceraolospurio@intel.com
The main difference with previous GENs is that starting from Gen11
each VCS and VECS engine has its own power well, which only exist
if the related engine exists in the HW.
The fallback forcewake request workaround is only needed on gen9
according to the HSDES WA entry (1604254524), so we can go back to using
the simpler fw_domains_get/put functions.
BSpec: 18331
v2: fix fwtable, use array to test shadow tables, create new
accessors to avoid check on every access (Tvrtko)
v3 (from Paulo): Rebase.
v4:
- Range 09400-097FF should be FORCEWAKE_ALL (Daniele)
- Use the BIT macro for forcewake domains (Daniele)
- Add a comment about the range ordering (Oscar)
- Updated commit message (Oscar)
v5: Rebased
v6: Use I915_MAX_VCS/VECS (Michal)
v7: translate FORCEWAKE_ALL to available domains
v8: rebase, add clarification on fallback ack in commit message.
v9: fix rebase issue, change check in fw_domains_init from IS_GEN11
to GEN >= 11
v10: Generate is_genX_shadowed with a macro (Daniele)
Include gen11_fw_ranges in the selftest (Michel)
v11: Simplify FORCEWAKE_ALL, new line between NEEDS_FORCEWAKEs (Tvrtko)
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-6-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Mostly doc/print messages that were not updated after commit e61e0f51ba
("drm/i915: Rename drm_i915_gem_request to i915_request").
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222172405.11386-1-michel.thierry@intel.com
We want to de-emphasize the link between the request (dependency,
execution and fence tracking) from GEM and so rename the struct from
drm_i915_gem_request to i915_request. That is we may implement the GEM
user interface on top of requests, but they are an abstraction for
tracking execution rather than an implementation detail of GEM. (Since
they are not tied to HW, we keep the i915 prefix as opposed to intel.)
In short, the spatch:
@@
@@
- struct drm_i915_gem_request
+ struct i915_request
A corollary to contracting the type name, we also harmonise on using
'rq' shorthand for local variables where space if of the essence and
repetition makes 'request' unwieldy. For globals and struct members,
'request' is still much preferred for its clarity.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
i915 is the only driver using those fields in the drm_gem_object
structure, so they only waste memory for all other drivers.
Move the fields into drm_i915_gem_object instead and patch the i915 code
with the following sed commands:
sed -i "s/obj->base.read_domains/obj->read_domains/g" drivers/gpu/drm/i915/*.c drivers/gpu/drm/i915/*/*.c
sed -i "s/obj->base.write_domain/obj->write_domain/g" drivers/gpu/drm/i915/*.c drivers/gpu/drm/i915/*/*.c
Change is only compile tested.
v2: move fields around as suggested by Chris.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180216124338.9087-1-christian.koenig@amd.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fix inconsistent IS_ERR and PTR_ERR in shrink_boom.
The proper pointer to use is _explode_ instead of _purge_.
This issue was detected with the help of Coccinelle.
Fixes: fe215c8bc4 ("drm/i915/selftests: add missing gtt shrinker test")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180214211234.GA22341@embeddedgus
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we remove some hardcoded assumptions about the preempt context having
a fixed id, reserved from use by normal user contexts, we may only
allocate the i915_gem_context when required. Then the subsequent
decisions on using preemption reduce to having the preempt context
available.
v2: Include an assert that we don't allocate the preempt context twice.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-3-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
In the next patch, we may only conditionally allocate the preempt-client
if there is a global preempt context and so we need to be prepared in
case the preempt-client itself is NULL.
v2: Grep for more preempt_client.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-1-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Avoid injecting hangs in to the i915->kernel_context in case the GPU
reset leaves corruption in the context image in its wake (leading to
continual failures and system hangs after the selftests are ostensibly
complete). Use a sacrificial kernel_context instead.
v2: Closing a context is tricky; export a function (for selftests) from
i915_gem_context.c to get it right.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205152431.12163-2-chris@chris-wilson.co.uk
When injecting rapid resets, we must be careful to at least wait for the
previous reset to have taken effect and the engine restarted. If we
perform a second reset before that has happened, we will notice that the
engine hasn't recovered and declare it lost, wedging the device and
failing. In practice, since we wait for each hanging batch to start
before injecting the reset, this too-fast-reset condition can only be
triggered when moving onto the next engine in the test, so we need only
wait for the existing reset to complete before switching engines.
v2: Wrap up the wait inside a safety net to bail out in case of angry hw.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205152431.12163-1-chris@chris-wilson.co.uk
When testing that the timeout fired, we need to be sure we have waited
just long enough for the timeout to have occurred and for the softirq
(on another cpu) to have completed. Sleeping for an arbitrary amount is
prone to error, so wait for the timeout instead and complain if it was
too late.
v2: Use wait_event_timeout to provide an upper bound
v3: Fix inverted check for wait_event_timeout timing out
v4: Restore the check that the fences aren't signalled too early, by
inspecting them before the expected timeout.
References: https://bugs.freedesktop.org/show_bug.cgi?id=104670
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117135713.2324-1-chris@chris-wilson.co.uk
Check that we can successfully wait upon a dma_fence using the
i915_sw_fence, including the optional timeout mechanism.
v2: Account for the rounding up of the timeout to the next second.
Unfortunately, the minimum delay is then 1 second.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180115204348.8480-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reduce the number of GGTT PTE operations to speed the test up, but we
reduce the likelihood of spotting a coherency error in those operations.
However, Broxton is sporadically timing on this test, presumably because
its GGTT operations are all uncached.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171223110407.21402-1-chris@chris-wilson.co.uk
Now that we skip a per-engine reset on an idle engine, we need to update
the selftest to take that into account. In the process, we find that we
were not stressing the per-engine reset very hard, so add those missing
active resets.
v2: Actually test i915_reset_engine() by loading it with requests.
Fixes: f6ba181ada ("drm/i915: Skip an engine reset if it recovered before our preparations")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104313
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171217132852.30642-3-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
We have an existing helper for testing obj->mm.pages, so use it.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171218103855.25274-1-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>