Commit Graph

465 Commits

Author SHA1 Message Date
Tvrtko Ursulin
a50134b198 drm/i915: Make for_each_engine_masked work on intel_gt
Medium term goal is to eliminate the i915->engine[] array and to get there
we have recently introduced equivalent array in intel_gt. Now we need to
migrate the code further towards this state.

This next step is to eliminate usage of i915->engines[] from the
for_each_engine_masked iterator.

For this to work we also need to use engine->id as index when populating
the gt->engine[] array and adjust the default engine set indexing to use
engine->legacy_idx instead of assuming gt->engines[] indexing.

v2:
  * Populate gt->engine[] earlier.
  * Check that we don't duplicate engine->legacy_idx

v3:
  * Work around the initialization order issue between default_engines()
    and intel_engines_driver_register() which sets engine->legacy_idx for
    now. It will be fixed properly later.

v4:
  * Merge with forgotten v2.5.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191017161852.8836-1-tvrtko.ursulin@linux.intel.com
2019-10-18 00:06:25 +01:00
Daniele Ceraolo Spurio
0b23e2a6ed drm/i915/huc: improve documentation
Better explain the usage of the microcontroller and what i915 is
responsible of. While at it, fix the documentation for the auth
function, which doesn't do any pinning anymore.

v2: add a comment on HuC being optional and descrive how HuC accesses
    memory (Martin)
v3: add extra newline for better text organization (Martin)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Acked-by: Anna Karas <anna.karas@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191014183602.3643-3-daniele.ceraolospurio@intel.com
2019-10-17 09:30:34 -07:00
Daniele Ceraolo Spurio
218151e997 drm/i915/guc: improve documentation
Add a short description of what we expect from GuC and some minor
improvements to existing documentation. Also remove a comment about a
difference between GuC and HuC that is not true anymore.

v2: add that the GuC is not mandatory (Martin)
v3: add extra newline for better text organization (Martin)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Acked-by: Anna Karas <anna.karas@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191014183602.3643-2-daniele.ceraolospurio@intel.com
2019-10-17 09:30:32 -07:00
Chris Wilson
e9d4c9245f drm/i915: Store i915_ggtt as the backpointer on fence registers
Now that i915_ggtt knows everything about its own paths to perform mmio,
we can use that as our primary backpointer for individual fence
registers. This reduces the amount of pointer dancing we have to perform
on the common paths, but more importantly finishes our fence register
encapsulation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016143234.4075-1-chris@chris-wilson.co.uk
2019-10-16 19:41:36 +01:00
Chris Wilson
eca0b72089 drm/i915: Do initial mocs configuration directly
Now that we record the default "goldenstate" context, we do not need to
emit the mocs registers at the start of each context and can simply do
mmio before the first context and capture the registers as part of its
default image. As a consequence, this means that we repeat the mmio
after each engine reset, fixing up any platform and registers that were
zapped by the reset (for those platforms with global not context-saved
settings).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111723
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111645
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Reviewed-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016090749.7092-1-chris@chris-wilson.co.uk
2019-10-16 19:35:37 +01:00
Chris Wilson
5f65d5a6e4 drm/i915/selftests: Teach timelines to take intel_gt as its argument
The timelines selftests are [mostly] hardware centric and so want to use
the gt as its target.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016113840.1106-1-chris@chris-wilson.co.uk
2019-10-16 18:20:18 +01:00
Chris Wilson
bb3d4c9d63 drm/i915/selftests: Teach workarounds to take intel_gt as its argument
The workarounds selftests are hardware centric and so want to use the gt
as its target.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016114902.24388-1-chris@chris-wilson.co.uk
2019-10-16 18:20:05 +01:00
Chris Wilson
3b05c4f832 drm/i915/selftests: Teach guc to take intel_gt as its argument
The guc selftests are hardware^W firmare centric and so want to use the
gt as its target.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016115311.12894-1-chris@chris-wilson.co.uk
2019-10-16 18:19:46 +01:00
Chris Wilson
1357fa8136 drm/i915/selftests: Teach execlists to take intel_gt as its argument
The execlists selftests are hardware centric and so want to use the gt
as its target.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016120249.22714-1-chris@chris-wilson.co.uk
2019-10-16 18:19:29 +01:00
Chris Wilson
2229adc813 drm/i915/execlist: Trim immediate timeslice expiry
We perform timeslicing immediately upon receipt of a request that may be
put into the second ELSP slot. The idea behind this was that since we
didn't install the timer if the second ELSP slot was empty, we would not
have any idea of how long ELSP[0] had been running and so giving the
newcomer a chance on the GPU was fair. However, this causes us extra
busy work that we may be able to avoid if we wait a jiffie for the first
timeslice as normal.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016100851.4979-1-chris@chris-wilson.co.uk
2019-10-16 14:05:45 +01:00
Chris Wilson
8574685547 drm/i915/selftests: Drop stale struct_mutex
A lately added test was missed when applying the struct_mutex removal
patches. Do so now.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015085911.10317-1-chris@chris-wilson.co.uk
2019-10-16 09:54:28 +01:00
Mika Kuoppala
08fff7aedd drm/i915/tgl: Wa_1607138340
Avoid possible cs hang with semaphores by disabling
lite restore.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-11-mika.kuoppala@linux.intel.com
2019-10-15 18:25:52 +01:00
Mika Kuoppala
99db8c59e0 drm/i915/tgl: Wa_1607030317, Wa_1607186500, Wa_1607297627
Disable semaphore idle messages and wait for event
power downs.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-10-mika.kuoppala@linux.intel.com
2019-10-15 18:25:45 +01:00
Mika Kuoppala
79bfa607e6 drm/i915/tgl: Wa_1607138336
Avoid possible deadlock on context switch.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-9-mika.kuoppala@linux.intel.com
2019-10-15 18:25:14 +01:00
Mika Kuoppala
2e19af9438 drm/i915/tgl: Wa_1409600907
To avoid possible hang, we need to add depth stall if we flush the
depth cache.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-8-mika.kuoppala@linux.intel.com
2019-10-15 18:23:10 +01:00
Mika Kuoppala
2cbe2d8c56 drm/i915/tgl: Wa_1409170338
Avoid possible hang in tsg,vfe units by keeping
l3 clocks runnings.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-7-mika.kuoppala@linux.intel.com
2019-10-15 18:22:07 +01:00
Mika Kuoppala
65df78bda3 drm/i915/tgl: Wa_1409420604
Avoid possible hang in CPSS unit.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-6-mika.kuoppala@linux.intel.com
2019-10-15 18:20:19 +01:00
Mika Kuoppala
99739f9431 drm/i915/tgl: Keep FF dop clock enabled for A0
To ensure correct state data for compute workloads, we
need to keep the ff dop clock enabled.

References: HSDES#1606700617
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-5-mika.kuoppala@linux.intel.com
2019-10-15 18:17:34 +01:00
Mika Kuoppala
36a6b5d964 drm/i915/tgl: Add extra hdc flush workaround
In order to ensure constant caches are invalidated
properly with a0, we need extra hdc flush after invalidation.

v2: use IS_TGL_REVID (Chris)

References: HSDES#1604544889
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-4-mika.kuoppala@linux.intel.com
2019-10-15 18:16:51 +01:00
Mika Kuoppala
4aa0b5d457 drm/i915/tgl: Add HDC Pipeline Flush
Add hdc pipeline flush to ensure memory state is coherent
in L3 when we are done.

v2: Flush also in breadcrumbs (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-3-mika.kuoppala@linux.intel.com
2019-10-15 18:15:59 +01:00
Mika Kuoppala
62037ffff2 drm/i915/tgl: Include ro parts of l3 to invalidate
Aim for completeness and invalidate also the ro parts
in l3 cache. This might allow to get rid of the preparser
disable/enable workaround on invalidation path.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015154449.10338-2-mika.kuoppala@linux.intel.com
2019-10-15 18:13:50 +01:00
Mika Kuoppala
da5d2ca8ad drm/i915/icl: Wa_1607087056
Avoid possible hang in tsg,vfe units by keeping
l3 clocks runnings.

Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015154411.9984-1-mika.kuoppala@linux.intel.com
2019-10-15 18:12:40 +01:00
Chris Wilson
8b390c1581 drm/i915/execlists: Clear semaphore immediately upon ELSP promotion
There is no significance to our delay before clearing the semaphore the
engine is waiting on, so release it as soon as we acknowledge the CS
update following our preemption request. This should allow the GPU to
resume work earlier, if it was stuck on the semaphore at the end of a
request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191015093204.25693-1-chris@chris-wilson.co.uk
2019-10-15 11:51:13 +01:00
Chris Wilson
3c00660db1 drm/i915/execlists: Assert tasklet is locked for process_csb()
We rely on only the tasklet being allowed to call into process_csb(), so
assert that is locked when we do. As the tasklet uses a simple bitlock,
there is no strong lockdep checking so we must make do with a plain
assertion that the tasklet is running and assume that we are the
tasklet!

v2: Fixup intel_gt_sanitize() to prepare each engine for the reset so
that the locks are marked as held during the reset
v3: Check for existent function pointers for very early sanitisation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191014121336.30137-1-chris@chris-wilson.co.uk
2019-10-14 21:10:59 +01:00
Chris Wilson
89b6d1831d drm/i915/execlists: Tweak virtual unsubmission
Since commit e2144503bf ("drm/i915: Prevent bonded requests from
overtaking each other on preemption") we have restricted requests to run
on their chosen engine across preemption events. We can take this
restriction into account to know that we will want to resubmit those
requests onto the same physical engine, and so can shortcircuit the
virtual engine selection process and keep the request on the same
engine during unwind.

References: e2144503bf ("drm/i915: Prevent bonded requests from overtaking each other on preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Ramlingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191013203012.25208-1-chris@chris-wilson.co.uk
2019-10-14 12:51:17 +01:00
Chris Wilson
9506c23dfa drm/i915/selftests: Check that GPR are cleared for new contexts
We want the general purpose registers to be clear in all new contexts so
that we can be confident that no information is leaked from one to the
next.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191014090757.32111-7-chris@chris-wilson.co.uk
2019-10-14 11:10:28 +01:00
Chris Wilson
9c27462c89 drm/i915/selftests: Check known register values within the context
Check the logical ring context by asserting that the registers hold
expected start during execution. (It's a bit chicken-and-egg for how
could we manage to execute our request if the registers were not being
updated. Still, it's nice to verify that the HW is working as expected.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191014090757.32111-6-chris@chris-wilson.co.uk
2019-10-14 11:10:18 +01:00
Lionel Landwerlin
daed3e4439 drm/i915/perf: implement active wait for noa configurations
NOA configuration take some amount of time to apply. That amount of
time depends on the size of the GT. There is no documented time for
this. For example, past experimentations with powergating
configuration changes seem to indicate a 60~70us delay. We go with
500us as default for now which should be over the required amount of
time (according to HW architects).

v2: Don't forget to save/restore registers used for the wait (Chris)

v3: Name used CS_GPR registers (Chris)
    Fix compile issue due to rebase (Lionel)

v4: Fix save/restore helpers (Umesh)

v5: Move noa_wait from drm_i915_private to i915_perf_stream (Lionel)

v6: Add missing struct declarations in i915_perf.h

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191012072308.30312-2-chris@chris-wilson.co.uk
2019-10-12 09:08:33 +01:00
Lionel Landwerlin
6a45008ab7 drm/i915/perf: allow for CS OA configs to be created lazily
Here we introduce a mechanism by which the execbuf part of the i915
driver will be able to request that a batch buffer containing the
programming for a particular OA config be created.

We'll execute these OA configuration buffers right before executing a
set of userspace commands so that a particular user batchbuffer be
executed with a given OA configuration.

This mechanism essentially allows the userspace driver to go through
several OA configuration without having to open/close the i915/perf
stream.

v2: No need for locking on object OA config object creation (Chris)
    Flush cpu mapping of OA config (Chris)

v3: Properly deal with the perf_metric lock (Chris/Lionel)

v4: Fix oa config unref/put when not found (Lionel)

v5: Allocate BOs for configurations on the stream instead of globally
    (Lionel)

v6: Fix 64bit division (Chris)

v7: Store allocated config BOs into the stream (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191012072308.30312-1-chris@chris-wilson.co.uk
2019-10-12 09:08:27 +01:00
Chris Wilson
c3eb54aad9 drm/i915: Mark up "sentinel" requests
Sometimes we want to emit a terminator request, a request that flushes
the pipeline and allows no request to come after it. This can be used
for a "preempt-to-idle" to ensure that upon processing the
context-switch to that request, all other active contexts have been
flushed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191012070136.32058-1-chris@chris-wilson.co.uk
2019-10-12 08:51:17 +01:00
Chris Wilson
d8ad5f5261 drm/i915/execlists: Prevent merging requests with conflicting flags
We set out-of-bound parameters inside the i915_requests.flags field,
such as disabling preemption or marking the end-of-context. We should
not coalesce consecutive requests if they have differing instructions
as we only inspect the last active request in a context. Thus if we
allow a later request to be merged into the same execution context, it
will mask any of the earlier flags.

References: 2a98f4e65b ("drm/i915: add infrastructure to hold off preemption on a request")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191011190325.10979-9-chris@chris-wilson.co.uk
2019-10-12 07:54:52 +01:00
Chris Wilson
cd9ba7b6e4 drm/i915/selftests: Serialise write to scratch with its vma binding
Add the missing serialisation on the request for a write into a vma to
wait until that vma is bound before being executed by the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191011193620.14026-1-chris@chris-wilson.co.uk
2019-10-11 22:42:31 +01:00
Chris Wilson
cbbf278778 drm/i915/execlists: Only mark incomplete requests as -EIO on cancelling
Only the requests that have not completed do we want to change the
status of to signal the -EIO when cancelling the inflight set of requests
upon wedging.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191011103345.26013-1-chris@chris-wilson.co.uk
2019-10-11 13:07:24 +01:00
Chris Wilson
c97fb526ca drm/i915/execlists: Leave tell-tales as to why pending[] is bad
Before we BUG out with bad pending state, leave a telltale as to which
test failed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191010071434.31195-2-chris@chris-wilson.co.uk
2019-10-11 09:43:06 +01:00
Chris Wilson
86027e312c drm/i915/selftests: Check that registers are preserved between virtual engines
Make sure that we copy across the registers from one engine to the next,
as we hop around a virtual engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191010110252.17289-1-chris@chris-wilson.co.uk
2019-10-10 13:53:58 +01:00
Chris Wilson
bd9bec5b6a drm/i915/execlists: Mark up expected state during reset
Move the BUG_ON around slightly and add some explanations for each to
try and capture the expected state more carefully. We want to compare
the expected active state of our bookkeeping as compared to the tracked
HW state.

References: https://bugs.freedesktop.org/show_bug.cgi?id=111937
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191010083242.1387-1-chris@chris-wilson.co.uk
2019-10-10 13:52:34 +01:00
Chris Wilson
542a5c66e0 drm/i915/gt: Warn CI about an unrecoverable wedge
If we have a wedged GPU that we need to recover, but fail, add a taint
for CI to pickup and schedule a reboot.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191002160034.5121-1-chris@chris-wilson.co.uk
2019-10-10 11:19:32 +01:00
Daniele Ceraolo Spurio
9d41318c4e drm/i915/tgl: simplify the lrc register list for !RCS
There are small differences between the blitter and the video engines in
the xcs context image (e.g. registers 0x200 and 0x204 only exist on the
blitter). Since we never explicitly set a value for those register and
given that we don't need to update the offsets in the lrc image when we
change engine within the class for virtual engine because the HW can
handle that, instead of having a separate define for the BCS we can
just restrict the programming to the part we're interested in, which is
common across the engines.

Bspec: 45584
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191009230424.6507-2-daniele.ceraolospurio@intel.com
2019-10-10 10:14:42 +01:00
Daniele Ceraolo Spurio
ba2c74da52 drm/i915/tgl: the BCS engine supports relative MMIO
The specs don't mention any specific HW limitation on the blitter and
manual inspection shows that the HW does set the relative MMIO bit in
the LRI of the blitter context image, so we can remove our limitations.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191009230424.6507-1-daniele.ceraolospurio@intel.com
2019-10-10 10:12:18 +01:00
Chris Wilson
c36eebd9ba drm/i915/gt: execlists->active is serialised by the tasklet
The active/pending execlists is no longer protected by the
engine->active.lock, but is serialised by the tasklet instead. Update
the locking around the debug and stats to follow suit.

v2: local_bh_disable() to prevent recursing into the tasklet in case we
trigger a softirq (Tvrtko)

Fixes: df40306902 ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191009160906.16195-1-chris@chris-wilson.co.uk
2019-10-09 19:54:46 +01:00
Chris Wilson
c949ae4314 drm/i915/execlists: Protect peeking at execlists->active
Now that we dropped the engine->active.lock serialisation from around
process_csb(), direct submission can run concurrently to the interrupt
handler. As such execlists->active may be advanced as we dequeue,
dropping the reference to the request. We need to employ our RCU request
protection to ensure that the request is not freed too early.

Fixes: df40306902 ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191009100955.21477-1-chris@chris-wilson.co.uk
2019-10-09 19:46:40 +01:00
Chris Wilson
41f0bc49f7 drm/i915/selftests: Hold request reference over waits
Take a reference on the request before submitting it to the HW and then
waiting on it for selftest_workarounds. Once submitted, the request may
be freed by a background worker, unless we take an extra reference for
ourselves.

References: https://bugs.freedesktop.org/show_bug.cgi?id=111926
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191009061759.3189-1-chris@chris-wilson.co.uk
2019-10-09 08:58:39 +01:00
Chris Wilson
6ad145fe02 drm/i915/gt: Give engine->kernel_context distinct timeline lock classes
Assign a separate lockclass to the perma-pinned timelines of the
kernel_context, such that we can use them from within the user timelines
should we ever need to inject GPU operations to fixup faults during
request construction.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191008185941.15228-1-chris@chris-wilson.co.uk
2019-10-08 22:19:00 +01:00
Chris Wilson
d99f7b079c drm/i915/gt: Flush submission tasklet before waiting/retiring
A common bane of ours is arbitrary delays in ksoftirqd processing our
submission tasklet. Give the submission tasklet a kick before we wait to
avoid those delays eating into a tight timeout.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191008105655.13256-1-chris@chris-wilson.co.uk
2019-10-08 16:23:55 +01:00
Chris Wilson
d14a701b00 drm/i915/selftests: Assign the intel_runtime_pm pointer for mock_uncore
Couple up our mock_uncore to know about the fake global device and its
runtime powermanagement.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191008145045.23157-1-chris@chris-wilson.co.uk
2019-10-08 16:21:50 +01:00
Chris Wilson
3de1627851 drm/i915/selftests: Assign the mock_engine->uncore shortcut
Set up the engine->uncore shortcut on mock_engine creation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191008071121.25088-1-chris@chris-wilson.co.uk
2019-10-08 10:14:46 +01:00
Chris Wilson
20af04f3dd drm/i915/execlists: Assign virtual_engine->uncore from first sibling
Copy across the engine->uncore shortcut to the virtual_engine from its
first physical engine, similar to the handling of the engine->gt
backpointer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191008070342.4045-1-chris@chris-wilson.co.uk
2019-10-08 10:14:29 +01:00
Chris Wilson
a1b58ee3cb drm/i915/gt: Treat a busy timeline as 'active' while waiting
If we cannot claim the timeline->mutex while preparing for a wait on it,
we have to skip the timeline. In doing so, treat it as active so that
under a intel_gt_wait_for_idle() loop, we repeat the wait after
scheduling away.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191006165002.30312-4-chris@chris-wilson.co.uk
2019-10-07 21:44:02 +01:00
Chris Wilson
1664f35aa7 drm/i915/selftests: Appease lockdep
Disable irqs around updating the context image to keep lockdep happy:

<4>[  673.483340] WARNING: possible irq lock inversion dependency detected
<4>[  673.483342] 5.4.0-rc1-CI-Trybot_5118+ #1 Tainted: G     U
<4>[  673.483342] --------------------------------------------------------
<4>[  673.483343] swapper/2/0 just changed the state of lock:
<4>[  673.483344] ffff88845db885a0 (&i915_request_get(rq)->submit/1){-...}, at: __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[  673.483387] but this lock took another, HARDIRQ-unsafe lock in the past:
<4>[  673.483388]  (&ce->pin_mutex/2){+...}
<4>[  673.483389]

                  and interrupts could create inverse lock ordering between them.

<4>[  673.483390]
                  other info that might help us debug this:
<4>[  673.483390] Chain exists of:
                    &i915_request_get(rq)->submit/1 --> &engine->active.lock --> &ce->pin_mutex/2

<4>[  673.483392]  Possible interrupt unsafe locking scenario:

<4>[  673.483392]        CPU0                    CPU1
<4>[  673.483393]        ----                    ----
<4>[  673.483393]   lock(&ce->pin_mutex/2);
<4>[  673.483394]                                local_irq_disable();
<4>[  673.483395]                                lock(&i915_request_get(rq)->submit/1);
<4>[  673.483396]                                lock(&engine->active.lock);
<4>[  673.483396]   <Interrupt>
<4>[  673.483397]     lock(&i915_request_get(rq)->submit/1);
<4>[  673.483398]
                   *** DEADLOCK ***

<4>[  673.483398] 2 locks held by swapper/2/0:
<4>[  673.483399]  #0: ffff8883f61ac9b0 (&(&gt->irq_lock)->rlock){-.-.}, at: gen11_gt_irq_handler+0x42/0x280 [i915]
<4>[  673.483433]  #1: ffff88845db8c418 (&(&rq->lock)->rlock){-.-.}, at: intel_engine_breadcrumbs_irq+0x34a/0x5a0 [i915]
<4>[  673.483463]
                  the shortest dependencies between 2nd lock and 1st lock:
<4>[  673.483466]   -> (&ce->pin_mutex/2){+...} ops: 614520 {
<4>[  673.483468]      HARDIRQ-ON-W at:
<4>[  673.483471]                         lock_acquire+0xa7/0x1c0
<4>[  673.483501]                         live_unlite_restore+0x1d8/0x6c0 [i915]
<4>[  673.483543]                         __i915_subtests+0xb8/0x210 [i915]
<4>[  673.483581]                         __run_selftests+0x112/0x170 [i915]
<4>[  673.483615]                         i915_live_selftests+0x2c/0x60 [i915]
<4>[  673.483644]                         i915_pci_probe+0x93/0x1b0 [i915]
<4>[  673.483646]                         pci_device_probe+0x9e/0x120
<4>[  673.483648]                         really_probe+0xea/0x420
<4>[  673.483649]                         driver_probe_device+0x10b/0x120
<4>[  673.483651]                         device_driver_attach+0x4a/0x50
<4>[  673.483652]                         __driver_attach+0x97/0x130
<4>[  673.483653]                         bus_for_each_dev+0x74/0xc0
<4>[  673.483654]                         bus_add_driver+0x142/0x220
<4>[  673.483655]                         driver_register+0x56/0xf0
<4>[  673.483657]                         do_one_initcall+0x58/0x2ff
<4>[  673.483659]                         do_init_module+0x56/0x1f8
<4>[  673.483660]                         load_module+0x243e/0x29f0
<4>[  673.483661]                         __do_sys_finit_module+0xe9/0x110
<4>[  673.483662]                         do_syscall_64+0x4f/0x210
<4>[  673.483665]                         entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  673.483665]      INITIAL USE at:
<4>[  673.483667]                        lock_acquire+0xa7/0x1c0
<4>[  673.483698]                        live_unlite_restore+0x1d8/0x6c0 [i915]
<4>[  673.483733]                        __i915_subtests+0xb8/0x210 [i915]
<4>[  673.483764]                        __run_selftests+0x112/0x170 [i915]
<4>[  673.483793]                        i915_live_selftests+0x2c/0x60 [i915]
<4>[  673.483821]                        i915_pci_probe+0x93/0x1b0 [i915]
<4>[  673.483822]                        pci_device_probe+0x9e/0x120
<4>[  673.483824]                        really_probe+0xea/0x420
<4>[  673.483825]                        driver_probe_device+0x10b/0x120
<4>[  673.483826]                        device_driver_attach+0x4a/0x50
<4>[  673.483827]                        __driver_attach+0x97/0x130
<4>[  673.483828]                        bus_for_each_dev+0x74/0xc0
<4>[  673.483829]                        bus_add_driver+0x142/0x220
<4>[  673.483830]                        driver_register+0x56/0xf0
<4>[  673.483831]                        do_one_initcall+0x58/0x2ff
<4>[  673.483833]                        do_init_module+0x56/0x1f8
<4>[  673.483834]                        load_module+0x243e/0x29f0
<4>[  673.483835]                        __do_sys_finit_module+0xe9/0x110
<4>[  673.483836]                        do_syscall_64+0x4f/0x210
<4>[  673.483837]                        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  673.483838]    }
<4>[  673.483868]    ... key      at: [<ffffffffa0a8f132>] __key.70113+0x2/0xffffffffffef2ed0 [i915]
<4>[  673.483869]    ... acquired at:
<4>[  673.483935]    __execlists_reset+0xfb/0xc20 [i915]
<4>[  673.483965]    execlists_reset+0x3d/0x50 [i915]
<4>[  673.483995]    intel_engine_reset+0xdf/0x230 [i915]
<4>[  673.484022]    live_preempt_hang+0x1d7/0x2e0 [i915]
<4>[  673.484064]    __i915_subtests+0xb8/0x210 [i915]
<4>[  673.484130]    __run_selftests+0x112/0x170 [i915]
<4>[  673.484163]    i915_live_selftests+0x2c/0x60 [i915]
<4>[  673.484193]    i915_pci_probe+0x93/0x1b0 [i915]
<4>[  673.484194]    pci_device_probe+0x9e/0x120
<4>[  673.484195]    really_probe+0xea/0x420
<4>[  673.484196]    driver_probe_device+0x10b/0x120
<4>[  673.484197]    device_driver_attach+0x4a/0x50
<4>[  673.484198]    __driver_attach+0x97/0x130
<4>[  673.484199]    bus_for_each_dev+0x74/0xc0
<4>[  673.484200]    bus_add_driver+0x142/0x220
<4>[  673.484202]    driver_register+0x56/0xf0
<4>[  673.484203]    do_one_initcall+0x58/0x2ff
<4>[  673.484204]    do_init_module+0x56/0x1f8
<4>[  673.484205]    load_module+0x243e/0x29f0
<4>[  673.484206]    __do_sys_finit_module+0xe9/0x110
<4>[  673.484207]    do_syscall_64+0x4f/0x210
<4>[  673.484208]    entry_SYSCALL_64_after_hwframe+0x49/0xbe

<4>[  673.484209]  -> (&engine->active.lock){..-.} ops: 972791 {
<4>[  673.484211]     IN-SOFTIRQ-W at:
<4>[  673.484213]                       lock_acquire+0xa7/0x1c0
<4>[  673.484214]                       _raw_spin_lock_irqsave+0x33/0x50
<4>[  673.484244]                       execlists_submission_tasklet+0xaf/0x100 [i915]
<4>[  673.484246]                       tasklet_action_common.isra.18+0x6c/0x1c0
<4>[  673.484247]                       __do_softirq+0xdf/0x47f
<4>[  673.484248]                       irq_exit+0xba/0xc0
<4>[  673.484249]                       do_IRQ+0x83/0x160
<4>[  673.484250]                       ret_from_intr+0x0/0x1d
<4>[  673.484252]                       cpuidle_enter_state+0xb2/0x450
<4>[  673.484253]                       cpuidle_enter+0x24/0x40
<4>[  673.484254]                       do_idle+0x1e7/0x250
<4>[  673.484256]                       cpu_startup_entry+0x14/0x20
<4>[  673.484257]                       start_secondary+0x15f/0x1b0
<4>[  673.484258]                       secondary_startup_64+0xa4/0xb0
<4>[  673.484259]     INITIAL USE at:
<4>[  673.484261]                      lock_acquire+0xa7/0x1c0
<4>[  673.484290]                      intel_engine_init_active+0x7e/0xb0 [i915]
<4>[  673.484305]                      intel_engines_setup+0x1cd/0x3b0 [i915]
<4>[  673.484305]                      i915_gem_init+0x12d/0x900 [i915]
<4>[  673.484305]                      i915_driver_probe+0xb70/0x15d0 [i915]
<4>[  673.484305]                      i915_pci_probe+0x43/0x1b0 [i915]
<4>[  673.484305]                      pci_device_probe+0x9e/0x120
<4>[  673.484305]                      really_probe+0xea/0x420
<4>[  673.484305]                      driver_probe_device+0x10b/0x120
<4>[  673.484305]                      device_driver_attach+0x4a/0x50
<4>[  673.484305]                      __driver_attach+0x97/0x130
<4>[  673.484305]                      bus_for_each_dev+0x74/0xc0
<4>[  673.484305]                      bus_add_driver+0x142/0x220
<4>[  673.484305]                      driver_register+0x56/0xf0
<4>[  673.484305]                      do_one_initcall+0x58/0x2ff
<4>[  673.484305]                      do_init_module+0x56/0x1f8
<4>[  673.484305]                      load_module+0x243e/0x29f0
<4>[  673.484305]                      __do_sys_finit_module+0xe9/0x110
<4>[  673.484305]                      do_syscall_64+0x4f/0x210
<4>[  673.484305]                      entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  673.484305]   }
<4>[  673.484305]   ... key      at: [<ffffffffa0a8f160>] __key.70307+0x0/0xffffffffffef2ea0 [i915]
<4>[  673.484305]   ... acquired at:
<4>[  673.484305]    _raw_spin_lock_irqsave+0x33/0x50
<4>[  673.484305]    execlists_submit_request+0x2b/0x1e0 [i915]
<4>[  673.484305]    submit_notify+0xa8/0x13c [i915]
<4>[  673.484305]    __i915_sw_fence_complete+0x81/0x250 [i915]
<4>[  673.484305]    i915_sw_fence_wake+0x51/0x70 [i915]
<4>[  673.484305]    __i915_sw_fence_complete+0x1ee/0x250 [i915]
<4>[  673.484305]    dma_i915_sw_fence_wake+0x1b/0x30 [i915]
<4>[  673.484305]    dma_fence_signal_locked+0x9e/0x1b0
<4>[  673.484305]    dma_fence_signal+0x1f/0x40
<4>[  673.484305]    fence_work+0x28/0x80 [i915]
<4>[  673.484305]    process_one_work+0x26a/0x620
<4>[  673.484305]    worker_thread+0x37/0x380
<4>[  673.484305]    kthread+0x119/0x130
<4>[  673.484305]    ret_from_fork+0x24/0x50

<4>[  673.484305] -> (&i915_request_get(rq)->submit/1){-...} ops: 857694 {
<4>[  673.484305]    IN-HARDIRQ-W at:
<4>[  673.484305]                     lock_acquire+0xa7/0x1c0
<4>[  673.484305]                     _raw_spin_lock_irqsave_nested+0x39/0x50
<4>[  673.484305]                     __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[  673.484305]                     intel_engine_breadcrumbs_irq+0x3d0/0x5a0 [i915]
<4>[  673.484305]                     cs_irq_handler+0x39/0x50 [i915]
<4>[  673.484305]                     gen11_gt_irq_handler+0x17b/0x280 [i915]
<4>[  673.484305]                     gen11_irq_handler+0x54/0xf0 [i915]
<4>[  673.484305]                     __handle_irq_event_percpu+0x41/0x2c0
<4>[  673.484305]                     handle_irq_event_percpu+0x2b/0x70
<4>[  673.484305]                     handle_irq_event+0x2f/0x50
<4>[  673.484305]                     handle_edge_irq+0x99/0x1b0
<4>[  673.484305]                     do_IRQ+0x7e/0x160
<4>[  673.484305]                     ret_from_intr+0x0/0x1d
<4>[  673.484305]                     cpuidle_enter_state+0xb2/0x450
<4>[  673.484305]                     cpuidle_enter+0x24/0x40
<4>[  673.484305]                     do_idle+0x1e7/0x250
<4>[  673.484305]                     cpu_startup_entry+0x14/0x20
<4>[  673.484305]                     start_secondary+0x15f/0x1b0
<4>[  673.484305]                     secondary_startup_64+0xa4/0xb0
<4>[  673.484305]    INITIAL USE at:
<4>[  673.484305]                    lock_acquire+0xa7/0x1c0
<4>[  673.484305]                    _raw_spin_lock_irqsave_nested+0x39/0x50
<4>[  673.484305]                    __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[  673.484305]                    __engine_park+0x233/0x420 [i915]
<4>[  673.484305]                    ____intel_wakeref_put_last+0x1c/0x70 [i915]
<4>[  673.484305]                    intel_gt_resume+0x202/0x2c0 [i915]
<4>[  673.484305]                    i915_gem_init+0x36e/0x900 [i915]
<4>[  673.484305]                    i915_driver_probe+0xb70/0x15d0 [i915]
<4>[  673.484305]                    i915_pci_probe+0x43/0x1b0 [i915]
<4>[  673.484305]                    pci_device_probe+0x9e/0x120
<4>[  673.484305]                    really_probe+0xea/0x420
<4>[  673.484305]                    driver_probe_device+0x10b/0x120
<4>[  673.484305]                    device_driver_attach+0x4a/0x50
<4>[  673.484305]                    __driver_attach+0x97/0x130
<4>[  673.484305]                    bus_for_each_dev+0x74/0xc0
<4>[  673.484305]                    bus_add_driver+0x142/0x220
<4>[  673.484305]                    driver_register+0x56/0xf0
<4>[  673.484305]                    do_one_initcall+0x58/0x2ff
<4>[  673.484305]                    do_init_module+0x56/0x1f8
<4>[  673.484305]                    load_module+0x243e/0x29f0
<4>[  673.484305]                    __do_sys_finit_module+0xe9/0x110
<4>[  673.484305]                    do_syscall_64+0x4f/0x210
<4>[  673.484305]                    entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  673.484305]  }
<4>[  673.484305]  ... key      at: [<ffffffffa0a8f6a1>] __key.80173+0x1/0xffffffffffef2960 [i915]
<4>[  673.484305]  ... acquired at:
<4>[  673.484305]    mark_lock+0x382/0x500
<4>[  673.484305]    __lock_acquire+0x7e1/0x15d0
<4>[  673.484305]    lock_acquire+0xa7/0x1c0
<4>[  673.484305]    _raw_spin_lock_irqsave_nested+0x39/0x50
<4>[  673.484305]    __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[  673.484305]    intel_engine_breadcrumbs_irq+0x3d0/0x5a0 [i915]
<4>[  673.484305]    cs_irq_handler+0x39/0x50 [i915]
<4>[  673.484305]    gen11_gt_irq_handler+0x17b/0x280 [i915]
<4>[  673.484305]    gen11_irq_handler+0x54/0xf0 [i915]
<4>[  673.484305]    __handle_irq_event_percpu+0x41/0x2c0
<4>[  673.484305]    handle_irq_event_percpu+0x2b/0x70
<4>[  673.484305]    handle_irq_event+0x2f/0x50
<4>[  673.484305]    handle_edge_irq+0x99/0x1b0
<4>[  673.484305]    do_IRQ+0x7e/0x160
<4>[  673.484305]    ret_from_intr+0x0/0x1d
<4>[  673.484305]    cpuidle_enter_state+0xb2/0x450
<4>[  673.484305]    cpuidle_enter+0x24/0x40
<4>[  673.484305]    do_idle+0x1e7/0x250
<4>[  673.484305]    cpu_startup_entry+0x14/0x20
<4>[  673.484305]    start_secondary+0x15f/0x1b0
<4>[  673.484305]    secondary_startup_64+0xa4/0xb0

<4>[  673.484305]
                  stack backtrace:
<4>[  673.484305] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G     U            5.4.0-rc1-CI-Trybot_5118+ #1
<4>[  673.484305] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
<4>[  673.484305] Call Trace:
<4>[  673.484305]  <IRQ>
<4>[  673.484305]  dump_stack+0x67/0x9b
<4>[  673.484305]  check_usage_forwards+0x13c/0x150
<4>[  673.484305]  ? mark_lock+0x382/0x500
<4>[  673.484305]  mark_lock+0x382/0x500
<4>[  673.484305]  ? check_usage_backwards+0x140/0x140
<4>[  673.484305]  __lock_acquire+0x7e1/0x15d0
<4>[  673.484305]  ? debug_object_deactivate+0x17e/0x190
<4>[  673.484305]  lock_acquire+0xa7/0x1c0
<4>[  673.484305]  ? __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[  673.484305]  _raw_spin_lock_irqsave_nested+0x39/0x50
<4>[  673.484305]  ? __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[  673.484305]  __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[  673.484305]  intel_engine_breadcrumbs_irq+0x3d0/0x5a0 [i915]
<4>[  673.484305]  cs_irq_handler+0x39/0x50 [i915]
<4>[  673.484305]  gen11_gt_irq_handler+0x17b/0x280 [i915]
<4>[  673.484305]  gen11_irq_handler+0x54/0xf0 [i915]
<4>[  673.484305]  __handle_irq_event_percpu+0x41/0x2c0
<4>[  673.484305]  handle_irq_event_percpu+0x2b/0x70
<4>[  673.484305]  handle_irq_event+0x2f/0x50
<4>[  673.484305]  handle_edge_irq+0x99/0x1b0
<4>[  673.484305]  do_IRQ+0x7e/0x160
<4>[  673.484305]  common_interrupt+0xf/0xf
<4>[  673.484305]  </IRQ>

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004203121.31138-1-chris@chris-wilson.co.uk
2019-10-07 21:44:02 +01:00
Chris Wilson
08ad9a3846 drm/i915/execlists: Fix annotation for decoupling virtual request
As we may signal a request and take the engine->active.lock within the
signaler, the engine submission paths have to use a nested annotation on
their requests -- but we guarantee that we can never submit on the same
engine as the signaling fence.

<4>[  723.763281] WARNING: possible circular locking dependency detected
<4>[  723.763285] 5.3.0-g80fa0e042cdb-drmtip_379+ #1 Tainted: G     U
<4>[  723.763288] ------------------------------------------------------
<4>[  723.763291] gem_exec_await/1388 is trying to acquire lock:
<4>[  723.763294] ffff93a7b53221d8 (&engine->active.lock){..-.}, at: execlists_submit_request+0x2b/0x1e0 [i915]
<4>[  723.763378]
                  but task is already holding lock:
<4>[  723.763381] ffff93a7c25f6d20 (&i915_request_get(rq)->submit/1){-.-.}, at: __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[  723.763420]
                  which lock already depends on the new lock.

<4>[  723.763423]
                  the existing dependency chain (in reverse order) is:
<4>[  723.763427]
                  -> #2 (&i915_request_get(rq)->submit/1){-.-.}:
<4>[  723.763434]        _raw_spin_lock_irqsave_nested+0x39/0x50
<4>[  723.763478]        __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[  723.763513]        intel_engine_breadcrumbs_irq+0x3aa/0x5e0 [i915]
<4>[  723.763600]        cs_irq_handler+0x49/0x50 [i915]
<4>[  723.763659]        gen11_gt_irq_handler+0x17b/0x280 [i915]
<4>[  723.763690]        gen11_irq_handler+0x54/0xf0 [i915]
<4>[  723.763695]        __handle_irq_event_percpu+0x41/0x2d0
<4>[  723.763699]        handle_irq_event_percpu+0x2b/0x70
<4>[  723.763702]        handle_irq_event+0x2f/0x50
<4>[  723.763706]        handle_edge_irq+0xee/0x1a0
<4>[  723.763709]        do_IRQ+0x7e/0x160
<4>[  723.763712]        ret_from_intr+0x0/0x1d
<4>[  723.763717]        __slab_alloc.isra.28.constprop.33+0x4f/0x70
<4>[  723.763720]        kmem_cache_alloc+0x28d/0x2f0
<4>[  723.763724]        vm_area_dup+0x15/0x40
<4>[  723.763727]        dup_mm+0x2dd/0x550
<4>[  723.763730]        copy_process+0xf21/0x1ef0
<4>[  723.763734]        _do_fork+0x71/0x670
<4>[  723.763737]        __se_sys_clone+0x6e/0xa0
<4>[  723.763741]        do_syscall_64+0x4f/0x210
<4>[  723.763744]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  723.763747]
                  -> #1 (&(&rq->lock)->rlock#2){-.-.}:
<4>[  723.763752]        _raw_spin_lock+0x2a/0x40
<4>[  723.763789]        __unwind_incomplete_requests+0x3eb/0x450 [i915]
<4>[  723.763825]        __execlists_submission_tasklet+0x9ec/0x1d60 [i915]
<4>[  723.763864]        execlists_submission_tasklet+0x34/0x50 [i915]
<4>[  723.763874]        tasklet_action_common.isra.5+0x47/0xb0
<4>[  723.763878]        __do_softirq+0xd8/0x4ae
<4>[  723.763881]        irq_exit+0xa9/0xc0
<4>[  723.763883]        smp_apic_timer_interrupt+0xb7/0x280
<4>[  723.763887]        apic_timer_interrupt+0xf/0x20
<4>[  723.763892]        cpuidle_enter_state+0xae/0x450
<4>[  723.763895]        cpuidle_enter+0x24/0x40
<4>[  723.763899]        do_idle+0x1e7/0x250
<4>[  723.763902]        cpu_startup_entry+0x14/0x20
<4>[  723.763905]        start_secondary+0x15f/0x1b0
<4>[  723.763908]        secondary_startup_64+0xa4/0xb0
<4>[  723.763911]
                  -> #0 (&engine->active.lock){..-.}:
<4>[  723.763916]        __lock_acquire+0x15d8/0x1ea0
<4>[  723.763919]        lock_acquire+0xa6/0x1c0
<4>[  723.763922]        _raw_spin_lock_irqsave+0x33/0x50
<4>[  723.763956]        execlists_submit_request+0x2b/0x1e0 [i915]
<4>[  723.764002]        submit_notify+0xa8/0x13c [i915]
<4>[  723.764035]        __i915_sw_fence_complete+0x81/0x250 [i915]
<4>[  723.764054]        i915_sw_fence_wake+0x51/0x64 [i915]
<4>[  723.764054]        __i915_sw_fence_complete+0x1ee/0x250 [i915]
<4>[  723.764054]        dma_i915_sw_fence_wake_timer+0x14/0x20 [i915]
<4>[  723.764054]        dma_fence_signal_locked+0x9e/0x1c0
<4>[  723.764054]        dma_fence_signal+0x1f/0x40
<4>[  723.764054]        vgem_fence_signal_ioctl+0x67/0xc0 [vgem]
<4>[  723.764054]        drm_ioctl_kernel+0x83/0xf0
<4>[  723.764054]        drm_ioctl+0x2f3/0x3b0
<4>[  723.764054]        do_vfs_ioctl+0xa0/0x6f0
<4>[  723.764054]        ksys_ioctl+0x35/0x60
<4>[  723.764054]        __x64_sys_ioctl+0x11/0x20
<4>[  723.764054]        do_syscall_64+0x4f/0x210
<4>[  723.764054]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  723.764054]
                  other info that might help us debug this:

<4>[  723.764054] Chain exists of:
                    &engine->active.lock --> &(&rq->lock)->rlock#2 --> &i915_request_get(rq)->submit/1

<4>[  723.764054]  Possible unsafe locking scenario:

<4>[  723.764054]        CPU0                    CPU1
<4>[  723.764054]        ----                    ----
<4>[  723.764054]   lock(&i915_request_get(rq)->submit/1);
<4>[  723.764054]                                lock(&(&rq->lock)->rlock#2);
<4>[  723.764054]                                lock(&i915_request_get(rq)->submit/1);
<4>[  723.764054]   lock(&engine->active.lock);
<4>[  723.764054]
                   *** DEADLOCK ***

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111862
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191004194758.19679-1-chris@chris-wilson.co.uk
2019-10-07 21:44:02 +01:00