Commit Graph

207 Commits

Author SHA1 Message Date
Chris Wilson
795d4d7fa3 drm/i915: Mark the addition of the initial-breadcrumb in the request
The initial-breadcrumb is used to mark the end of the awaiting and the
beginning of the user payload. We verify that we do not start the user
payload before all signaler are completed, checking our semaphore setup
by looking for the initial breadcrumb being written too early. We also
want to ensure that we do not add semaphore waits after we have already
closed the semaphore section, an issue for later deferred waits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200513165937.9508-2-chris@chris-wilson.co.uk
2020-05-13 21:09:54 +01:00
Chris Wilson
16dc224f1c drm/i915: Replace the hardcoded I915_FENCE_TIMEOUT
Expose the hardcoded timeout for unsignaled foreign fences as a Kconfig
option, primarily to allow brave systems to disable the timeout and
solely rely on correct signaling.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200509105021.12542-1-chris@chris-wilson.co.uk
2020-05-09 12:57:57 +01:00
Chris Wilson
fcae496153 drm/i915: Prevent using semaphores to chain up to external fences
The downside of using semaphores is that we lose metadata passing
along the signaling chain. This is particularly nasty when we
need to pass along a fatal error such as EFAULT or EDEADLK. For
fatal errors we want to scrub the request before it is executed,
which means that we cannot preload the request onto HW and have
it wait upon a semaphore.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200508092933.738-3-chris@chris-wilson.co.uk
2020-05-08 21:02:33 +01:00
Lionel Landwerlin
3136deb7ba drm/i915: Peel dma-fence-chains for await
To allow faster engine to engine synchronization, peel the layer of
dma-fence-chain to expose potential i915 fences so that the
i915_request code can emit HW semaphore wait/signal operations in the
ring which is faster than waking up the host to submit unblocked
workloads after interrupt notification.

This is similar to the peeling we do for e.g. dma_fence_array.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200508185448.29709-1-chris@chris-wilson.co.uk
2020-05-08 20:57:56 +01:00
Chris Wilson
ac938052e5 drm/i915: Pull waiting on an external dma-fence into its routine
As a means for a small code consolidation, but primarily to start
thinking more carefully about internal-vs-external linkage, pull the
pair of i915_sw_fence_await_dma_fence() calls into a common routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200508092933.738-2-chris@chris-wilson.co.uk
2020-05-08 12:39:05 +01:00
Chris Wilson
2045d666ae drm/i915: Ignore submit-fences on the same timeline
While we ordinarily do not skip submit-fences due to the accompanying
hook that we want to callback on execution, a submit-fence on the same
timeline is meaningless.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200508092933.738-1-chris@chris-wilson.co.uk
2020-05-08 12:38:54 +01:00
Chris Wilson
eec39e441c drm/i915: Remove wait priority boosting
Upon waiting a request (when asked), we gave that request a small
priority boost, not enough for it to cause preemption, but enough for it
to be scheduled next before all equals. We also used that bit to give
new clients a small priority boost, similar to FQ_CODEL, such that we
favoured short interactive tasks ahead of long running streams.

However, this is causing lots of complications with timeslicing where we
both want to honour the boost and yet ignore it. Those complications
cause unexpected user behaviour (tasks not being timesliced and run
concurrently as epxected), and the easiest way to resolve that is to
remove the boost. Hopefully, we can find a compromise again if we need
to, but in theory timeslicing itself and future more advanced schedulers
should give us the interactivity boost we seek.

Testcase: igt/gem_exec_schedule/lateslice
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200507152338.7452-3-chris@chris-wilson.co.uk
2020-05-07 20:08:58 +01:00
Chris Wilson
6b6cd2ebd8 drm/i915: Mark concurrent submissions with a weak-dependency
We recorded the dependencies for WAIT_FOR_SUBMIT in order that we could
correctly perform priority inheritance from the parallel branches to the
common trunk. However, for the purpose of timeslicing and reset
handling, the dependency is weak -- as we the pair of requests are
allowed to run in parallel and not in strict succession.

The real significance though is that this allows us to rearrange
groups of WAIT_FOR_SUBMIT linked requests along the single engine, and
so can resolve user level inter-batch scheduling dependencies from user
semaphores.

Fixes: c81471f5e9 ("drm/i915: Copy across scheduler behaviour flags across submit fences")
Testcase: igt/gem_exec_fence/submit
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200507155109.8892-1-chris@chris-wilson.co.uk
2020-05-07 19:49:21 +01:00
Chris Wilson
24fe5f2ab2 drm/i915: Propagate error from completed fences
We need to preserve fatal errors from fences that are being terminated
as we hook them up.

Fixes: ef46884975 ("drm/i915: Propagate fence errors")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200506162136.3325-1-chris@chris-wilson.co.uk
2020-05-06 18:23:14 +01:00
Chris Wilson
43acd6516c drm/i915: Keep a per-engine request pool
Add a tiny per-engine request mempool so that we should always have a
request available for powermanagement allocations from tricky
contexts. This reserve is expected to be only used for kernel
contexts when barriers must be emitted [almost] without fail.

The main consumer for this reserved request is expected to be engine-pm,
for which we know that there will always be at least the previous pm
request that we can reuse under mempressure (so there should always be
a spare request for engine_park()).

This is an alternative to using a comparatively bulky mempool, which
requires custom handling for both our reserved allocation requirement
and to protect our TYPESAFE_BY_RCU slab cache. The advantage of mempool
would be that it would allow us to keep a larger per-engine request
pool. However, converting over to mempool is straightforward should the
need arise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-and-tested-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200402184037.21630-1-chris@chris-wilson.co.uk
2020-04-03 15:20:03 +01:00
Chris Wilson
41e4065a6b drm/i915: Rely on direct submission to the queue
Drop the pretense of kicking the tasklet (used only for the defunct guc
submission backend, it should just take ownership of the submit!) and so
remove the bh-kicking from around submission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323092841.22240-5-chris@chris-wilson.co.uk
2020-03-23 11:51:39 +00:00
Chris Wilson
326611ddff drm/i915: Mark up racy read of active rq->engine
As a virtual engine may change the rq->engine to point to the active
request in flight, we need to warn the compiler that an active request's
engine is volatile.

[   95.017686] write (marked) to 0xffff8881e8386b10 of 8 bytes by interrupt on cpu 2:
[   95.018123]  execlists_dequeue+0x762/0x2150 [i915]
[   95.018539]  __execlists_submission_tasklet+0x48/0x60 [i915]
[   95.018955]  execlists_submission_tasklet+0xd3/0x170 [i915]
[   95.018986]  tasklet_action_common.isra.0+0x42/0xa0
[   95.019016]  __do_softirq+0xd7/0x2cd
[   95.019043]  irq_exit+0xbe/0xe0
[   95.019068]  irq_work_interrupt+0xf/0x20
[   95.019491]  i915_request_retire+0x2c5/0x670 [i915]
[   95.019937]  retire_requests+0xa1/0xf0 [i915]
[   95.020348]  engine_retire+0xa1/0xe0 [i915]
[   95.020376]  process_one_work+0x3b1/0x690
[   95.020403]  worker_thread+0x80/0x670
[   95.020429]  kthread+0x19a/0x1e0
[   95.020454]  ret_from_fork+0x1f/0x30
[   95.020476]
[   95.020498] read to 0xffff8881e8386b10 of 8 bytes by task 8909 on cpu 3:
[   95.020918]  __i915_request_commit+0x177/0x220 [i915]
[   95.021329]  i915_gem_do_execbuffer+0x38c4/0x4e50 [i915]
[   95.021750]  i915_gem_execbuffer2_ioctl+0x2c3/0x580 [i915]
[   95.021784]  drm_ioctl_kernel+0xe4/0x120
[   95.021809]  drm_ioctl+0x297/0x4c7
[   95.021832]  ksys_ioctl+0x89/0xb0
[   95.021865]  __x64_sys_ioctl+0x42/0x60
[   95.021901]  do_syscall_64+0x6e/0x2c0
[   95.021927]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200310142403.5953-1-chris@chris-wilson.co.uk
2020-03-10 23:12:38 +00:00
Chris Wilson
209df10bb4 drm/i915: Defer semaphore priority bumping to a workqueue
Since the semaphore fence may be signaled from inside an interrupt
handler from inside a request holding its request->lock, we cannot then
enter into the engine->active.lock for processing the semaphore priority
bump as we may traverse our call tree and end up on another held
request.

CPU 0:
[ 2243.218864]  _raw_spin_lock_irqsave+0x9a/0xb0
[ 2243.218867]  i915_schedule_bump_priority+0x49/0x80 [i915]
[ 2243.218869]  semaphore_notify+0x6d/0x98 [i915]
[ 2243.218871]  __i915_sw_fence_complete+0x61/0x420 [i915]
[ 2243.218874]  ? kmem_cache_free+0x211/0x290
[ 2243.218876]  i915_sw_fence_complete+0x58/0x80 [i915]
[ 2243.218879]  dma_i915_sw_fence_wake+0x3e/0x80 [i915]
[ 2243.218881]  signal_irq_work+0x571/0x690 [i915]
[ 2243.218883]  irq_work_run_list+0xd7/0x120
[ 2243.218885]  irq_work_run+0x1d/0x50
[ 2243.218887]  smp_irq_work_interrupt+0x21/0x30
[ 2243.218889]  irq_work_interrupt+0xf/0x20

CPU 1:
[ 2242.173107]  _raw_spin_lock+0x8f/0xa0
[ 2242.173110]  __i915_request_submit+0x64/0x4a0 [i915]
[ 2242.173112]  __execlists_submission_tasklet+0x8ee/0x2120 [i915]
[ 2242.173114]  ? i915_sched_lookup_priolist+0x1e3/0x2b0 [i915]
[ 2242.173117]  execlists_submit_request+0x2e8/0x2f0 [i915]
[ 2242.173119]  submit_notify+0x8f/0xc0 [i915]
[ 2242.173121]  __i915_sw_fence_complete+0x61/0x420 [i915]
[ 2242.173124]  ? _raw_spin_unlock_irqrestore+0x39/0x40
[ 2242.173137]  i915_sw_fence_complete+0x58/0x80 [i915]
[ 2242.173140]  i915_sw_fence_commit+0x16/0x20 [i915]

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1318
Fixes: b7404c7ecb ("drm/i915: Bump ready tasks ahead of busywaits")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200310101720.9944-1-chris@chris-wilson.co.uk
2020-03-10 23:12:38 +00:00
Chris Wilson
798fa870ab drm/i915: Improve the start alignment of bonded pairs
Always wait on the start of the signaler request to reduce the problem
of dequeueing the bonded pair too early -- we want both payloads to
start at the same time, with no latency, and yet still allow others to
make full use of the slack in the system. This reduce the amount of time
we spend waiting on the semaphore used to synchronise the start of the
bonded payload.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200306133852.3420322-3-chris@chris-wilson.co.uk
2020-03-10 11:13:33 +00:00
Chris Wilson
60900add85 drm/i915: Mark racy read of intel_engine_cs.saturated
[ 3783.276728] BUG: KCSAN: data-race in __i915_request_submit [i915] / i915_request_await_dma_fence [i915]
[ 3783.276766]
[ 3783.276787] write to 0xffff8881f1bc60a0 of 1 bytes by interrupt on cpu 2:
[ 3783.277187]  __i915_request_submit+0x47e/0x4a0 [i915]
[ 3783.277580]  __execlists_submission_tasklet+0x997/0x2780 [i915]
[ 3783.277973]  execlists_submission_tasklet+0xd3/0x170 [i915]
[ 3783.278006]  tasklet_action_common.isra.0+0x42/0xa0
[ 3783.278035]  __do_softirq+0xd7/0x2cd
[ 3783.278063]  irq_exit+0xbe/0xe0
[ 3783.278089]  do_IRQ+0x51/0x100
[ 3783.278114]  ret_from_intr+0x0/0x1c
[ 3783.278140]  finish_task_switch+0x72/0x260
[ 3783.278170]  __schedule+0x1e5/0x510
[ 3783.278198]  schedule+0x45/0xb0
[ 3783.278226]  smpboot_thread_fn+0x23e/0x300
[ 3783.278256]  kthread+0x19a/0x1e0
[ 3783.278283]  ret_from_fork+0x1f/0x30
[ 3783.278305]
[ 3783.278327] read to 0xffff8881f1bc60a0 of 1 bytes by task 19440 on cpu 3:
[ 3783.278724]  i915_request_await_dma_fence+0x2a6/0x530 [i915]
[ 3783.279130]  i915_request_await_object+0x2fe/0x470 [i915]
[ 3783.279524]  i915_gem_do_execbuffer+0x45dc/0x4c20 [i915]
[ 3783.279908]  i915_gem_execbuffer2_ioctl+0x2c3/0x580 [i915]
[ 3783.279940]  drm_ioctl_kernel+0xe4/0x120
[ 3783.279968]  drm_ioctl+0x297/0x4c7
[ 3783.279996]  ksys_ioctl+0x89/0xb0
[ 3783.280021]  __x64_sys_ioctl+0x42/0x60
[ 3783.280047]  do_syscall_64+0x6e/0x2c0
[ 3783.280074]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200309132726.28358-1-chris@chris-wilson.co.uk
2020-03-09 17:08:52 +00:00
Chris Wilson
dff2a11b06 drm/i915: Do not poison i915_request.link on removal
Do not poison the timeline link on the i915_request to allow both
forward/backward list traversal under RCU.

[ 9759.139229] RIP: 0010:active_request+0x2a/0x90 [i915]
[ 9759.139240] Code: 41 56 41 55 41 54 55 48 89 fd 53 48 89 f3 48 83 c5 60 e8 49 de dc e0 48 8b 83 e8 01 00 00 48 39 c5 74 12 48 8d 90 20 fe ff ff <48> 8b 80 50 fe ff ff a8 01 74 11 e8 66 20 dd e0 48 89 d8 5b 5d 41
[ 9759.139251] RSP: 0018:ffffc9000014ce80 EFLAGS: 00010012
[ 9759.139260] RAX: dead000000000122 RBX: ffff888817cac040 RCX: 0000000000022000
[ 9759.139267] RDX: deacffffffffff42 RSI: ffff888817cac040 RDI: ffff888851fee900
[ 9759.139275] RBP: ffff888851fee960 R08: 000000000000001a R09: ffffffffa04702e0
[ 9759.139282] R10: ffffffff82187ea0 R11: 0000000000000002 R12: 0000000000000004
[ 9759.139289] R13: ffffffffa04d5179 R14: ffff8887f994ae40 R15: ffff888857b9a068
[ 9759.139296] FS:  0000000000000000(0000) GS:ffff88885ed80000(0000) knlGS:0000000000000000
[ 9759.139304] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9759.139311] CR2: 00007fff5bdec000 CR3: 00000008534fe001 CR4: 00000000001606e0
[ 9759.139318] Call Trace:
[ 9759.139325]  <IRQ>
[ 9759.139389]  execlists_reset+0x14d/0x310 [i915]
[ 9759.139400]  ? _raw_spin_unlock_irqrestore+0xf/0x30
[ 9759.139445]  ? fwtable_read32+0x90/0x230 [i915]
[ 9759.139499]  execlists_submission_tasklet+0xf6/0x150 [i915]
[ 9759.139508]  tasklet_action_common.isra.17+0x32/0xa0
[ 9759.139516]  __do_softirq+0x114/0x3dc
[ 9759.139525]  ? handle_irq_event_percpu+0x59/0x70
[ 9759.139533]  irq_exit+0xa1/0xc0
[ 9759.139540]  do_IRQ+0x76/0x150
[ 9759.139547]  common_interrupt+0xf/0xf
[ 9759.139554]  </IRQ>

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200306140115.3495686-1-chris@chris-wilson.co.uk
2020-03-07 00:05:54 +00:00
Chris Wilson
1eaa251b66 drm/i915: Assert requests within a context are submitted in order
Check the flow of requests into the hardware to verify that are
submitted in order along their timeline.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200306071614.2846708-1-chris@chris-wilson.co.uk
2020-03-06 10:53:54 +00:00
Chris Wilson
ab7a69020f drm/i915: Return early for await_start on same timeline
Requests within a timeline are ordered by that timeline, so awaiting for
the start of a request within the timeline is a no-op. This used to work
by falling out of the mutex_trylock() as the signaler and waiter had the
same timeline and not returning an error.

Fixes: 6a79d84840 ("drm/i915: Lock signaler timeline while navigating")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200305134822.2750496-1-chris@chris-wilson.co.uk
2020-03-05 15:28:15 +00:00
Chris Wilson
07e9c59d63 drm/i915: Actually emit the await_start
Fix the inverted test to emit the wait on the end of the previous
request if we /haven't/ already.

Fixes: 6a79d84840 ("drm/i915: Lock signaler timeline while navigating")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200305104210.2619967-1-chris@chris-wilson.co.uk
2020-03-05 14:26:22 +00:00
Chris Wilson
36e191f064 drm/i915: Apply i915_request_skip() on submission
Trying to use i915_request_skip() prior to i915_request_add() causes us
to try and fill the ring upto request->postfix, which has not yet been
set, and so may cause us to memset() past the end of the ring.

Instead of skipping the request immediately, just flag the error on the
request (only accepting the first fatal error we see) and then clear the
request upon submission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200304121849.2448028-1-chris@chris-wilson.co.uk
2020-03-04 14:29:50 +00:00
Chris Wilson
61231f6bd0 drm/i915/gem: Check that the context wasn't closed during setup
As setup takes a long time, the user may close the context during the
construction of the execbuf. In order to make sure we correctly track
all outstanding work with non-persistent contexts, we need to serialise
the submission with the context closure and mop up any leaks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200303080546.1140508-3-chris@chris-wilson.co.uk
2020-03-03 17:30:20 +00:00
Chris Wilson
062444bbc6 drm/i915/gt: Expose busywait duration to sysfs
We busywait on an inflight request (one that is currently executing on
HW, and so might complete quickly) prior to setting up an interrupt and
sleeping. The trade off is that we keep an expensive CPU core busy in
order to avoid wake up latency: where that trade off should lie is best
left to the sysadmin.

The busywait mechanism can be compiled out with

	./scripts/config --set-val DRM_I915_SPIN_REQUEST 0

The maximum busywait duration can be adjusted per-engine using,

	/sys/class/drm/card?/engine/*/ms_busywait_duration_ns

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Steve Carbonari <steven.carbonari@intel.com>
Tested-by: Steve Carbonari <steven.carbonari@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200228131716.3243616-4-chris@chris-wilson.co.uk
2020-02-28 22:03:41 +00:00
Chris Wilson
d22d2d073e drm/i915: Protect i915_request_await_start from early waits
We need to be extremely careful inside i915_request_await_start() as it
needs to walk the list of requests in the foreign timeline with very
little protection. As we hold our own timeline mutex, we can not nest
inside the signaler's timeline mutex, so all that remains is our RCU
protection. However, to be safe we need to tell the compiler that we may
be traversing the list only under RCU protection, and furthermore we
need to start declaring requests as elements of the timeline from their
construction.

Fixes: 9ddc8ec027 ("drm/i915: Eliminate the trylock for awaiting an earlier request")
Fixes: 6a79d84840 ("drm/i915: Lock signaler timeline while navigating")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-11-chris@chris-wilson.co.uk
2020-02-28 13:35:11 +00:00
Matthew Auld
df6b1f3da8 drm/i915: remove the other slab_dependencies
The real one can be found in i915_scheduler.c.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200220105707.344522-1-matthew.auld@intel.com
2020-02-20 12:11:31 +00:00
Chris Wilson
89dd019a8a drm/i915: Poison rings after use
On retiring the request, we should not re-use these elements in the ring
(at least not until we fill the ringbuffer and knowingly reuse the space).
Leave behind some poison to (hopefully) trap ourselves if we make a
mistake.

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200211205615.1190127-1-chris@chris-wilson.co.uk
2020-02-12 10:07:13 +00:00
Chris Wilson
f16ccb6445 drm/i915: Disable use of hwsp_cacheline for kernel_context
Currently on execlists, we use a local hwsp for the kernel_context,
rather than the engine's HWSP, as this is the default for execlists.
However, seqno wrap requires allocating a new HWSP cacheline, and may
require pinning a new HWSP page in the GGTT. This operation requiring
pinning in the GGTT is not allowed within the kernel_context timeline,
as doing so may require re-entering the kernel_context in order to evict
from the GGTT. As we want to avoid requiring a new HWSP for the
kernel_context, we can use the permanently pinned engine's HWSP instead.
However to do so we must prevent the use of semaphores reading the
kernel_context's HWSP, as the use of semaphores do not support rollover
onto the same cacheline. Fortunately, the kernel_context is mostly
isolated, so unlikely to give benefit to semaphores.

Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200210205722.794180-5-chris@chris-wilson.co.uk
2020-02-11 17:42:17 +00:00
Chris Wilson
602ddb410d drm/i915: Flush execution tasklets before checking request status
Rather than flushing the submission tasklets just before we sleep, flush
before we check the request status. Ideally this gives us a moment to
process the tasklets after sleeping just before we timeout.

v2: Compromise by pushing the flush prior to the timeout, but after the
check on completion so that we do not further delay the ready client.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200205095441.1769599-1-chris@chris-wilson.co.uk
2020-02-05 18:51:52 +00:00
Chris Wilson
855e39e65c drm/i915: Initialise basic fence before acquiring seqno
Inside the intel_timeline_get_seqno(), we currently track the retirement
of the old cachelines by listening to the new request. This requires
that the new request is ready to be used and so requires a minimum bit
of initialisation prior to getting the new seqno.

Fixes: b1e3177bd1 ("drm/i915: Coordinate i915_active with its own mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200203094152.4150550-2-chris@chris-wilson.co.uk
2020-02-03 11:27:00 +00:00
Chris Wilson
b4a9a149f9 drm/i915: Mark the removal of the i915_request from the sched.link
Keep the rq->fence.flags consistent with the status of the
rq->sched.link, and clear the associated bits when decoupling the link
on retirement (as we may wish to inspect those flags independent of
other state).

Fixes: 32ff621fd7 ("drm/i915/gt: Allow temporary suspension of inflight requests")
References: https://gitlab.freedesktop.org/drm/intel/issues/997
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200122140243.495621-3-chris@chris-wilson.co.uk
2020-01-22 17:10:16 +00:00
Chris Wilson
672c368f93 drm/i915: Keep track of request among the scheduling lists
If we keep track of when the i915_request.sched.link is on the HW
runlist, or in the priority queue we can simplify our interactions with
the request (such as during rescheduling). This also simplifies the next
patch where we introduce a new in-between list, for requests that are
ready but neither on the run list or in the queue.

v2: Update i915_sched_node.link explanation for current usage where it
is a link on both the queue and on the runlists.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-1-chris@chris-wilson.co.uk
2020-01-16 19:56:15 +00:00
Chris Wilson
e1c31fb5dd drm/i915: Merge i915_request.flags with i915_request.fence.flags
As we already have a flags field buried within i915_request, reuse it!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200106114234.2529613-3-chris@chris-wilson.co.uk
2020-01-06 14:38:55 +00:00
Chris Wilson
6a8679c048 drm/i915: Mark the GEM context link as RCU protected
The only protection for intel_context.gem_cotext is granted by RCU, so
annotate it as a rcu protected pointer and carefully dereference it in
the few occasions we need to use it.

Fixes: 9f3ccd40ac ("drm/i915: Drop GEM context as a direct link from i915_request")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191222233558.2201901-1-chris@chris-wilson.co.uk
2019-12-23 13:08:47 +00:00
Chris Wilson
e6ba764802 drm/i915: Remove i915->kernel_context
Allocate only an internal intel_context for the kernel_context, forgoing
a global GEM context for internal use as we only require a separate
address space (for our own protection).

Now having weaned GT from requiring ce->gem_context, we can stop
referencing it entirely. This also means we no longer have to create random
and unnecessary GEM contexts for internal use.

GEM contexts are now entirely for tracking GEM clients, and intel_context
the execution environment on the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk
2019-12-21 16:37:10 +00:00
Chris Wilson
0f100b7048 drm/i915: Push the use-semaphore marker onto the intel_context
Instead of rummaging through the intel_context to peek at the GEM
context in the middle of request submission to decide whether to use
semaphores, store that information on the intel_context itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-2-chris@chris-wilson.co.uk
2019-12-20 10:57:10 +00:00
Chris Wilson
9f3ccd40ac drm/i915: Drop GEM context as a direct link from i915_request
Keep the intel_context as being the primary state for i915_request, with
the GEM context a backpointer from the low level state for the rarer
cases we need client information. Our goal is to remove such references
to clients from the backend, and leave the HW submission agnostic to
client interfaces and self-contained.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-1-chris@chris-wilson.co.uk
2019-12-20 10:52:21 +00:00
Chris Wilson
54400257ae drm/i915/gt: Remove direct invocation of breadcrumb signaling
Only signal the breadcrumbs from inside the irq_work, simplifying our
interface and calling conventions. The micro-optimisation here is that
by always using the irq_work interface, we know we are always inside an
irq-off critical section for the breadcrumb signaling and can ellide
save/restore of the irq flags.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217095642.3124521-7-chris@chris-wilson.co.uk
2019-12-18 17:11:28 +00:00
Chris Wilson
85bedbf191 drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp
As we stash a pointer to the HWSP cacheline on the request, when reading
it we only need confirm that the cacheline is still valid by checking
that the request and timeline are still intact.

v2: Protect hwsp_cachline with RCU

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217011659.3092130-1-chris@chris-wilson.co.uk
2019-12-17 16:59:48 +00:00
Chris Wilson
9ddc8ec027 drm/i915: Eliminate the trylock for awaiting an earlier request
We currently use an error-prone mutex_trylock to grab another timeline
to find an earlier request along it. However, with a bit of a
sleight-of-hand, we can reduce the mutex_trylock to a spin_lock on the
immediate request and careful pointer chasing to acquire a reference on
the previous request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191216165317.2742896-1-chris@chris-wilson.co.uk
2019-12-16 23:25:49 +00:00
Chris Wilson
f1925f3309 drm/i915: Use EAGAIN for trylock failures
While not good behaviour, it is, however, established behaviour that we
can punt EAGAIN to userspace if we need to retry the ioctl. When trying
to acquire a mutex, prefer to use EAGAIN to propagate losing the race
so that if it does end up back in userspace, we try again.

Fixes: c81471f5e9 ("drm/i915: Copy across scheduler behaviour flags across submit fences")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/800
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191213160347.1789004-1-chris@chris-wilson.co.uk
2019-12-13 20:16:23 +00:00
Venkata Sandeep Dhanalakota
639f2f2489 drm/i915: Introduce new macros for tracing
New macros ENGINE_TRACE(), CE_TRACE(), RQ_TRACE() and
GT_TRACE() are introduce to tag device name and engine
name with contexts and requests tracing in i915.

Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191213155152.69182-2-venkata.s.dhanalakota@intel.com
2019-12-13 20:16:23 +00:00
Chris Wilson
65c29dbb19 drm/i915: Use the i915_device name for identifying our request fences
Use the dev_name(i915) to identify the requests for debugging, so we can
tell different device timelines apart.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191211150204.133471-1-chris@chris-wilson.co.uk
2019-12-11 21:34:31 +00:00
Chris Wilson
c81471f5e9 drm/i915: Copy across scheduler behaviour flags across submit fences
We want the bonded request to have the same scheduler properties as its
master so that it is placed at the same depth in the queue. For example,
consider we have requests A, B and B', where B & B' are a bonded pair to
run in parallel on two engines.

	A -> B
     	     \- B'

B will run after A and so may be scheduled on an idle engine and wait on
A using a semaphore. B' sees B being executed and so enters the queue on
the same engine as A. As B' did not inherit the semaphore-chain from B,
it may have higher precedence than A and so preempts execution. However,
B' then sits on a semaphore waiting for B, who is waiting for A, who is
blocked by B.

Ergo B' needs to inherit the scheduler properties from B (i.e. the
semaphore chain) so that it is scheduled with the same priority as B and
will not be executed ahead of Bs dependencies.

Furthermore, to prevent the priorities changing via the expose fence on
B', we need to couple in the dependencies for PI. This requires us to
relax our sanity-checks that dependencies are strictly in order.

v2: Synchronise (B, B') execution on all platforms, regardless of using
a scheduler, any no-op syncs should be elided.

Fixes: ee1136908e ("drm/i915/execlists: Virtual engine bonding")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/464
Testcase: igt/gem_exec_balancer/bonded-chain
Testcase: igt/gem_exec_balancer/bonded-semaphore
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191210151332.3902215-1-chris@chris-wilson.co.uk
2019-12-11 10:28:56 +00:00
Jani Nikula
023265ed75 Merge drm/drm-next into drm-intel-next-queued
Sync up with v5.5-rc1 to get the updated lock_release() API among other
things. Fix the conflict reported by Stephen Rothwell [1].

[1] http://lore.kernel.org/r/20191210093957.5120f717@canb.auug.org.au

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-12-11 11:13:50 +02:00
Chris Wilson
9e31c1fe45 drm/i915: Propagate errors on awaiting already signaled fences
If we see an already signaled fence that we want to await on, we skip
adding to the i915_sw_fence. However, we should pay attention to whether
there was an error on that fence and if so propagate it for our future
request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191206160428.1503343-2-chris@chris-wilson.co.uk
2019-12-06 19:09:40 +00:00
Linus Torvalds
a6ed68d646 drm main pull for 5.5-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJd3YntAAoJEAx081l5xIa+dcQP/ikABkpm+q23FLKteRpL1rtX
 xqlg5+KHW+YVCDls2BrINF6vYzyisoa8fNPlKMmOHse/IgMhFe9vBbCj1KQQOUR1
 apNycI1wrcw/mn2WDikoIcF6C5cjqK9YVknnYoM6HnF1VmpGd1ecSGrOHrunEkrK
 cMAWYIeqWGU8Gj/HUOitAFpLWFUMNle0BJuRoGLcoMusgS8yuCIEcpNzRhgL8fvJ
 bW4imuyv24OjPoQzbKD0oQ0VIP86H0eM4LIeGZ2uyK/BSPKmMDqI4z4isUheS7RL
 w4a6BdobMIdhew5dBXS0LsUJ3JniVJdHy123q9KgpmQAhGpiNoLT6BujfoUTUeWx
 Mu0vM8Xmv9n4npdBYC+fLEFQXYJlu9uBA490jP84Kz6Fg1c6GyBebDY7/c2O4Zmg
 7pvygmUF6boD6v2sIC/3161crgwU4g8zoxm2V4i9naxes2QB13LiEuJWlaI/FdxY
 fd3zpglFGdoF1ThNne4QDh6gMKpXvjITyu/QxZeZ67Dt6i0Aqw9cRGHSpiVhYyDc
 cx2hAp+rDvUi5SHkJKFpVImjB2DDn2xUG2uFMHz0cy9wNg203L3fRDi0hVtnM1+W
 VpCxyLs2Upz6kEjDRVsfMZ9chCcWAWpVuKhtuuMUDw/IKnbP3uV8kzgJpVpaRVkD
 76s5uYWHHBlk1IVlkOUP
 =Hj7G
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2019-11-27' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Lots of stuff in here, though it hasn't been too insane this merge
  apart from dealing with the security fun.

  uapi:
   - export different colorspace properties on DP vs HDMI
   - new fourcc for ARM 16x16 block format
   - syncobj: allow querying last submitted timeline value
   - DRM_FORMAT_BIG_ENDIAN defined as unsigned

  core:
   - allow using gem vma manager in ttm
   - connector/encoder/bridge doc fixes
   - allow more than 3 encoders for a connector
   - displayport mst suspend/resume reprobing support
   - vram lazy unmapping, uniform vram mm and gem vram
   - edid cleanups + AVI informframe bar info
   - displayport helpers - dpcd parser added

  dp_cec:
   - Allow a connector to be associated with a cec device

  ttm:
   - pipelining with no_gpu_wait fix
   - always keep BOs on the LRU

  sched:
   - allow free_job routine to sleep

  i915:
   - Block userptr from mappable GTT
   - i915 perf uapi versioning
   - OA stream dynamic reconfiguration
   - make context persistence optional
   - introduce DRM_I915_UNSTABLE Kconfig
   - add fake lmem testing under unstable
   - BT.2020 support for DP MSA
   - struct mutex elimination
   - Tigerlake display/PLL/power management improvements
   - Jasper Lake PCH support
   - refactor PMU for multiple GPUs
   - Icelake firmware update
   - Split out vga + switcheroo code

  amdgpu:
   - implement dma-buf import/export without helpers
   - vega20 RAS enablement
   - DC i2c over aux fixes
   - renoir GPU reset
   - DC HDCP support
   - BACO support for CI/VI asics
   - MSI-X support
   - Arcturus EEPROM support
   - Arcturus VCN encode support
   - VCN dynamic powergating on RV/RV2

  amdkfd:
   - add navi12/14/renoir support to kfd

  radeon:
   - SI dpm fix ported from amdgpu
   - fix bad DMA on ppc platforms

  gma500:
   - memory leak fixes

  qxl:
   - convert to new gem mmap

  exynos:
   - build warning fix

  komeda:
   - add aclk sysfs attribute

  v3d:
   - userspace cleanup uapi change

  i810:
   - fix for underflow in dispatch ioctls

  ast:
   - refactor show_cursor

  mgag200:
   - refactor show_cursor

  arcgpu:
   - encoder finding improvements

  mediatek:
   - mipi_tx, dsi and partial crtc support for MT8183 SoC
   - rotation support

  meson:
   - add suspend/resume support

  omap:
   - misc refactors

  tegra:
   - DisplayPort support for Tegra 210, 186 and 194.
   - IOMMU-backed DMA API fixes

  panfrost:
   - fix lockdep issue
   - simplify devfreq integration

  rcar-du:
   - R8A774B1 SoC support
   - fixes for H2 ES2.0

  sun4i:
   - vcc-dsi regulator support

  virtio-gpu:
   - vmexit vs spinlock fix
   - move to gem shmem helpers
   - handle large command buffers with cma"

* tag 'drm-next-2019-11-27' of git://anongit.freedesktop.org/drm/drm: (1855 commits)
  drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10
  drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub
  drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.
  drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF
  drm/amdgpu/gfx10: re-init clear state buffer after gpu reset
  merge fix for "ftrace: Rework event_create_dir()"
  drm/amdgpu: Update Arcturus golden registers
  drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access
  drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt
  Revert "drm/amd/display: enable S/G for RAVEN chip"
  drm/amdgpu: disable gfxoff on original raven
  drm/amdgpu: remove experimental flag for Navi14
  drm/amdgpu: disable gfxoff when using register read interface
  drm/amdgpu/powerplay: properly set PP_GFXOFF_MASK (v2)
  drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2
  drm/radeon: fix bad DMA from INTERRUPT_CNTL2
  drm/amd/display: Fix debugfs on MST connectors
  drm/amdgpu/nv: add asic func for fetching vbios from rom directly
  drm/amdgpu: put flush_delayed_work at first
  drm/amdgpu/vcn2.5: fix the enc loop with hw fini
  ...
2019-11-27 17:45:48 -08:00
Linus Torvalds
168829ad09 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes in this cycle were:

   - A comprehensive rewrite of the robust/PI futex code's exit handling
     to fix various exit races. (Thomas Gleixner et al)

   - Rework the generic REFCOUNT_FULL implementation using
     atomic_fetch_* operations so that the performance impact of the
     cmpxchg() loops is mitigated for common refcount operations.

     With these performance improvements the generic implementation of
     refcount_t should be good enough for everybody - and this got
     confirmed by performance testing, so remove ARCH_HAS_REFCOUNT and
     REFCOUNT_FULL entirely, leaving the generic implementation enabled
     unconditionally. (Will Deacon)

   - Other misc changes, fixes, cleanups"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  lkdtm: Remove references to CONFIG_REFCOUNT_FULL
  locking/refcount: Remove unused 'refcount_error_report()' function
  locking/refcount: Consolidate implementations of refcount_t
  locking/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions
  locking/refcount: Move saturation warnings out of line
  locking/refcount: Improve performance of generic REFCOUNT_FULL code
  locking/refcount: Move the bulk of the REFCOUNT_FULL implementation into the <linux/refcount.h> header
  locking/refcount: Remove unused refcount_*_checked() variants
  locking/refcount: Ensure integer operands are treated as signed
  locking/refcount: Define constants for saturation and max refcount values
  futex: Prevent exit livelock
  futex: Provide distinct return value when owner is exiting
  futex: Add mutex around futex exit
  futex: Provide state handling for exec() as well
  futex: Sanitize exit state handling
  futex: Mark the begin of futex exit explicitly
  futex: Set task::futex_state to DEAD right after handling futex exit
  futex: Split futex_mm_release() for exit/exec
  exit/exec: Seperate mm_release()
  futex: Replace PF_EXITPIDONE with a state
  ...
2019-11-26 16:02:40 -08:00
Chris Wilson
67a3acaab7 drm/i915: Use a ctor for TYPESAFE_BY_RCU i915_request
As we start peeking into requests for longer and longer, e.g.
incorporating use of spinlocks when only protected by an
rcu_read_lock(), we need to be careful in how we reset the request when
recycling and need to preserve any barriers that may still be in use as
the request is reset for reuse.

Quoting Linus Torvalds:

> If there is refcounting going on then why use SLAB_TYPESAFE_BY_RCU?

  .. because the object can be accessed (by RCU) after the refcount has
  gone down to zero, and the thing has been released.

  That's the whole and only point of SLAB_TYPESAFE_BY_RCU.

  That flag basically says:

  "I may end up accessing this object *after* it has been free'd,
  because there may be RCU lookups in flight"

  This has nothing to do with constructors. It's ok if the object gets
  reused as an object of the same type and does *not* get
  re-initialized, because we're perfectly fine seeing old stale data.

  What it guarantees is that the slab isn't shared with any other kind
  of object, _and_ that the underlying pages are free'd after an RCU
  quiescent period (so the pages aren't shared with another kind of
  object either during an RCU walk).

  And it doesn't necessarily have to have a constructor, because the
  thing that a RCU walk will care about is

    (a) guaranteed to be an object that *has* been on some RCU list (so
    it's not a "new" object)

    (b) the RCU walk needs to have logic to verify that it's still the
    *same* object and hasn't been re-used as something else.

  In contrast, a SLAB_TYPESAFE_BY_RCU memory gets free'd and re-used
  immediately, but because it gets reused as the same kind of object,
  the RCU walker can "know" what parts have meaning for re-use, in a way
  it couidn't if the re-use was random.

  That said, it *is* subtle, and people should be careful.

> So the re-use might initialize the fields lazily, not necessarily using a ctor.

  If you have a well-defined refcount, and use "atomic_inc_not_zero()"
  to guard the speculative RCU access section, and use
  "atomic_dec_and_test()" in the freeing section, then you should be
  safe wrt new allocations.

  If you have a completely new allocation that has "random stale
  content", you know that it cannot be on the RCU list, so there is no
  speculative access that can ever see that random content.

  So the only case you need to worry about is a re-use allocation, and
  you know that the refcount will start out as zero even if you don't
  have a constructor.

  So you can think of the refcount itself as always having a zero
  constructor, *BUT* you need to be careful with ordering.

  In particular, whoever does the allocation needs to then set the
  refcount to a non-zero value *after* it has initialized all the other
  fields. And in particular, it needs to make sure that it uses the
  proper memory ordering to do so.

  NOTE! One thing to be very worried about is that re-initializing
  whatever RCU lists means that now the RCU walker may be walking on the
  wrong list so the walker may do the right thing for this particular
  entry, but it may miss walking *other* entries. So then you can get
  spurious lookup failures, because the RCU walker never walked all the
  way to the end of the right list. That ends up being a much more
  subtle bug.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191122094924.629690-1-chris@chris-wilson.co.uk
2019-11-22 10:47:38 +00:00
Andi Shyti
3e7abf8141 drm/i915: Extract GT render power state management
i915_irq.c is large. One reason for this is that has a large chunk of
the GT render power management stashed away in it. Extract that logic
out of i915_irq.c and intel_pm.c and put it under one roof.

Based on a patch by Chris Wilson.

Signed-off-by: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191024211642.7688-1-chris@chris-wilson.co.uk
2019-10-26 19:28:59 +01:00
Chris Wilson
babaab2f47 drm/i915: Encapsulate kconfig constant values inside boolean predicates
Avoid angering clang and smatch by using a constant value in a '&&' test,
by forcing that constant value into a boolean.

E.g.,
drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c:159:13: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand]
	if (!delay && CONFIG_DRM_I915_PREEMPT_TIMEOUT) {
                      ^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025135943.12524-1-chris@chris-wilson.co.uk
2019-10-26 09:25:25 +01:00
Chris Wilson
2871ea85c1 drm/i915/gt: Split intel_ring_submission
Split the legacy submission backend from the common CS ring buffer
handling.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191024100344.5041-1-chris@chris-wilson.co.uk
2019-10-24 12:14:21 +01:00