linux_dsm_epyc7002/drivers/dma-buf
Chris Wilson 0152b3b3f4 drm/i915: Seal races between async GPU cancellation, retirement and signaling
Currently there is an underlying assumption that i915_request_unsubmit()
is synchronous wrt the GPU -- that is the request is no longer in flight
as we remove it. In the near future that may change, and this may upset
our signaling as we can process an interrupt for that request while it
is no longer in flight.

CPU0					CPU1
intel_engine_breadcrumbs_irq
(queue request completion)
					i915_request_cancel_signaling
...					...
					i915_request_enable_signaling
dma_fence_signal

Hence in the time it took us to drop the lock to signal the request, a
preemption event may have occurred and re-queued the request. In the
process, that request would have seen I915_FENCE_FLAG_SIGNAL clear and
so reused the rq->signal_link that was in use on CPU0, leading to bad
pointer chasing in intel_engine_breadcrumbs_irq.

A related issue was that if someone started listening for a signal on a
completed but no longer in-flight request, we missed the opportunity to
immediately signal that request.

Furthermore, as intel_contexts may be immediately released during
request retirement, in order to be entirely sure that
intel_engine_breadcrumbs_irq may no longer dereference the intel_context
(ce->signals and ce->signal_link), we must wait for irq spinlock.

In order to prevent the race, we use a bit in the fence.flags to signal
the transfer onto the signal list inside intel_engine_breadcrumbs_irq.
For simplicity, we use the DMA_FENCE_FLAG_SIGNALED_BIT as it then
quickly signals to any outside observer that the fence is indeed signaled.
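
The claim-by-flag idea described above can be sketched in userspace C11.
This is only an illustration under stated assumptions, not the code from
this patch: struct toy_fence, TOY_FLAG_SIGNALED_BIT and every toy_*
helper are hypothetical names, and atomic_fetch_or() stands in for the
kernel's test_and_set_bit().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's dma_fence or i915_request. */
#define TOY_FLAG_SIGNALED_BIT 0u

struct toy_fence {
	atomic_ulong flags;
	struct toy_fence *signal_next;	/* plays the role of rq->signal_link */
};

static void toy_fence_init(struct toy_fence *f)
{
	atomic_init(&f->flags, 0ul);
	f->signal_next = NULL;
}

/* Userspace analogue of test_and_set_bit(): returns the previous bit value. */
static bool toy_test_and_set_bit(unsigned int bit, atomic_ulong *flags)
{
	unsigned long mask = 1ul << bit;

	return (atomic_fetch_or(flags, mask) & mask) != 0;
}

static bool toy_test_bit(unsigned int bit, atomic_ulong *flags)
{
	return ((atomic_load(flags) >> bit) & 1ul) != 0;
}

/*
 * Interrupt side: atomically claim the fence before moving it onto the
 * local list of fences to signal. Only the caller that flips the bit
 * from 0 to 1 may touch signal_next, so the link cannot be reused
 * underneath it. Returns true if this caller now owns the signaling.
 */
static bool toy_irq_claim(struct toy_fence *f, struct toy_fence **signal_list)
{
	assert(f != NULL && signal_list != NULL);

	if (toy_test_and_set_bit(TOY_FLAG_SIGNALED_BIT, &f->flags))
		return false;	/* already claimed elsewhere */

	f->signal_next = *signal_list;
	*signal_list = f;
	return true;
}

/*
 * Listener side: once the bit is set, the fence is treated as signaled
 * and must not be re-added to any tracking list. Returns true if the
 * caller should start tracking the fence, false if it is already done.
 */
static bool toy_enable_signaling(struct toy_fence *f)
{
	return !toy_test_bit(TOY_FLAG_SIGNALED_BIT, &f->flags);
}
```

Because the same bit doubles as the "signaled" indicator, a listener that
arrives after the claim sees a signaled fence straight away instead of
queueing itself on a link that the interrupt path is still walking.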

v2: Sketch out potential dma-fence API for manual signaling
v3: And the test_and_set_bit()

Fixes: 52c0fdb25c ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190508112452.18942-1-chris@chris-wilson.co.uk
2019-05-08 16:02:41 +01:00
dma-buf.c dma-buf: Change to use DEFINE_SHOW_ATTRIBUTE macro 2018-12-24 11:17:04 +01:00
dma-fence-array.c dma-fence: Make ->wait callback optional 2018-07-03 13:12:57 +02:00
dma-fence-chain.c dma-buf: explicitely note that dma-fence-chains use 64bit seqno 2019-04-16 14:49:10 +02:00
dma-fence.c drm/i915: Seal races between async GPU cancellation, retirement and signaling 2019-05-08 16:02:41 +01:00
Kconfig udmabuf: add MEMFD_CREATE dependency 2018-09-12 08:21:30 +02:00
Makefile dma-buf: add new dma_fence_chain container v7 2019-04-01 12:05:02 +02:00
reservation.c dma-buf: add some lockdep asserts to the reservation object implementation 2019-02-27 23:51:43 +01:00
seqno-fence.c dma-buf: Rename struct fence to dma_fence 2016-10-25 14:40:39 +02:00
sw_sync.c dma-buf: explicitely note that dma-fence-chains use 64bit seqno 2019-04-16 14:49:10 +02:00
sync_debug.c dma-buf: Change to use DEFINE_SHOW_ATTRIBUTE macro 2018-12-24 11:17:04 +01:00
sync_debug.h dma-buf: Remove unneeded stubs around sync_debug interfaces 2018-05-07 15:58:07 +02:00
sync_file.c dma-buf: explicitely note that dma-fence-chains use 64bit seqno 2019-04-16 14:49:10 +02:00
sync_trace.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
udmabuf.c drivers/dma-buf/udmabuf.c: convert to use vm_fault_t 2019-01-04 13:13:46 -08:00