linux_dsm_epyc7002/drivers/gpu/drm/i915/gt
Chris Wilson ca1711d199 drm/i915/gt: Close race between engine_park and intel_gt_retire_requests
The general concept was that intel_timeline.active_count was locked by
the intel_timeline.mutex. The exception was for power management, where
the engine->kernel_context->timeline could be manipulated under the
global wakeref.mutex.

This was quite solid, as we always manipulated the timeline only while
we held an engine wakeref.

And then we started retiring requests outside of struct_mutex, only
using the timelines.active_list and the timeline->mutex. There we
started manipulating intel_timeline.active_count outside of an engine
wakeref, and so introduced a race between __engine_park() and
intel_gt_retire_requests(), a race that could result in the
engine->kernel_context not being added to the active timelines and so
losing requests, which caused us to keep the system permanently powered
up [and unloadable].

The race would be easy to close if we could take the engine wakeref for
the timeline before we retire -- except timelines are not bound to any
engine and so we would need to keep all active engines awake. The
alternative is to guard intel_timeline_enter/intel_timeline_exit for use
outside of the timeline->mutex.

Fixes: e5dadff4b0 ("drm/i915: Protect request retirement with timeline->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-1-chris@chris-wilson.co.uk
(cherry picked from commit a6edbca74b)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-25 15:29:42 +02:00
..
selftests drm/i915: Coordinate i915_active with its own mutex 2019-10-04 15:39:12 +01:00
uc drm/i915/guc: Skip suspend/resume GuC action on platforms w/o GuC submission 2019-11-18 16:36:44 +02:00
gen6_renderstate.c
gen7_renderstate.c
gen8_renderstate.c
gen9_renderstate.c
intel_breadcrumbs.c drm/i915: Don't disable interrupts for intel_engine_breadcrumbs_irq() 2019-09-26 18:44:35 +01:00
intel_context_types.h drm/i915: Remove logical HW ID 2019-10-04 15:39:30 +01:00
intel_context.c drm/i915/gt: Split intel_ring_submission 2019-10-24 12:14:21 +01:00
intel_context.h drm/i915/gt: Split intel_ring_submission 2019-10-24 12:14:21 +01:00
intel_engine_cs.c drm/i915: Protect context while grabbing its name for the request 2019-11-12 10:22:41 +02:00
intel_engine_heartbeat.c drm/i915/gt: Only drop heartbeat.systole if the sole owner 2019-11-11 10:29:47 +02:00
intel_engine_heartbeat.h drm/i915/gt: Replace hangcheck by heartbeats 2019-10-23 23:52:10 +01:00
intel_engine_pm.c drm/i915: Mark up the calling context for intel_wakeref_put() 2019-11-25 15:29:17 +02:00
intel_engine_pm.h drm/i915: Mark up the calling context for intel_wakeref_put() 2019-11-25 15:29:17 +02:00
intel_engine_pool_types.h drm/i915: Replace struct_mutex for batch pool serialisation 2019-08-04 14:31:18 +01:00
intel_engine_pool.c drm/i915: Coordinate i915_active with its own mutex 2019-10-04 15:39:12 +01:00
intel_engine_pool.h drm/i915: Mark i915_request.timeline as a volatile, rcu pointer 2019-09-20 10:24:09 +01:00
intel_engine_types.h Backmerge i915 security patches from commit 'ea0b163b13ff' into drm-next 2019-11-14 11:09:06 +10:00
intel_engine_user.c drm/i915: Make for_each_engine_masked work on intel_gt 2019-10-18 00:06:25 +01:00
intel_engine_user.h drm/i915: Rename engines to match their user interface 2019-08-07 14:30:55 +01:00
intel_engine.h drm/i915/gt: Make timeslice duration configurable 2019-10-29 16:23:55 +00:00
intel_gpu_commands.h drm/i915/tgl: Add HDC Pipeline Flush 2019-10-15 18:15:59 +01:00
intel_gt_irq.c drm/i915: Extract GT render power state management 2019-10-26 19:28:59 +01:00
intel_gt_irq.h drm/i915: Extract general GT interrupt handlers 2019-08-12 15:36:13 +01:00
intel_gt_pm_irq.c drm/i915: Extract general GT interrupt handlers 2019-08-12 15:36:13 +01:00
intel_gt_pm_irq.h drm/i915: Extract GT powermanagement interrupt handling 2019-08-12 15:36:06 +01:00
intel_gt_pm.c drm/i915: Mark up the calling context for intel_wakeref_put() 2019-11-25 15:29:17 +02:00
intel_gt_pm.h drm/i915: Mark up the calling context for intel_wakeref_put() 2019-11-25 15:29:17 +02:00
intel_gt_requests.c drm/i915/gt: Close race between engine_park and intel_gt_retire_requests 2019-11-25 15:29:42 +02:00
intel_gt_requests.h drm/i915: Move request runtime management onto gt 2019-10-04 15:39:26 +01:00
intel_gt_types.h drm/i915: Extract GT render power state management 2019-10-26 19:28:59 +01:00
intel_gt.c drm/i915/gt: Call intel_gt_sanitize() directly 2019-11-05 16:04:16 +02:00
intel_gt.h drm/i915/gt: Call intel_gt_sanitize() directly 2019-11-05 16:04:16 +02:00
intel_llc_types.h drm/i915: Extract GT ring management 2019-10-20 20:45:18 +01:00
intel_llc.c drm/i915: Extract GT render power state management 2019-10-26 19:28:59 +01:00
intel_llc.h drm/i915: Extract GT ring management 2019-10-20 20:45:18 +01:00
intel_lrc_reg.h drm/i915/selftests: Verify the LRC register layout between init and HW 2019-09-24 17:27:19 +01:00
intel_lrc.c drm/i915: Mark up the calling context for intel_wakeref_put() 2019-11-25 15:29:17 +02:00
intel_lrc.h drm/i915: drop lrc header page 2019-10-31 16:47:22 +00:00
intel_mocs.c drm/i915: do not set MOCS control values on dgfx 2019-10-25 13:55:49 -07:00
intel_mocs.h drm/i915: Do initial mocs configuration directly 2019-10-16 19:35:37 +01:00
intel_rc6_types.h drm/i915/gen8+: Add RC6 CTX corruption WA 2019-11-14 10:51:54 +10:00
intel_rc6.c drm/i915: Restore GT coarse power gating workaround 2019-11-18 16:36:34 +02:00
intel_rc6.h drm/i915/gen8+: Add RC6 CTX corruption WA 2019-11-14 10:51:54 +10:00
intel_renderstate.c drm/i915/gt: Split intel_ring_submission 2019-10-24 12:14:21 +01:00
intel_renderstate.h
intel_reset_types.h drm/i915: Define explicit wedged on init reset state 2019-09-26 18:44:35 +01:00
intel_reset.c drm/i915: Mark up the calling context for intel_wakeref_put() 2019-11-25 15:29:17 +02:00
intel_reset.h drm/i915: Don't mix srcu tag and negative error codes 2019-10-07 10:44:48 -07:00
intel_ring_submission.c drm/i915/gt: Split intel_ring_submission 2019-10-24 12:14:21 +01:00
intel_ring_types.h drm/i915/gt: Split intel_ring_submission 2019-10-24 12:14:21 +01:00
intel_ring.c drm/i915: don't allocate the ring in stolen if we lack aperture 2019-10-29 10:35:47 +00:00
intel_ring.h drm/i915/gt: Split intel_ring_submission 2019-10-24 12:14:21 +01:00
intel_rps_types.h drm/i915: Extract GT render power state management 2019-10-26 19:28:59 +01:00
intel_rps.c drm/i915/gt: Always track callers to intel_rps_mark_interactive() 2019-10-30 13:23:00 +00:00
intel_rps.h drm/i915/gt: Always track callers to intel_rps_mark_interactive() 2019-10-30 13:23:00 +00:00
intel_sseu.c drm/i915: Expand subslice mask 2019-08-23 19:14:27 +01:00
intel_sseu.h drm/i915/tgl: s/ss/eu fuse reading support 2019-09-21 08:31:08 +01:00
intel_timeline_types.h drm/i915/gt: Close race between engine_park and intel_gt_retire_requests 2019-11-25 15:29:42 +02:00
intel_timeline.c drm/i915/gt: Close race between engine_park and intel_gt_retire_requests 2019-11-25 15:29:42 +02:00
intel_timeline.h drm/i915/gt: Track timeline activeness in enter/exit 2019-08-15 23:16:05 +01:00
intel_workarounds_types.h drm/i915: Add engine name to workaround debug print 2019-07-12 09:55:30 +01:00
intel_workarounds.c drm/i915/tgl: whitelist PS_(DEPTH|INVOCATION)_COUNT 2019-10-24 23:34:38 +01:00
intel_workarounds.h
Makefile drm/i915: use upstream version of header tests 2019-07-30 12:11:57 +03:00
mock_engine.c drm/i915/gt: Split intel_ring_submission 2019-10-24 12:14:21 +01:00
mock_engine.h
selftest_context.c drm/i915: drop lrc header page 2019-10-31 16:47:22 +00:00
selftest_engine_cs.c drm/i915: Rename engines to match their user interface 2019-08-07 14:30:55 +01:00
selftest_engine_heartbeat.c drm/i915/selftests: Pretty print the i915_active 2019-10-31 14:43:14 +00:00
selftest_engine_pm.c drm/i915: Mark up the calling context for intel_wakeref_put() 2019-11-25 15:29:17 +02:00
selftest_engine.c drm/i915: Defer final intel_wakeref_put to process context 2019-08-08 21:28:51 +01:00
selftest_engine.h drm/i915: Defer final intel_wakeref_put to process context 2019-08-08 21:28:51 +01:00
selftest_gt_pm.c drm/i915/selftests: Add intel_gt_suspend_prepare 2019-11-05 16:06:25 +02:00
selftest_hangcheck.c drm/i915/selftests: Start kthreads before stopping 2019-11-01 10:12:29 +00:00
selftest_llc.c drm/i915: Extract GT render power state management 2019-10-26 19:28:59 +01:00
selftest_llc.h drm/i915: Extract GT ring management 2019-10-20 20:45:18 +01:00
selftest_lrc.c drm/i915/selftests: Start kthreads before stopping 2019-11-01 10:12:29 +00:00
selftest_reset.c drm/i915/selftests: Flush interrupts before disabling tasklets 2019-10-24 09:18:52 +01:00
selftest_timeline.c drm/i915/gt: Split intel_ring_submission 2019-10-24 12:14:21 +01:00
selftest_workarounds.c drm/i915: Remove nonpriv flags when srm/lrm 2019-10-24 23:34:38 +01:00