linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-21 22:37:01 +07:00

Author	SHA1	Message	Date
Tvrtko Ursulin	07bcfd1291	drm/i915/gen12: Disable preemption timeout Allow super long OpenCL workloads which cannot be preempted within the default timeout to run out of the box. v2: * Make it stick out more and apply only to RCS. (Chris) v3: * Mention platform override in kconfig. (Joonas) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Mrozek <michal.mrozek@intel.com> Cc: <stable@vger.kernel.org> # v5.6+ Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Michal Mrozek <Michal.mrozek@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200312115748.29970-1-tvrtko.ursulin@linux.intel.com	2020-03-12 13:46:01 +00:00
Chris Wilson	9a40bddd47	drm/i915/gt: Expose heartbeat interval via sysfs We monitor the health of the system via periodic heartbeat pulses. The pulses also provide the opportunity to perform garbage collection. However, we interpret an incomplete pulse (a missed heartbeat) as an indication that the system is no longer responsive, i.e. hung, and perform an engine or full GPU reset. Given that the preemption granularity can be very coarse on a system, we let the sysadmin override our legacy timeouts which were "optimised" for desktop applications. The heartbeat interval can be adjusted per-engine using, /sys/class/drm/card?/engine/*/heartbeat_interval_ms Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Steve Carbonari <steven.carbonari@intel.com> Tested-by: Steve Carbonari <steven.carbonari@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200228131716.3243616-7-chris@chris-wilson.co.uk	2020-02-28 22:03:49 +00:00
Chris Wilson	db3d8338ba	drm/i915/gt: Expose preempt reset timeout via sysfs After initialising a preemption request, we give the current resident a small amount of time to vacate the GPU. The preemption request is for a higher priority context and should be immediate to maintain high quality of service (and avoid priority inversion). However, the preemption granularity of the GPU can be quite coarse and so we need a compromise. The preempt timeout can be adjusted per-engine using, /sys/class/drm/card?/engine/*/preempt_timeout_ms and can be disabled by setting it to 0. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Steve Carbonari <steven.carbonari@intel.com> Tested-by: Steve Carbonari <steven.carbonari@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200228131716.3243616-6-chris@chris-wilson.co.uk	2020-02-28 22:03:46 +00:00
Chris Wilson	72338a1f5e	drm/i915/gt: Expose reset stop timeout via sysfs When we allow ourselves to sleep before a GPU reset after disabling submission, even for a few milliseconds, gives an innocent context the opportunity to clear the GPU before the reset occurs. However, how long to sleep depends on the typical non-preemptible duration (a similar problem to determining the ideal preempt-reset timeout or even the heartbeat interval). As this seems of a hard policy decision, punt it to userspace. The timeout can be adjusted using /sys/class/drm/card?/engine/*/stop_timeout_ms Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Steve Carbonari <steven.carbonari@intel.com> Tested-by: Steve Carbonari <steven.carbonari@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200228131716.3243616-5-chris@chris-wilson.co.uk	2020-02-28 22:03:43 +00:00
Chris Wilson	062444bbc6	drm/i915/gt: Expose busywait duration to sysfs We busywait on an inflight request (one that is currently executing on HW, and so might complete quickly) prior to setting up an interrupt and sleeping. The trade off is that we keep an expensive CPU core busy in order to avoid wake up latency: where that trade off should lie is best left to the sysadmin. The busywait mechanism can be compiled out with ./scripts/config --set-val DRM_I915_SPIN_REQUEST 0 The maximum busywait duration can be adjusted per-engine using, /sys/class/drm/card?/engine/*/ms_busywait_duration_ns Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Steve Carbonari <steven.carbonari@intel.com> Tested-by: Steve Carbonari <steven.carbonari@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200228131716.3243616-4-chris@chris-wilson.co.uk	2020-02-28 22:03:41 +00:00
Chris Wilson	1a2695a746	drm/i915/gt: Expose timeslice duration to sysfs Execlists uses a scheduling quantum (a timeslice) to alternate execution between ready-to-run contexts of equal priority. This ensures that all users (though only if they of equal importance) have the opportunity to run and prevents livelocks where contexts may have implicit ordering due to userspace semaphores. The timeslicing mechanism can be compiled out with ./scripts/config --set-val DRM_I915_TIMESLICE_DURATION 0 The timeslice duration can be adjusted per-engine using, /sys/class/drm/card?/engine/*/timeslice_duration_ms Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Steve Carbonari <steven.carbonari@intel.com> Tested-by: Steve Carbonari <steven.carbonari@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200228131716.3243616-3-chris@chris-wilson.co.uk	2020-02-28 22:03:39 +00:00
Chris Wilson	5766a5ffc6	drm/i915: Default to a more lenient forced preemption timeout Based on a sampling of a number of benchmarks across platforms, by default opt for a much more lenient timeout so that we should not adversely affect existing "good" clients. 640ms ought to be enough for anyone. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112169 Fixes: `3a7a92aba8` ("drm/i915/execlists: Force preemption") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125162737.2161069-1-chris@chris-wilson.co.uk	2019-11-26 09:38:44 +00:00
Chris Wilson	b79029b2e8	drm/i915/gt: Make timeslice duration configurable Execlists uses a scheduling quantum (a timeslice) to alternate execution between ready-to-run contexts of equal priority. This ensures that all users (though only if they of equal importance) have the opportunity to run and prevents livelocks where contexts may have implicit ordering due to userspace semaphores. However, not all workloads necessarily benefit from timeslicing and in the extreme some sysadmin may want to disable or reduce the timeslicing granularity. The timeslicing mechanism can be compiled out^W^W disabled (but should DCE!) with ./scripts/config --set-val DRM_I915_TIMESLICE_DURATION 0 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191029091632.26281-1-chris@chris-wilson.co.uk	2019-10-29 16:23:55 +00:00
Chris Wilson	058179e72e	drm/i915/gt: Replace hangcheck by heartbeats Replace sampling the engine state every so often with a periodic heartbeat request to measure the health of an engine. This is coupled with the forced-preemption to allow long running requests to survive so long as they do not block other users. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-5-chris@chris-wilson.co.uk	2019-10-23 23:52:10 +01:00
Chris Wilson	3a7a92aba8	drm/i915/execlists: Force preemption If the preempted context takes too long to relinquish control, e.g. it is stuck inside a shader with arbitration disabled, evict that context with an engine reset. This ensures that preemptions are reasonably responsive, providing a tighter QoS for the more important context at the cost of flagging unresponsive contexts more frequently (i.e. instead of using an ~10s hangcheck, we now evict at ~100ms). The challenge of lies in picking a timeout that can be reasonably serviced by HW for typical workloads, balancing the existing clients against the needs for responsiveness. Note that coupled with timeslicing, this will lead to rapid GPU "hang" detection with multiple active contexts vying for GPU time. The forced preemption mechanism can be compiled out with ./scripts/config --set-val DRM_I915_PREEMPT_TIMEOUT 0 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-2-chris@chris-wilson.co.uk	2019-10-23 23:52:10 +01:00
Chris Wilson	a8c51ed22b	drm/i915/gt: Try to more gracefully quiesce the system before resets If we are doing a normal GPU reset triggered after detecting a long period of stalled work, we can take our time and allow the engines to quiesce. Since we've stopped submission to the engine, and if we wait long enough an innocent context should complete, leaving the engine idle. So by waiting a short amount of time, we should prevent clobbering other users when resetting a stuck context. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Suggested-by: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-1-chris@chris-wilson.co.uk	2019-10-23 23:52:10 +01:00
Chris Wilson	ea60f4bdc4	drm/i915: Add a label for config DRM_I915_SPIN_REQUEST If we don't give it a label, it does not appear as a configuration option. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190612093111.11684-9-chris@chris-wilson.co.uk	2019-06-12 11:18:55 +01:00
Chris Wilson	b27e35ae5b	drm/i915: Keep user GGTT alive for a minimum of 250ms Do not allow runtime pm autosuspend to remove userspace GGTT mmaps too quickly. For example, igt sets the autosuspend delay to 0, and so we immediately attempt to perform runtime suspend upon releasing the wakeref. Unfortunately, that involves tearing down GGTT mmaps as they require an active device. Override the autosuspend for GGTT mmaps, by keeping the wakeref around for 250ms after populating the PTE for a fresh mmap. v2: Prefer refcount_t for its under/overflow error detection v3: Flush the user runtime autosuspend prior to system system. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190527115114.13448-1-chris@chris-wilson.co.uk	2019-05-28 08:23:09 +01:00
Chris Wilson	7ce99d24ed	drm/i915: Expose the busyspin durations for i915_wait_request An interesting discussion regarding "hybrid interrupt polling" for NVMe came to the conclusion that the ideal busyspin before sleeping was half of the expected request latency (and better if it was already halfway through that request). This suggested that we too should look again at our tradeoff between spinning and waiting. Currently, our spin simply tries to hide the cost of enabling the interrupt, which is good to avoid penalising nop requests (i.e. test throughput) and not much else. Studying real world workloads suggests that a spin of upto 500us can dramatically boost performance, but the suggestion is that this is not from avoiding interrupt latency per-se, but from secondary effects of sleeping such as allowing the CPU reduce cstate and context switch away. In a truly hybrid interrupt polling scheme, we would aim to sleep until just before the request completed and then wake up in advance of the interrupt and do a quick poll to handle completion. This is tricky for ourselves at the moment as we are not recording request times, and since we allow preemption, our requests are not on as a nicely ordered timeline as IO. However, the idea is interesting, for it will certainly help us decide when busyspinning is worthwhile. v2: Expose the spin setting via Kconfig options for easier adjustment and testing. v3: Don't get caught sneaking in a change to the busyspin parameters. v4: Explain more about the "hybrid interrupt polling" scheme that we want to migrate towards. Suggested-by: Sagar Kamble <sagar.a.kamble@intel.com> References: http://events.linuxfoundation.org/sites/events/files/slides/lemoal-nvme-polling-vault-2017-final_0.pdf Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Sagar Kamble <sagar.a.kamble@intel.com> Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Sagar Kamble <sagar.a.kamble@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190419182625.11186-1-chris@chris-wilson.co.uk	2019-04-19 20:33:38 +01:00

14 Commits