linux_dsm_epyc7002/drivers/gpu/drm/i915/Kconfig.profile

config DRM_I915_USERFAULT_AUTOSUSPEND
	int "Runtime autosuspend delay for userspace GGTT mmaps (ms)"
	default 250 # milliseconds
	help
	  On runtime suspend, as we suspend the device, we have to revoke
	  userspace GGTT mmaps and force userspace to take a pagefault on
	  their next access. The revocation and subsequent recreation of
	  the GGTT mmap can be very slow and so we impose a small hysteresis
	  that complements the runtime-pm autosuspend and provides a lower
	  floor on the autosuspend delay.

	  May be 0 to disable the extra delay and solely use the device level
	  runtime pm autosuspend delay tunable.

config DRM_I915_HEARTBEAT_INTERVAL
	int "Interval between heartbeat pulses (ms)"
	default 2500 # milliseconds
	help
	  The driver sends a periodic heartbeat down all active engines to
	  check the health of the GPU and undertake regular house-keeping of
	  internal driver state.

	  May be 0 to disable heartbeats and therefore disable automatic GPU
	  hang detection.

config DRM_I915_PREEMPT_TIMEOUT
	int "Preempt timeout (ms, jiffy granularity)"
	default 640 # milliseconds
	help
	  How long to wait (in milliseconds) for a preemption event to occur
	  when submitting a new context via execlists. If the current context
	  does not hit an arbitration point and yield to HW before the timer
	  expires, the HW will be reset to allow the more important context
	  to execute.

	  May be 0 to disable the timeout.

config DRM_I915_SPIN_REQUEST
	int "Busywait for request completion (us)"
	default 5 # microseconds
	help
	  Before sleeping waiting for a request (GPU operation) to complete,
	  we may spend some time polling for its completion. As the IRQ may
	  take a non-negligible time to set up, we do a short spin first to
	  check if the request will complete in the time it would have taken
	  us to enable the interrupt.

	  May be 0 to disable the initial spin. In practice, we estimate
	  the cost of enabling the interrupt (if currently disabled) to be
	  a few microseconds.

config DRM_I915_STOP_TIMEOUT
	int "How long to wait for an engine to quiesce gracefully before reset (ms)"
	default 100 # milliseconds
	help
	  By stopping submission and sleeping for a short time before resetting
	  the GPU, we allow the innocent contexts also on the system to quiesce.
	  It is then less likely for a hanging context to cause collateral
	  damage as the system is reset in order to recover. The corollary is
	  that the reset itself may take longer and so be more disruptive to
	  interactive or low latency workloads.

config DRM_I915_TIMESLICE_DURATION
	int "Scheduling quantum for userspace batches (ms, jiffy granularity)"
	default 1 # milliseconds
	help
	  When two user batches of equal priority are executing, we will
	  alternate execution of each batch to ensure forward progress of
	  all users. This is necessary in some cases where there may be
	  an implicit dependency between those batches that requires
	  concurrent execution in order for them to proceed, e.g. they
	  interact with each other via userspace semaphores. Each context
	  is scheduled for execution for the timeslice duration, before
	  switching to the next context.

	  May be 0 to disable timeslicing.
drm/i915: Keep user GGTT alive for a minimum of 250ms

Do not allow runtime pm autosuspend to remove userspace GGTT mmaps too quickly. For example, igt sets the autosuspend delay to 0, and so we immediately attempt to perform runtime suspend upon releasing the wakeref. Unfortunately, that involves tearing down GGTT mmaps as they require an active device.

Override the autosuspend for GGTT mmaps, by keeping the wakeref around for 250ms after populating the PTE for a fresh mmap.

v2: Prefer refcount_t for its under/overflow error detection
v3: Flush the user runtime autosuspend prior to system suspend.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190527115114.13448-1-chris@chris-wilson.co.uk

2019-05-27 18:51:14 +07:00
			`config DRM_I915_USERFAULT_AUTOSUSPEND`
			`int "Runtime autosuspend delay for userspace GGTT mmaps (ms)"`
			`default 250 # milliseconds`
			`help`
			`On runtime suspend, as we suspend the device, we have to revoke`
			`userspace GGTT mmaps and force userspace to take a pagefault on`
			`their next access. The revocation and subsequent recreation of`
			`the GGTT mmap can be very slow and so we impose a small hysteresis`
			`that complements the runtime-pm autosuspend and provides a lower`
			`floor on the autosuspend delay.`

			`May be 0 to disable the extra delay and solely use the device level`
			`runtime pm autosuspend delay tunable.`

drm/i915/gt: Replace hangcheck by heartbeats

Replace sampling the engine state every so often with a periodic heartbeat request to measure the health of an engine. This is coupled with the forced-preemption to allow long running requests to survive so long as they do not block other users.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-5-chris@chris-wilson.co.uk

2019-10-23 20:31:08 +07:00
			`config DRM_I915_HEARTBEAT_INTERVAL`
			`int "Interval between heartbeat pulses (ms)"`
			`default 2500 # milliseconds`
			`help`
			`The driver sends a periodic heartbeat down all active engines to`
			`check the health of the GPU and undertake regular house-keeping of`
			`internal driver state.`

			`May be 0 to disable heartbeats and therefore disable automatic GPU`
			`hang detection.`

drm/i915/execlists: Force preemption

If the preempted context takes too long to relinquish control, e.g. it is stuck inside a shader with arbitration disabled, evict that context with an engine reset. This ensures that preemptions are reasonably responsive, providing a tighter QoS for the more important context at the cost of flagging unresponsive contexts more frequently (i.e. instead of using an ~10s hangcheck, we now evict at ~100ms).

The challenge lies in picking a timeout that can be reasonably serviced by HW for typical workloads, balancing the existing clients against the needs for responsiveness.

Note that coupled with timeslicing, this will lead to rapid GPU "hang" detection with multiple active contexts vying for GPU time.

The forced preemption mechanism can be compiled out with

	./scripts/config --set-val DRM_I915_PREEMPT_TIMEOUT 0

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-2-chris@chris-wilson.co.uk

2019-10-23 20:31:05 +07:00
			`config DRM_I915_PREEMPT_TIMEOUT`
			`int "Preempt timeout (ms, jiffy granularity)"`
drm/i915: Default to a more lenient forced preemption timeout

Based on a sampling of a number of benchmarks across platforms, by default opt for a much more lenient timeout so that we should not adversely affect existing "good" clients. 640ms ought to be enough for anyone.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112169
Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125162737.2161069-1-chris@chris-wilson.co.uk

2019-11-25 23:27:37 +07:00
			`default 640 # milliseconds`
drm/i915/execlists: Force preemption
2019-10-23 20:31:05 +07:00
			`help`
			`How long to wait (in milliseconds) for a preemption event to occur`
			`when submitting a new context via execlists. If the current context`
			`does not hit an arbitration point and yield to HW before the timer`
			`expires, the HW will be reset to allow the more important context`
			`to execute.`

			`May be 0 to disable the timeout.`

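The preemption timeout described above can be pictured with a small Python model. This is only a sketch under stated assumptions, not the driver's code: `ms_to_ticks`, `effective_timeout_ms` and `should_force_reset` are hypothetical helper names, the tick rates passed as `hz` are example values, and rounding up to whole timer ticks is an assumption about what "jiffy granularity" implies for a millisecond setting.

```python
def ms_to_ticks(timeout_ms: int, hz: int) -> int:
    """Convert milliseconds to whole timer ticks, rounding up (assumed)."""
    if timeout_ms < 0 or hz <= 0:
        raise ValueError("timeout_ms must be >= 0 and hz must be > 0")
    # Integer ceiling of timeout_ms * hz / 1000.
    return (timeout_ms * hz + 999) // 1000


def effective_timeout_ms(timeout_ms: int, hz: int) -> float:
    """Timeout actually observed once rounded to tick granularity."""
    return ms_to_ticks(timeout_ms, hz) * 1000 / hz


def should_force_reset(elapsed_ms: float, timeout_ms: int) -> bool:
    """True if a pending preemption has waited too long.

    A timeout of 0 models the documented "disable the timeout" setting.
    """
    if timeout_ms == 0:
        return False
    return elapsed_ms >= timeout_ms
```

Under these assumptions the default of 640 ms is unchanged at 250 ticks per second (exactly 160 ticks), whereas a 1 ms setting would stretch to 4 ms at the same tick rate.
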
drm/i915: Expose the busyspin durations for i915_wait_request

An interesting discussion regarding "hybrid interrupt polling" for NVMe came to the conclusion that the ideal busyspin before sleeping was half of the expected request latency (and better if it was already halfway through that request). This suggested that we too should look again at our tradeoff between spinning and waiting. Currently, our spin simply tries to hide the cost of enabling the interrupt, which is good to avoid penalising nop requests (i.e. test throughput) and not much else. Studying real world workloads suggests that a spin of up to 500us can dramatically boost performance, but the suggestion is that this is not from avoiding interrupt latency per se, but from secondary effects of sleeping such as allowing the CPU to reduce cstate and context switch away.

In a truly hybrid interrupt polling scheme, we would aim to sleep until just before the request completed and then wake up in advance of the interrupt and do a quick poll to handle completion. This is tricky for ourselves at the moment as we are not recording request times, and since we allow preemption, our requests are not on as nicely ordered a timeline as IO. However, the idea is interesting, for it will certainly help us decide when busyspinning is worthwhile.

v2: Expose the spin setting via Kconfig options for easier adjustment and testing.
v3: Don't get caught sneaking in a change to the busyspin parameters.
v4: Explain more about the "hybrid interrupt polling" scheme that we want to migrate towards.

Suggested-by: Sagar Kamble <sagar.a.kamble@intel.com>
References: http://events.linuxfoundation.org/sites/events/files/slides/lemoal-nvme-polling-vault-2017-final_0.pdf
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sagar Kamble <sagar.a.kamble@intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Sagar Kamble <sagar.a.kamble@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419182625.11186-1-chris@chris-wilson.co.uk

2019-04-20 01:26:25 +07:00
			`config DRM_I915_SPIN_REQUEST`
drm/i915: Add a label for config DRM_I915_SPIN_REQUEST

If we don't give it a label, it does not appear as a configuration option.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190612093111.11684-9-chris@chris-wilson.co.uk

2019-06-12 16:31:11 +07:00
			`int "Busywait for request completion (us)"`
drm/i915: Expose the busyspin durations for i915_wait_request
2019-04-20 01:26:25 +07:00
			`default 5 # microseconds`
			`help`
			`Before sleeping waiting for a request (GPU operation) to complete,`
			`we may spend some time polling for its completion. As the IRQ may`
			`take a non-negligible time to set up, we do a short spin first to`
			`check if the request will complete in the time it would have taken`
			`us to enable the interrupt.`

			`May be 0 to disable the initial spin. In practice, we estimate`
			`the cost of enabling the interrupt (if currently disabled) to be`
			`a few microseconds.`
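
The spin-then-sleep behaviour described for `DRM_I915_SPIN_REQUEST` can be illustrated with a toy Python model. It is a sketch only: `wait_for_request`, its parameters, and the simulated clock that advances by `poll_cost_us` per poll are all hypothetical, and nothing here reflects how the driver actually measures time or polls hardware.

```python
def wait_for_request(completes_at_us: float, spin_us: int,
                     poll_cost_us: float = 1.0) -> str:
    """Model a wait: busy-poll for up to spin_us, then fall back to sleeping.

    Returns "spin" if the request finished during the busy-wait window,
    otherwise "sleep" to indicate the waiter would enable the interrupt
    and go to sleep. spin_us == 0 models the spin being disabled.
    """
    now_us = 0.0  # simulated clock, not wall time
    while now_us < spin_us:
        if now_us >= completes_at_us:
            return "spin"
        now_us += poll_cost_us  # each poll costs some simulated time
    return "sleep"
```

In this model a request finishing after 3 us is caught by the default 5 us spin, while one finishing after 50 us falls through to the sleeping path. A 500 us window, the upper figure mentioned in the commit message, would also catch a 300 us request.
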
drm/i915/gt: Try to more gracefully quiesce the system before resets

If we are doing a normal GPU reset triggered after detecting a long period of stalled work, we can take our time and allow the engines to quiesce. Since we've stopped submission to the engine, if we wait long enough an innocent context should complete, leaving the engine idle. So by waiting a short amount of time, we should prevent clobbering other users when resetting a stuck context.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Suggested-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-1-chris@chris-wilson.co.uk

2019-10-23 20:31:04 +07:00
			`config DRM_I915_STOP_TIMEOUT`
			`int "How long to wait for an engine to quiesce gracefully before reset (ms)"`
			`default 100 # milliseconds`
			`help`
			`By stopping submission and sleeping for a short time before resetting`
			`the GPU, we allow the innocent contexts also on the system to quiesce.`
			`It is then less likely for a hanging context to cause collateral`
			`damage as the system is reset in order to recover. The corollary is`
			`that the reset itself may take longer and so be more disruptive to`
			`interactive or low latency workloads.`
drm/i915/gt: Make timeslice duration configurable

Execlists uses a scheduling quantum (a timeslice) to alternate execution between ready-to-run contexts of equal priority. This ensures that all users (though only if they are of equal importance) have the opportunity to run and prevents livelocks where contexts may have implicit ordering due to userspace semaphores. However, not all workloads necessarily benefit from timeslicing and in the extreme some sysadmin may want to disable or reduce the timeslicing granularity.

The timeslicing mechanism can be compiled out^W^W disabled (but should DCE!) with

	./scripts/config --set-val DRM_I915_TIMESLICE_DURATION 0

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029091632.26281-1-chris@chris-wilson.co.uk

2019-10-29 16:16:32 +07:00
			`config DRM_I915_TIMESLICE_DURATION`
			`int "Scheduling quantum for userspace batches (ms, jiffy granularity)"`
			`default 1 # milliseconds`
			`help`
			`When two user batches of equal priority are executing, we will`
			`alternate execution of each batch to ensure forward progress of`
			`all users. This is necessary in some cases where there may be`
			`an implicit dependency between those batches that requires`
			`concurrent execution in order for them to proceed, e.g. they`
			`interact with each other via userspace semaphores. Each context`
			`is scheduled for execution for the timeslice duration, before`
			`switching to the next context.`

			`May be 0 to disable timeslicing.`
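
The alternation between equal-priority batches described for `DRM_I915_TIMESLICE_DURATION` resembles round-robin scheduling. The Python sketch below is a simplified model, not the execlists scheduler: `run_timesliced` is a hypothetical function, the work amounts are invented, and priorities, preemption and real semaphores are ignored.

```python
from collections import deque


def run_timesliced(work_ms: dict, timeslice_ms: int) -> list:
    """Return the (context, ms run) schedule for equal-priority contexts.

    timeslice_ms == 0 models timeslicing being disabled: each context
    runs to completion in submission order.
    """
    # Skip contexts with no work so they never appear in the schedule.
    queue = deque((name, ms) for name, ms in work_ms.items() if ms > 0)
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        ran = remaining if timeslice_ms == 0 else min(remaining, timeslice_ms)
        schedule.append((name, ran))
        if remaining > ran:
            # Unfinished work goes to the back of the queue.
            queue.append((name, remaining - ran))
    return schedule
```

With a 1 ms timeslice, contexts `A` (3 ms) and `B` (2 ms) interleave, so both make progress. With the timeslice set to 0, `B` does not start until `A` has finished. In this model that ordering would stall if `A` were waiting on a userspace semaphore that only `B` signals.
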