linux_dsm_epyc7002/drivers/gpu/drm/i915
Chris Wilson 6e4930f6ee drm/i915: Flush GPU rendering with a lockless wait during a pagefault
Arjan van de Ven reported that on his test machine he was seeing
stalls of more than one frame, greatly impacting the user experience. He
tracked the culprit down to the locked flush during a pagefault, which
hogs the struct_mutex and so blocks any other user from
proceeding. Stalling on a pagefault is bad behaviour on userspace's
part; for one, it means that they are ignoring the coherency rules on
pointer access through the GTT. Fortunately, we can apply the same
trick as the set-to-domain ioctl and do a lightweight, nonblocking flush
of outstanding rendering first.
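
The ordering described above can be modelled in a few lines of userspace C.
This is only a single-threaded sketch under stated assumptions, not the
driver code: every toy_* name is hypothetical, the "lock" is a plain flag
rather than a real mutex, and the simulated GPU retires one seqno per poll.
It shows the shape of the trick: snapshot the seqno under the lock, drop the
lock while waiting, retake it, and only then run the locked flush, which by
that point should find nothing left to wait for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real i915 structures. */
struct toy_device {
    bool     struct_mutex_held;   /* models dev->struct_mutex as a flag */
    uint32_t completed_seqno;     /* last seqno the simulated GPU finished */
    unsigned polls_while_locked;  /* wait iterations spent holding the lock */
    unsigned polls_unlocked;      /* wait iterations with the lock dropped */
};

struct toy_object {
    uint32_t last_write_seqno;    /* 0 means no outstanding rendering */
};

static void toy_lock(struct toy_device *dev)
{
    assert(!dev->struct_mutex_held);
    dev->struct_mutex_held = true;
}

static void toy_unlock(struct toy_device *dev)
{
    assert(dev->struct_mutex_held);
    dev->struct_mutex_held = false;
}

/* Poll for completion; the simulated GPU retires one seqno per poll. */
static void toy_wait_seqno(struct toy_device *dev, uint32_t seqno)
{
    while (dev->completed_seqno < seqno) {
        dev->completed_seqno++;
        if (dev->struct_mutex_held)
            dev->polls_while_locked++;
        else
            dev->polls_unlocked++;
    }
}

/* Called with the lock held; drops it for the duration of the wait. */
static void toy_wait_rendering_nonblocking(struct toy_device *dev,
                                           struct toy_object *obj)
{
    uint32_t seqno = obj->last_write_seqno; /* snapshot under the lock */

    if (seqno == 0)
        return;

    toy_unlock(dev);
    toy_wait_seqno(dev, seqno);             /* other users may lock here */
    toy_lock(dev);

    /* Only mark idle if no newer rendering was queued in the meantime. */
    if (obj->last_write_seqno == seqno)
        obj->last_write_seqno = 0;
}

/* The locked flush: previously the expensive step, now usually a no-op. */
static void toy_set_to_gtt_domain(struct toy_device *dev,
                                  struct toy_object *obj)
{
    assert(dev->struct_mutex_held);
    if (obj->last_write_seqno != 0) {
        toy_wait_seqno(dev, obj->last_write_seqno);
        obj->last_write_seqno = 0;
    }
}

/* Fault path: lightweight wait first, then the locked work. */
static void toy_fault(struct toy_device *dev, struct toy_object *obj)
{
    toy_lock(dev);
    toy_wait_rendering_nonblocking(dev, obj);
    toy_set_to_gtt_domain(dev, obj);
    toy_unlock(dev);
}
```

Calling toy_fault() on an object with outstanding rendering spends all of
its wait iterations with the flag cleared, which is the property the patch
is after: other callers are no longer blocked on the lock while the fault
handler waits for the GPU.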

"Prior to the patch it looks like this
(this one testrun does not show the 20ms+ I've seen occasionally)

   4.99 ms     2.36 ms    31360  __wait_seqno i915_wait_seqno i915_gem_object_wait_rendering i915_gem_object_set_to_gtt_domain i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_fault do_page_fault page_fault
   4.99 ms     2.75 ms   107751  __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   4.99 ms     1.63 ms     1666  i915_mutex_lock_interruptible i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_fault do_page_fault page_fault
   4.93 ms     2.45 ms      980  i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   4.89 ms     2.20 ms     3283  i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   4.34 ms     1.66 ms     1715  i915_mutex_lock_interruptible i915_gem_pwrite_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   3.73 ms     3.73 ms       49  i915_mutex_lock_interruptible i915_gem_set_domain_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   3.17 ms     0.33 ms      931  i915_mutex_lock_interruptible i915_gem_madvise_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   2.97 ms     0.43 ms     1029  i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   2.55 ms     0.51 ms      735  i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret

After the patch it looks like this:

   4.99 ms     2.14 ms    22212  __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   4.86 ms     0.99 ms    14170  __wait_seqno i915_gem_object_wait_rendering__nonblocking i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_fault do_page_fault page_fault
   3.59 ms     1.31 ms      325  i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   3.37 ms     3.37 ms       65  i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   2.58 ms     2.58 ms       65  i915_mutex_lock_interruptible i915_gem_do_execbuffer.isra.23 i915_gem_execbuffer2 drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   2.19 ms     2.19 ms       65  i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   2.18 ms     2.18 ms       65  i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret
   1.66 ms     1.66 ms       65  i915_gem_set_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret

It may not look like it, but this is quite a large difference, and I've
been unable to reproduce > 5 msec delays at all, whereas before they did
happen (just not in the trace above)."

gem_gtt_hog on an old Pineview (GMA3150),
before: 4969.119ms
after:  4122.749ms

Reported-by: Arjan van de Ven <arjan.van.de.ven@intel.com>
Testcase: igt/gem_gtt_hog
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-12 18:52:59 +01:00
dvo_ch7xxx.c
dvo_ch7017.c
dvo_ivch.c
dvo_ns2501.c drm/i915/ns2501: Rip out the reenable hack 2013-11-04 16:32:31 +01:00
dvo_sil164.c
dvo_tfp410.c
dvo.h
i915_debugfs.c drm/i915: Restore rps/rc6 on reset 2014-02-07 10:25:10 +01:00
i915_dma.c drm/i915: Move num_plane to the intel_device_info structure 2014-02-12 18:52:51 +01:00
i915_drv.c drm/i915: Restore rps/rc6 on reset 2014-02-07 10:25:10 +01:00
i915_drv.h drm/i915: vlv: handle only enabled pipestat interrupt events 2014-02-12 18:52:59 +01:00
i915_gem_context.c drm/i915: check for oom when allocating private_default_ctx 2014-02-04 12:10:26 +01:00
i915_gem_debug.c
i915_gem_dmabuf.c drm/i915: Pin pages whilst allocating for dma-buf vmap() 2013-11-29 15:51:20 +01:00
i915_gem_evict.c drm/i915: Kerneldoc for i915_gem_evict.c 2014-01-29 22:19:17 +01:00
i915_gem_execbuffer.c drm/i915: move module parameters into a struct, in a new file 2014-01-27 17:16:45 +01:00
i915_gem_gtt.c drm/i915: move module parameters into a struct, in a new file 2014-01-27 17:16:45 +01:00
i915_gem_stolen.c drm/i915: Fix the offset issue for the stolen GEM objects 2014-01-28 09:04:42 +01:00
i915_gem_tiling.c drm/i915: Make pin count per VMA 2013-12-18 15:27:49 +01:00
i915_gem.c drm/i915: Flush GPU rendering with a lockless wait during a pagefault 2014-02-12 18:52:59 +01:00
i915_gpu_error.c drm/i915: Generate a hang error code 2014-02-05 17:17:10 +01:00
i915_ioc32.c
i915_irq.c drm/i915: vlv: handle only enabled pipestat interrupt events 2014-02-12 18:52:59 +01:00
i915_params.c drm/i915: drop i915_ prefix from enable_rc6, enable_fbc, enable_ppgtt parameters 2014-01-27 17:24:03 +01:00
i915_reg.h drm/i915: vlv: handle only enabled pipestat interrupt events 2014-02-12 18:52:59 +01:00
i915_suspend.c drm/i915: Kill most of the FBC register save/restore 2014-01-25 21:17:03 +01:00
i915_sysfs.c drm/i915: Update rps interrupt limits 2014-02-07 10:26:17 +01:00
i915_trace_points.c
i915_trace.h
i915_ums.c drm/i915: Only restore backlight combination mode reg for ums 2014-01-24 17:22:45 +01:00
intel_acpi.c ACPI: Eliminate the DEVICE_ACPI_HANDLE() macro 2013-11-14 23:17:21 +01:00
intel_bios.c drm/i915: move module parameters into a struct, in a new file 2014-01-27 17:16:45 +01:00
intel_bios.h drm/i915: parse backlight modulation frequency from the BIOS VBT 2013-12-16 10:02:48 +01:00
intel_crt.c drm/i915: Shuffle modeset reset handling around 2014-01-24 17:22:52 +01:00
intel_ddi.c drm/i915: Consolidate FUSE_STRAP in one set of defines 2014-02-12 18:52:52 +01:00
intel_display.c drm/i915: Short-circuit no-op vga_set_state() 2014-02-12 18:52:56 +01:00
intel_dp.c drm/i915: fix initial timestamps for PP sequencing logic 2014-01-29 20:46:05 +01:00
intel_drv.h drm/i915: alloc intel_fb in the intel_fbdev struct 2014-02-12 18:52:55 +01:00
intel_dsi_cmd.c
intel_dsi_cmd.h
intel_dsi_pll.c drm/i915: Try harder to get best m, n, p values with minimal error 2013-12-11 23:52:18 +01:00
intel_dsi.c drm/i915: Parametrize the dphy and other spec specific parameters 2013-12-11 23:52:20 +01:00
intel_dsi.h drm/i915: Parametrize the dphy and other spec specific parameters 2013-12-11 23:52:20 +01:00
intel_dvo.c drm/i915: Return a drm_mode_status enum in the mode_valid vfuncs 2013-11-28 16:49:33 +01:00
intel_fbdev.c drm/i915: alloc intel_fb in the intel_fbdev struct 2014-02-12 18:52:55 +01:00
intel_hdmi.c drm/i915: Reorganize display pipe register accesses 2014-02-05 00:46:08 +01:00
intel_i2c.c drm/i915/vlv: split CCK and DDR freq usage 2013-11-05 19:28:47 +01:00
intel_lvds.c drm/i915: move module parameters into a struct, in a new file 2014-01-27 17:16:45 +01:00
intel_modes.c
intel_opregion.c drm/i915: Eliminate lots of WARNs when there's no backlight present 2014-01-22 10:34:38 +01:00
intel_overlay.c Merge branch 'topic/ppgtt' into drm-intel-next-queued 2014-01-25 21:14:57 +01:00
intel_panel.c drm/i915: move module parameters into a struct, in a new file 2014-01-27 17:16:45 +01:00
intel_pm.c drm/i915: Always use INTEL_INFO() to access the device_info structure 2014-02-12 18:52:50 +01:00
intel_ringbuffer.c drm/i915: Prevent recursion by retiring requests when the ring is full 2014-02-06 17:43:13 +01:00
intel_ringbuffer.h drm/i915: Use hangcheck score to find guilty context 2014-02-04 11:57:24 +01:00
intel_sdvo_regs.h drm/i915: use __packed instead of __attribute__((packed)) 2013-12-03 18:19:49 +01:00
intel_sdvo.c drm/i915: Don't cast away const from infoframe buffer 2013-12-10 14:49:04 +01:00
intel_sideband.c drm/i915: Use FLISDSI interface for band gap reset 2013-12-11 23:52:17 +01:00
intel_sprite.c drm/i915: Shuffle sprite register writes into a tighter group 2014-01-24 17:22:53 +01:00
intel_tv.c drm/i915: pass status instead of enable flags to i915_enable_pipestat 2014-02-12 18:52:57 +01:00
intel_uncore.c Merge branch 'topic/ppgtt' into drm-intel-next-queued 2014-01-25 21:14:57 +01:00
Kconfig i915, fbdev: Fix Kconfig typo 2013-11-21 21:59:02 +01:00
Makefile drm/i915: move module parameters into a struct, in a new file 2014-01-27 17:16:45 +01:00