linux_dsm_epyc7002/drivers/gpu/drm
Chris Wilson 786d290cae drm/i915: Check that each request phase is completed before retiring
Trying to chase an impossible bug (ivb):

[  207.765411] [drm:i915_reset_and_wakeup [i915]] resetting chip
[  207.765734] [drm:i915_gem_reset [i915]] resetting render ring to restart from tail of request 0x4ee834
[  207.765791] [drm:intel_print_rc6_info [i915]] Enabling RC6 states: RC6 on RC6p on RC6pp off
[  207.767213] [drm:intel_guc_setup [i915]] GuC fw status: path (null), fetch NONE, load NONE
[  207.767515] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:203!
[  207.767551] invalid opcode: 0000 [#1] PREEMPT SMP
[  207.767576] Modules linked in: snd_hda_intel i915 cdc_ncm usbnet mii x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core mei_me mei snd_pcm sdhci_pci sdhci mmc_core e1000e ptp pps_core [last unloaded: i915]
[  207.767808] CPU: 3 PID: 8855 Comm: gem_ringfill Tainted: G     U          4.9.0-rc5-CI-Patchwork_3052+ #1
[  207.767854] Hardware name: LENOVO 2356GCG/2356GCG, BIOS G7ET31WW (1.13 ) 07/02/2012
[  207.767894] task: ffff88012c82a740 task.stack: ffffc9000383c000
[  207.767927] RIP: 0010:[<ffffffffa00a0a3a>]  [<ffffffffa00a0a3a>] i915_gem_request_retire+0x2a/0x4b0 [i915]
[  207.767999] RSP: 0018:ffffc9000383fb20  EFLAGS: 00010293
[  207.768027] RAX: 00000000004ee83c RBX: ffff880135dcb480 RCX: 00000000004ee83a
[  207.768062] RDX: ffff88012fea42a8 RSI: 0000000000000001 RDI: ffff88012c82af68
[  207.768095] RBP: ffffc9000383fb48 R08: 0000000000000000 R09: 0000000000000000
[  207.768129] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880135dcb480
[  207.768163] R13: ffff88012fea42a8 R14: 0000000000000000 R15: 00000000000001d8
[  207.768200] FS:  00007f955f658740(0000) GS:ffff88013e2c0000(0000) knlGS:0000000000000000
[  207.768239] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  207.768258] CR2: 0000555899725930 CR3: 00000001316f6000 CR4: 00000000001406e0
[  207.768286] Stack:
[  207.768299]  ffff880135dcb480 ffff880135dcbe00 ffff88012fea42a8 0000000000000000
[  207.768350]  00000000000001d8 ffffc9000383fb70 ffffffffa00a1339 0000000000000000
[  207.768402]  ffff88012f296c88 00000000000003f0 ffffc9000383fbb0 ffffffffa00b582d
[  207.768453] Call Trace:
[  207.768493]  [<ffffffffa00a1339>] i915_gem_request_retire_upto+0x49/0x90 [i915]
[  207.768553]  [<ffffffffa00b582d>] intel_ring_begin+0x15d/0x2d0 [i915]
[  207.768608]  [<ffffffffa00b59cb>] intel_ring_alloc_request_extras+0x2b/0x40 [i915]
[  207.768667]  [<ffffffffa00a2fd9>] i915_gem_request_alloc+0x359/0x440 [i915]
[  207.768723]  [<ffffffffa008bd03>] i915_gem_do_execbuffer.isra.15+0x783/0x1a10 [i915]
[  207.768766]  [<ffffffff811a6a2e>] ? __might_fault+0x3e/0x90
[  207.768816]  [<ffffffffa008d380>] i915_gem_execbuffer2+0xc0/0x250 [i915]
[  207.768854]  [<ffffffff815532a6>] drm_ioctl+0x1f6/0x480
[  207.768900]  [<ffffffffa008d2c0>] ? i915_gem_execbuffer+0x330/0x330 [i915]
[  207.768939]  [<ffffffff81202f6e>] do_vfs_ioctl+0x8e/0x690
[  207.768972]  [<ffffffff818193ac>] ? retint_kernel+0x2d/0x2d
[  207.769004]  [<ffffffff810d6ef2>] ? trace_hardirqs_on_caller+0x122/0x1b0
[  207.769039]  [<ffffffff812035ac>] SyS_ioctl+0x3c/0x70
[  207.769068]  [<ffffffff818189ae>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[  207.769103] Code: 90 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 8b 35 fa 7b e1 e1 85 f6 0f 85 55 03 00 00 41 8b 84 24 80 02 00 00 85 c0 75 02 <0f> 0b 49 8b 94 24 a8 00 00 00 48 8b 8a e0 01 00 00 8b 89 c0 00
[  207.769400] RIP  [<ffffffffa00a0a3a>] i915_gem_request_retire+0x2a/0x4b0 [i915]
[  207.769463]  RSP <ffffc9000383fb20>

Let's add a couple more BUG_ONs before this to ascertain that the request
did make it to hardware. The impossible part of this stacktrace is that
request must have been considered completed by the i915_request_wait()
before we tried to retire it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161118143412.26508-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-11-18 20:49:46 +00:00
..
amd drm/irq: Unexport drm_vblank_on/off 2016-11-15 23:33:48 +01:00
arc
arm Merge branch 'drm-tda998x-mali' of git://git.armlinux.org.uk/~rmk/linux-arm into drm-next 2016-11-17 08:55:26 +10:00
armada drm/armada: use DRM_FB_HELPER_DEFAULT_OPS for fb_ops 2016-11-14 07:54:13 +01:00
ast drm/ast: free correct pointer in astfb_create() error paths 2016-11-14 07:45:16 +01:00
atmel-hlcdc
bochs drm/bochs: use DRM_FB_HELPER_DEFAULT_OPS for fb_ops 2016-11-14 07:54:38 +01:00
bridge drm/bridge: analogix_dp: return error if transfer none byte 2016-11-16 11:16:13 +05:30
cirrus Merge tag 'topic/drm-misc-2016-11-10' of git://anongit.freedesktop.org/drm-intel into drm-next 2016-11-11 09:28:44 +10:00
etnaviv
exynos drm/exynos: use DRM_FB_HELPER_DEFAULT_OPS for fb_ops 2016-11-14 07:54:50 +01:00
fsl-dcu
gma500 drm/gma500: use DRM_FB_HELPER_DEFAULT_OPS for fb_ops 2016-11-14 08:01:01 +01:00
hisilicon drm: move allocation out of drm_get_format_name() 2016-11-12 14:19:38 +01:00
i2c Merge branch 'drm-tda998x-mali' of git://git.armlinux.org.uk/~rmk/linux-arm into drm-next 2016-11-17 08:55:26 +10:00
i810
i915 drm/i915: Check that each request phase is completed before retiring 2016-11-18 20:49:46 +00:00
imx drm/imx: Switch to drm_fb_cma_prepare_fb() helper 2016-11-15 08:25:06 +01:00
mediatek drm/irq: Unexport drm_vblank_count 2016-11-15 23:33:47 +01:00
mga
mgag200 Merge tag 'topic/drm-misc-2016-11-10' of git://anongit.freedesktop.org/drm-intel into drm-next 2016-11-11 09:28:44 +10:00
msm drm/msm: use DRM_FB_HELPER_DEFAULT_OPS for fb_ops 2016-11-14 07:58:04 +01:00
nouveau drm/nouveau: Use drm_crtc_vblank_off/on 2016-11-15 23:33:37 +01:00
omapdrm drm/omapdrm: use DRM_FB_HELPER_DEFAULT_OPS for fb_ops 2016-11-14 07:58:15 +01:00
panel
qxl drm/qxl: use DRM_FB_HELPER_DEFAULT_OPS for fb_ops 2016-11-14 07:55:33 +01:00
r128
radeon drm/radeon: use DRM_FB_HELPER_DEFAULT_OPS for fb_ops 2016-11-14 07:56:52 +01:00
rcar-du Merge branch 'drm/next/du' of git://linuxtv.org/pinchartl/media into drm-next 2016-11-16 09:39:21 +10:00
rockchip drm/rockchip: use DRM_FB_HELPER_DEFAULT_OPS for fb_ops 2016-11-14 07:56:47 +01:00
savage
shmobile
sis
sti Merge tag 'topic/drm-misc-2016-11-10' of git://anongit.freedesktop.org/drm-intel into drm-next 2016-11-11 09:28:44 +10:00
sun4i Merge tag 'drm-misc-next-2016-11-16' of git://anongit.freedesktop.org/git/drm-misc into drm-next 2016-11-17 08:02:46 +10:00
tdfx
tegra drm/tegra: use DRM_FB_HELPER_DEFAULT_OPS for fb_ops 2016-11-14 07:56:58 +01:00
tilcdc
ttm drm/ttm: fix ttm_bo_wait 2016-11-09 00:46:04 +05:30
udl drm/udl: use DRM_FB_HELPER_DEFAULT_OPS for fb_ops 2016-11-14 07:57:59 +01:00
vc4 This pull request brings in fragment shader threading and ETC1 support 2016-11-17 09:43:56 +10:00
vgem
via
virtio drm/virtio: use DRM_FB_HELPER_DEFAULT_OPS for fb_ops 2016-11-14 07:58:10 +01:00
vmwgfx drm: move allocation out of drm_get_format_name() 2016-11-12 14:19:38 +01:00
zte drm: zte: checking for NULL instead of IS_ERR() 2016-11-15 11:00:42 +01:00
ati_pcigart.c
drm_agpsupport.c
drm_atomic_helper.c drm/fence: add in-fences support 2016-11-16 09:55:27 +01:00
drm_atomic.c drm/fence: add out-fences support 2016-11-16 14:36:27 +01:00
drm_auth.c
drm_blend.c
drm_bridge.c
drm_bufs.c
drm_cache.c
drm_color_mgmt.c drm/color: document NULL values and default settings better 2016-11-15 22:39:48 +01:00
drm_connector.c drm: Move tile group code into drm_connector.c 2016-11-15 15:30:38 +01:00
drm_context.c
drm_crtc_helper_internal.h
drm_crtc_helper.c
drm_crtc_internal.h drm/fence: add fence timeline to drm_crtc 2016-11-16 10:42:48 +01:00
drm_crtc.c drm/fence: add out-fences support 2016-11-16 14:36:27 +01:00
drm_debugfs_crc.c
drm_debugfs.c drm/atomic: add debugfs file to dump out atomic state 2016-11-08 16:38:03 -05:00
drm_dma.c
drm_dp_aux_dev.c
drm_dp_dual_mode_helper.c
drm_dp_helper.c
drm_dp_mst_topology.c
drm_drv.c drm: Extract drm_drv.h 2016-11-15 12:50:30 +01:00
drm_dumb_buffers.c drm: Consolidate dumb buffer docs 2016-11-15 12:51:49 +01:00
drm_edid_load.c
drm_edid.c Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued 2016-11-17 14:32:57 +01:00
drm_encoder_slave.c
drm_encoder.c
drm_fb_cma_helper.c drm/fb_cma_helper: Add drm_fb_cma_prepare_fb() helper 2016-11-14 12:43:58 +01:00
drm_fb_helper.c drm/fb-helper: fix segfaults in drm_fb_helper_debug_* 2016-11-14 07:47:34 +01:00
drm_flip_work.c
drm_fops.c
drm_fourcc.c drm: move allocation out of drm_get_format_name() 2016-11-12 14:19:38 +01:00
drm_framebuffer.c drm: move allocation out of drm_get_format_name() 2016-11-12 14:19:38 +01:00
drm_gem_cma_helper.c
drm_gem.c
drm_global.c
drm_hashtab.c
drm_info.c
drm_internal.h drm: drm_irq.h header cleanup 2016-11-15 23:33:48 +01:00
drm_ioc32.c
drm_ioctl.c
drm_irq.c drm/irq: Unexport drm_vblank_on/off 2016-11-15 23:33:48 +01:00
drm_kms_helper_common.c
drm_legacy.h
drm_lock.c
drm_memory.c
drm_mipi_dsi.c
drm_mm.c
drm_mode_config.c drm/fence: add out-fences support 2016-11-16 14:36:27 +01:00
drm_mode_object.c
drm_modes.c Revert "drm: Add aspect ratio parsing in DRM layer" 2016-11-15 15:01:42 +01:00
drm_modeset_helper.c drm: move allocation out of drm_get_format_name() 2016-11-12 14:19:38 +01:00
drm_modeset_lock.c drm: don't let crtc_ww_class leak out 2016-11-15 08:33:35 +01:00
drm_of.c
drm_panel.c
drm_pci.c
drm_plane_helper.c drm: add helpers to go from plane state to drm_rect 2016-11-08 16:38:03 -05:00
drm_plane.c drm/fence: add in-fences support 2016-11-16 09:55:27 +01:00
drm_platform.c
drm_prime.c
drm_print.c drm/print: Move kerneldoc next to definition 2016-11-15 12:55:24 +01:00
drm_probe_helper.c
drm_property.c
drm_rect.c drm: helper macros to print composite types 2016-11-08 16:38:03 -05:00
drm_scatter.c
drm_simple_kms_helper.c
drm_sysfs.c
drm_trace_points.c
drm_trace.h
drm_vm.c
drm_vma_manager.c
Kconfig drm/fence: add in-fences support 2016-11-16 09:55:27 +01:00
Makefile drm: Extract drm_mode_config.[hc] 2016-11-15 15:23:29 +01:00