mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2024-12-28 11:18:45 +07:00
10e36489ab
If we skip the reset as we found the engine inactive at the time of the
reset, we still need to clear the residual inflight & pending request
bookkeeping to reflect the current state of HW.
Otherwise, we may end up stuck in a loop like:
<7> [416.490346] hangcheck rcs0
<7> [416.490371] hangcheck Awake? 1
<7> [416.490376] hangcheck Hangcheck: 8003 ms ago
<7> [416.490380] hangcheck Reset count: 0 (global 0)
<7> [416.490383] hangcheck Requests:
<7> [416.491210] hangcheck RING_START: 0x0017b000
<7> [416.491983] hangcheck RING_HEAD: 0x00000048
<7> [416.491992] hangcheck RING_TAIL: 0x00000048
<7> [416.492006] hangcheck RING_CTL: 0x00000000
<7> [416.492037] hangcheck RING_MODE: 0x00000200 [idle]
<7> [416.492044] hangcheck RING_IMR: 00000000
<7> [416.492809] hangcheck ACTHD: 0x00000000_9ca00048
<7> [416.492824] hangcheck BBADDR: 0x00000000_00001004
<7> [416.492838] hangcheck DMA_FADDR: 0x00000000_00000000
<7> [416.492845] hangcheck IPEIR: 0x00000000
<7> [416.492852] hangcheck IPEHR: 0x00000000
<7> [416.492863] hangcheck Execlist status: 0x00018001 00000000, entries 12
<7> [416.492869] hangcheck Execlist CSB read 1, write 1, tasklet queued? no (enabled)
<7> [416.492938] hangcheck Pending[0] ring:{start:0017b000, hwsp:fedf9000, seqno:00016fd6}, rq: 20ffa:16fd6!+ prio=-4094 @ 8307ms: signaled
<7> [416.492972] hangcheck Queue priority hint: -4093
<7> [416.492979] hangcheck Q 20ffa:16fd8- prio=-4093 @ 8307ms: [i915]
<7> [416.492985] hangcheck Q 20ffa:16fda prio=-4094 @ 8307ms: [i915]
<7> [416.492990] hangcheck Q 20ffa:16fdc prio=-4094 @ 8307ms: [i915]
<7> [416.492996] hangcheck Q 20ffa:16fde prio=-4094 @ 8307ms: [i915]
<7> [416.493001] hangcheck Q 20ffa:16fe0 prio=-4094 @ 8307ms: [i915]
<7> [416.493007] hangcheck Q 20ffa:16fe2 prio=-4094 @ 8307ms: [i915]
<7> [416.493013] hangcheck Q 20ffa:16fe4 prio=-4094 @ 8307ms: [i915]
<7> [416.493021] hangcheck ...skipping 21 queued requests...
<7> [416.493027] hangcheck Q 20ffa:17010 prio=-4094 @ 8307ms: [i915]
<7> [416.493081] hangcheck HWSP:
<7> [416.493089] hangcheck [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [416.493094] hangcheck *
<7> [416.493100] hangcheck [0040] 10008002 00000000 10000018 00000000 10000018 00000000 10000001 00000000
<7> [416.493106] hangcheck [0060] 10000018 00000000 10000001 00000000 10000018 00000000 10000001 00000000
<7> [416.493111] hangcheck *
<7> [416.493117] hangcheck [00a0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
<7> [416.493123] hangcheck [00c0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [416.493127] hangcheck *
<7> [416.493132] hangcheck Idle? no
<6> [416.512124] i915 0000:00:02.0: GPU HANG: ecode 11:0:0x00000000, hang on rcs0
<6> [416.512205] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6> [416.512207] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6> [416.512208] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6> [416.512210] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6> [416.512212] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<5> [416.513602] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
<7> [424.489258] hangcheck rcs0
<7> [424.489263] hangcheck Awake? 1
<7> [424.489267] hangcheck Hangcheck: 5954 ms ago
<7> [424.489271] hangcheck Reset count: 1 (global 0)
<7> [424.489274] hangcheck Requests:
<7> [424.490128] hangcheck RING_START: 0x00000000
<7> [424.490870] hangcheck RING_HEAD: 0x00000000
<7> [424.490877] hangcheck RING_TAIL: 0x00000000
<7> [424.490887] hangcheck RING_CTL: 0x00000000
<7> [424.490897] hangcheck RING_MODE: 0x00000200 [idle]
<7> [424.490904] hangcheck RING_IMR: 00000000
<7> [424.490917] hangcheck ACTHD: 0x00000000_00000000
<7> [424.490930] hangcheck BBADDR: 0x00000000_00000000
<7> [424.490943] hangcheck DMA_FADDR: 0x00000000_00000000
<7> [424.490950] hangcheck IPEIR: 0x00000000
<7> [424.490956] hangcheck IPEHR: 0x00000000
<7> [424.490968] hangcheck Execlist status: 0x00000001 00000000, entries 12
<7> [424.490972] hangcheck Execlist CSB read 11, write 11, tasklet queued? no (enabled)
<7> [424.490983] hangcheck Pending[0] ring:{start:0017b000, hwsp:fedf9000, seqno:00016fd6}, rq: 20ffa:16fd6!+ prio=-4094 @ 16305ms: signaled
<7> [424.490989] hangcheck Queue priority hint: -4093
<7> [424.490996] hangcheck Q 20ffa:16fd8- prio=-4093 @ 16305ms: [i915]
<7> [424.491001] hangcheck Q 20ffa:16fda prio=-4094 @ 16305ms: [i915]
<7> [424.491006] hangcheck Q 20ffa:16fdc prio=-4094 @ 16305ms: [i915]
<7> [424.491011] hangcheck Q 20ffa:16fde prio=-4094 @ 16305ms: [i915]
<7> [424.491016] hangcheck Q 20ffa:16fe0 prio=-4094 @ 16305ms: [i915]
<7> [424.491022] hangcheck Q 20ffa:16fe2 prio=-4094 @ 16305ms: [i915]
<7> [424.491048] hangcheck Q 20ffa:16fe4 prio=-4094 @ 16305ms: [i915]
<7> [424.491057] hangcheck ...skipping 21 queued requests...
<7> [424.491063] hangcheck Q 20ffa:17010 prio=-4094 @ 16305ms: [i915]
<7> [424.491095] hangcheck HWSP:
<7> [424.491102] hangcheck [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [424.491106] hangcheck *
<7> [424.491113] hangcheck [0040] 10008002 00000000 10000018 00000000 10000018 00000000 10000001 00000000
<7> [424.491118] hangcheck [0060] 10000018 00000000 10000001 00000000 10000018 00000000 10000001 00000000
<7> [424.491122] hangcheck *
<7> [424.491127] hangcheck [00a0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 0000000b
<7> [424.491133] hangcheck [00c0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [424.491136] hangcheck *
<7> [424.491141] hangcheck Idle? no
<5> [424.491834] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Where not having cleared the pending array on reset, it persists
indefinitely.
Fixes:
|
||
---|---|---|
.. | ||
amd | ||
arc | ||
arm | ||
armada | ||
aspeed | ||
ast | ||
atmel-hlcdc | ||
bochs | ||
bridge | ||
cirrus | ||
etnaviv | ||
exynos | ||
fsl-dcu | ||
gma500 | ||
hisilicon | ||
i2c | ||
i810 | ||
i915 | ||
imx | ||
ingenic | ||
lib | ||
lima | ||
mcde | ||
mediatek | ||
meson | ||
mga | ||
mgag200 | ||
msm | ||
mxsfb | ||
nouveau | ||
omapdrm | ||
panel | ||
panfrost | ||
pl111 | ||
qxl | ||
r128 | ||
radeon | ||
rcar-du | ||
rockchip | ||
savage | ||
scheduler | ||
selftests | ||
shmobile | ||
sis | ||
sti | ||
stm | ||
sun4i | ||
tdfx | ||
tegra | ||
tilcdc | ||
tinydrm | ||
ttm | ||
tve200 | ||
udl | ||
v3d | ||
vboxvideo | ||
vc4 | ||
vgem | ||
via | ||
virtio | ||
vkms | ||
vmwgfx | ||
xen | ||
zte | ||
ati_pcigart.c | ||
drm_agpsupport.c | ||
drm_atomic_helper.c | ||
drm_atomic_state_helper.c | ||
drm_atomic_uapi.c | ||
drm_atomic.c | ||
drm_auth.c | ||
drm_blend.c | ||
drm_bridge.c | ||
drm_bufs.c | ||
drm_cache.c | ||
drm_client_modeset.c | ||
drm_client.c | ||
drm_color_mgmt.c | ||
drm_connector.c | ||
drm_context.c | ||
drm_crtc_helper_internal.h | ||
drm_crtc_helper.c | ||
drm_crtc_internal.h | ||
drm_crtc.c | ||
drm_damage_helper.c | ||
drm_debugfs_crc.c | ||
drm_debugfs.c | ||
drm_dma.c | ||
drm_dp_aux_dev.c | ||
drm_dp_cec.c | ||
drm_dp_dual_mode_helper.c | ||
drm_dp_helper.c | ||
drm_dp_mst_topology.c | ||
drm_drv.c | ||
drm_dsc.c | ||
drm_dumb_buffers.c | ||
drm_edid_load.c | ||
drm_edid.c | ||
drm_encoder_slave.c | ||
drm_encoder.c | ||
drm_fb_cma_helper.c | ||
drm_fb_helper.c | ||
drm_file.c | ||
drm_flip_work.c | ||
drm_format_helper.c | ||
drm_fourcc.c | ||
drm_framebuffer.c | ||
drm_gem_cma_helper.c | ||
drm_gem_framebuffer_helper.c | ||
drm_gem_shmem_helper.c | ||
drm_gem_vram_helper.c | ||
drm_gem.c | ||
drm_hashtab.c | ||
drm_hdcp.c | ||
drm_internal.h | ||
drm_ioc32.c | ||
drm_ioctl.c | ||
drm_irq.c | ||
drm_kms_helper_common.c | ||
drm_lease.c | ||
drm_legacy_misc.c | ||
drm_legacy.h | ||
drm_lock.c | ||
drm_memory.c | ||
drm_mipi_dsi.c | ||
drm_mm.c | ||
drm_mode_config.c | ||
drm_mode_object.c | ||
drm_modes.c | ||
drm_modeset_helper.c | ||
drm_modeset_lock.c | ||
drm_of.c | ||
drm_panel_orientation_quirks.c | ||
drm_panel.c | ||
drm_pci.c | ||
drm_plane_helper.c | ||
drm_plane.c | ||
drm_prime.c | ||
drm_print.c | ||
drm_probe_helper.c | ||
drm_property.c | ||
drm_rect.c | ||
drm_scatter.c | ||
drm_scdc_helper.c | ||
drm_self_refresh_helper.c | ||
drm_simple_kms_helper.c | ||
drm_syncobj.c | ||
drm_sysfs.c | ||
drm_trace_points.c | ||
drm_trace.h | ||
drm_vblank.c | ||
drm_vm.c | ||
drm_vma_manager.c | ||
drm_vram_helper_common.c | ||
drm_vram_mm_helper.c | ||
drm_writeback.c | ||
Kconfig | ||
Makefile |