linux_dsm_epyc7002/drivers/gpu/drm
Emily Deng 8ee3a52e3f drm/gpu-sched: fix force APP kill hang(v4)
issue:
VMC page faults occur if an APP is force-killed during a
3dmark test. The cause is that in entity_fini() we manually signal
all the jobs in the entity's queue, which confuses the sync/dep
mechanism:

1) A page fault occurs in sdma's clear job, which operates on the
shadow buffer, because the shadow buffer's GART table is cleaned by
ttm_bo_release() once the fence in its reservation has been fake-signaled
by entity_fini() when SIGKILL is received.

2) A page fault occurs in the gfx job because, during the lifetime
of the gfx job, we manually fake-signal all jobs from its entity
in entity_fini(). The unmapping/clear-PTE job that depends on those
result fences is then satisfied, so sdma starts clearing the PTEs, which
leads to a GFX page fault.

fix:
1) At least wait in entity_fini() for all already-scheduled jobs to
complete if SIGKILL is the case.

2) If a fence signals and tries to clear some entity's dependency, set
this entity guilty to prevent its jobs from really running, since the
dependency was fake-signaled.
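
A minimal userspace sketch of fix 2), assuming simplified stand-in types. The
names struct fence, struct entity, entity_clear_dependency() and
entity_may_run_job() are hypothetical and are not the kernel's scheduler API;
the error code carried by a fake-signaled fence is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a fence: not the kernel's struct dma_fence. */
struct fence {
	bool signaled;
	int error;	/* 0 = really completed, negative = fake-signaled */
};

/* Hypothetical stand-in for a scheduler entity. */
struct entity {
	struct fence *dependency;	/* fence the next job waits on, or NULL */
	bool guilty;			/* when set, queued jobs must not run */
};

/*
 * Clear the entity's dependency once it has signaled. If the fence carries
 * an error, it was only fake-signaled, so mark the entity guilty instead of
 * letting its job run on top of work that never actually finished.
 */
void entity_clear_dependency(struct entity *e)
{
	assert(e != NULL);

	struct fence *dep = e->dependency;

	if (dep == NULL || !dep->signaled)
		return;		/* nothing to clear yet */

	if (dep->error != 0)
		e->guilty = true;

	e->dependency = NULL;
}

/* A job may run only if nothing is pending and the entity is not guilty. */
bool entity_may_run_job(const struct entity *e)
{
	assert(e != NULL);
	return e->dependency == NULL && !e->guilty;
}
```

With a fence that completed normally the entity stays runnable; with a fence
that carries an error the dependency is still cleared, but the entity is
marked guilty and its job is held back.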

v2:
Split drm_sched_entity_fini() into two functions:
1) The first one does the waiting, removes the entity from the
runqueue and returns an error when the process was killed.
2) The second one then goes over the entity, installs it as the
completion signal for the remaining jobs and signals all jobs
with an error code.
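
The two-phase split described above can be sketched in userspace as follows.
This is an illustration under stated assumptions, not the patch itself: the
types, the function names entity_do_release() and entity_cleanup(), and the
error codes -EINTR and -ESRCH are chosen for the example, and "waiting" is
modelled by simply marking in-flight jobs as done.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_JOBS 8

enum job_state { JOB_QUEUED, JOB_SCHEDULED, JOB_DONE };

struct job {
	enum job_state state;
	int error;		/* 0 on success, negative if killed */
};

/* Hypothetical stand-in for a scheduler entity. */
struct entity {
	struct job jobs[MAX_JOBS];
	size_t njobs;
	bool on_runqueue;
	int fini_status;	/* result of phase 1, consumed by phase 2 */
};

/*
 * Phase 1: wait for work, then take the entity off the runqueue.
 * If the process was killed, only jobs already scheduled are waited for
 * and an error is returned; otherwise the whole queue drains.
 */
int entity_do_release(struct entity *e, bool killed)
{
	assert(e != NULL && e->njobs <= MAX_JOBS);

	for (size_t i = 0; i < e->njobs; i++) {
		struct job *j = &e->jobs[i];

		if (j->state == JOB_SCHEDULED || (!killed && j->state == JOB_QUEUED))
			j->state = JOB_DONE;	/* models waiting for completion */
	}

	e->on_runqueue = false;
	e->fini_status = killed ? -EINTR : 0;	/* error code is an assumption */
	return e->fini_status;
}

/*
 * Phase 2: signal every job still queued with an error code.
 * Returns the number of jobs that were killed this way.
 */
size_t entity_cleanup(struct entity *e)
{
	assert(e != NULL && e->njobs <= MAX_JOBS);

	size_t killed_jobs = 0;

	if (e->fini_status == 0)
		return 0;	/* queue already drained in phase 1 */

	for (size_t i = 0; i < e->njobs; i++) {
		struct job *j = &e->jobs[i];

		if (j->state == JOB_QUEUED) {
			j->state = JOB_DONE;
			j->error = -ESRCH;	/* error code is an assumption */
			killed_jobs++;
		}
	}
	return killed_jobs;
}
```

A caller would run entity_do_release() first, perform its own teardown, and
then call entity_cleanup(). Keeping the two steps separate is what lets
other teardown work happen between them.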

v3:
1) Replace fini1 and fini2 with better names
2) Call the first part before the VM teardown in
amdgpu_driver_postclose_kms() and the second part
after the VM teardown
3) Keep the original function drm_sched_entity_fini() to
refine the code.

v4:
1) Rename entity->finished to entity->last_scheduled;
2) Rename drm_sched_entity_fini_job_cb() to
drm_sched_entity_kill_jobs_cb();
3) Pass NULL to drm_sched_entity_fini_job_cb() if -ENOENT;
4) Replace the type of entity->fini_status with "int";
5) Remove the check of entity->finished.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:17 -05:00
..
amd drm/gpu-sched: fix force APP kill hang(v4) 2018-05-15 13:43:17 -05:00
arc drm/arcpgu: remove drm_encoder_slave 2018-01-30 18:05:25 +01:00
arm drm: mali-dp: Add YUV->RGB conversion support for video layers 2018-03-14 11:41:01 +00:00
armada drm: Don't pass clip to drm_atomic_helper_check_plane_state() 2018-03-05 20:48:25 +02:00
ast Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
atmel-hlcdc drm/atmel-hlcdc: Use the alpha format field in drm_format_info 2018-01-29 12:08:37 +01:00
bochs drm/ttm: add bo as parameter to the ttm_tt_create callback 2018-03-14 14:38:27 -05:00
bridge drm/bridge: dw-hdmi: Remove unused hdmi_enable_overflow_interrupts() 2018-03-15 09:45:11 +01:00
cirrus Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
etnaviv drm/etnaviv: bump HW job limit to 4 2018-03-22 11:08:48 +01:00
exynos Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
fsl-dcu drm/fsl-dcu: Use drm_mode_config_helper_suspend/resume() 2017-12-05 13:46:41 +01:00
gma500 pci-v4.16-changes 2018-02-06 09:59:40 -08:00
hisilicon drm/ttm: add bo as parameter to the ttm_tt_create callback 2018-03-14 14:38:27 -05:00
i2c drm/i2c: tda998x: Remove duplicate NULL check 2018-01-18 16:24:38 +02:00
i810
i915 Merge tag 'drm-intel-next-fixes-2018-03-27' of git://anongit.freedesktop.org/drm/drm-intel into drm-next 2018-03-28 14:47:26 +10:00
imx Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
lib License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mediatek drm: Don't pass clip to drm_atomic_helper_check_plane_state() 2018-03-05 20:48:25 +02:00
meson Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
mga
mgag200 drm/ttm: add bo as parameter to the ttm_tt_create callback 2018-03-14 14:38:27 -05:00
msm Merge tag 'drm-msm-next-2018-03-20' of git://people.freedesktop.org/~robclark/linux into drm-next 2018-03-21 14:06:00 +10:00
mxsfb drm/mxsfb: Do not use deprecated drm_driver.{enable|disable)_vblank 2018-02-22 17:58:59 +01:00
nouveau Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
omapdrm drm/omap: fix compile error when DPI is disabled 2018-03-14 10:39:50 +02:00
panel drm/panel: rm68200: Add backlight dependency 2018-03-14 11:51:24 +01:00
pl111 drm/pl111: Use max memory bandwidth for resolution 2018-03-07 23:14:24 +01:00
qxl Merge airlied/drm-next into drm-misc-next 2018-03-21 09:40:55 -04:00
r128 r128: don't open-code memdup_user() 2017-12-27 19:00:09 -05:00
radeon drm/radeon: add PX quirk for Asus K73TK 2018-05-15 13:43:02 -05:00
rcar-du drm-misc-next for 4.17: 2018-03-14 10:59:16 +10:00
rockchip drm/rockchip: cdn-dp: remove the DP phy switch 2018-03-16 11:51:11 +01:00
savage
scheduler drm/gpu-sched: fix force APP kill hang(v4) 2018-05-15 13:43:17 -05:00
selftests Merge tag 'drm-misc-next-2017-11-30' of git://anongit.freedesktop.org/drm/drm-misc into drm-next 2017-12-04 05:42:49 +10:00
shmobile main drm pull request for v4.15 2017-11-15 20:42:10 -08:00
sis
sti drm/sti: Use drm_fb_cma_fbdev_init/fini() 2017-12-08 14:47:41 +01:00
stm drm/stm: check pitch and size calculations even if !CONFIG_MMU 2018-02-23 09:37:12 +01:00
sun4i Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
tdfx
tegra Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
tilcdc drm/tilcdc: tilcdc_panel: Rename device from "panel" to "tilcdc-panel" 2018-02-28 11:48:25 +02:00
tinydrm tinydrm: add backlight dependency 2018-02-28 15:08:56 -05:00
ttm drm/ttm: keep a reference to transfer pipelined BOs 2018-05-15 13:43:11 -05:00
tve200 drm/tve200: Do not use deprecated drm_driver.{enable|disable)_vblank 2018-02-22 17:58:59 +01:00
udl drm: udl: Properly check framebuffer mmap offsets 2018-03-22 07:59:01 +01:00
vc4 drm/vc4_validate: Remove VLA usage 2018-03-16 15:51:52 -07:00
vgem treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
via Merge tag 'drm-misc-next-2017-11-30' of git://anongit.freedesktop.org/drm/drm-misc into drm-next 2017-12-04 05:42:49 +10:00
virtio Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
vmwgfx Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
zte drm: Don't pass clip to drm_atomic_helper_check_plane_state() 2018-03-05 20:48:25 +02:00
ati_pcigart.c
drm_agpsupport.c
drm_atomic_helper.c drm: Remove now pointelss blob->data casts 2018-03-16 15:44:01 +02:00
drm_atomic.c drm: Verify gamma/degamma LUT size 2018-03-16 15:44:01 +02:00
drm_auth.c drm: Check for lessee in DROP_MASTER ioctl 2018-01-31 09:27:51 +01:00
drm_blend.c drm/docs: Align layout of optional plane blending properties 2018-02-20 12:10:46 +01:00
drm_bridge.c
drm_bufs.c drm: dma_bufs: Fixed checkpatch issues 2018-03-19 09:31:20 -04:00
drm_cache.c
drm_color_mgmt.c drm/atomic: Include color encoding/range in plane state dump 2018-03-02 14:41:21 +02:00
drm_connector.c drm/docs: Document "scaling mode" property better 2018-02-20 12:10:46 +01:00
drm_context.c
drm_crtc_helper_internal.h
drm_crtc_helper.c drm: Replace kzalloc with kcalloc 2017-10-13 15:49:03 -04:00
drm_crtc_internal.h drm/atomic: Include color encoding/range in plane state dump 2018-03-02 14:41:21 +02:00
drm_crtc.c drm: Check that the plane supports the request format+modifier combo 2018-02-26 16:29:47 +02:00
drm_debugfs_crc.c drm/crc: Add support for polling on the data fd. 2018-02-05 13:22:44 +01:00
drm_debugfs.c drm/debugfs: Fix framebuffer debugfs file init 2017-11-14 11:08:17 +02:00
drm_dma.c
drm_dp_aux_dev.c Pass mode to wait_on_atomic_t() action funcs and provide default actions 2017-11-13 15:38:16 +00:00
drm_dp_dual_mode_helper.c drm: Add retries for lspcon mode detection 2017-10-13 12:13:54 +03:00
drm_dp_helper.c drm/dp: Add HBR3 support in existing DRM DP helpers 2018-01-26 13:36:53 +02:00
drm_dp_mst_topology.c drm: NULL pointer dereference [null-pointer-deref] (CWE 476) problem 2018-02-19 12:58:20 +01:00
drm_drv.c drm: NULL pointer dereference [null-pointer-deref] (CWE 476) problem 2018-03-06 08:14:16 +01:00
drm_dumb_buffers.c
drm_edid_load.c
drm_edid.c Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
drm_encoder_slave.c
drm_encoder.c drm: Warn if plane/crtc/encoder/connector index exceeds our 32bit bitmasks 2018-01-29 18:46:53 +02:00
drm_fb_cma_helper.c drm/cma-helper: Add drm_fb_cma_fbdev_init/fini() 2017-12-08 14:27:47 +01:00
drm_fb_helper.c drm: Remove now pointelss blob->data casts 2018-03-16 15:44:01 +02:00
drm_file.c vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
drm_flip_work.c
drm_fourcc.c drm/fourcc: Add a alpha field to drm_format_info 2018-01-29 12:07:47 +01:00
drm_framebuffer.c Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
drm_gem_cma_helper.c drm: gem_cma_helper.c: Allow importing of contiguous scatterlists with nents > 1 2017-11-15 18:14:46 +01:00
drm_gem_framebuffer_helper.c drm/gem-fb-helper: drm_gem_fbdev_fb_create() make funcs optional 2017-12-08 14:26:00 +01:00
drm_gem.c drm: Use idr_init_base(1) when using id==0 for invalid 2018-02-19 12:21:24 +00:00
drm_global.c
drm_hashtab.c
drm_info.c
drm_internal.h Merge airlied/drm-next into drm-misc-next 2017-11-21 14:17:56 +01:00
drm_ioc32.c
drm_ioctl.c drm: Print the pid when debug logging an ioctl error. 2018-02-10 22:23:10 +00:00
drm_irq.c
drm_kms_helper_common.c
drm_lease.c drm: Fix kerneldoc warnings for drm_lease 2018-02-19 10:49:59 +01:00
drm_legacy.h
drm_lock.c
drm_memory.c drm: fix drm_get_max_iomem type mismatch 2018-02-22 11:18:58 -05:00
drm_mipi_dsi.c drm/dsi: Fix improper use of mipi_dsi_device_transfer() return value 2018-01-16 17:10:14 -05:00
drm_mm.c Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
drm_mode_config.c Linux 4.15-rc4 2017-12-19 21:37:24 +10:00
drm_mode_object.c drm/mode_object: fix documentation for object lookups. 2017-11-10 13:50:47 +10:00
drm_modes.c drm: Fix uabi regression by allowing garbage mode->type from userspace 2018-03-23 13:51:12 +02:00
drm_modeset_helper.c drm/modeset-helper: Add simple modeset suspend/resume helpers 2017-11-30 18:18:08 +01:00
drm_modeset_lock.c drm/atomic: Call ww_acquire_done after drm_modeset_lock_all 2018-03-05 10:35:32 +01:00
drm_of.c drm: of: simplify component probe code 2018-03-06 14:05:00 +05:30
drm_panel_orientation_quirks.c drm: Include the header with the prototype for drm_get_panel_orientation_quirk() 2018-02-26 17:39:59 +02:00
drm_panel.c
drm_pci.c drm/core: clean up references to drm_dev_unref() 2017-09-27 10:53:12 +02:00
drm_plane_helper.c drm: Don't pass clip to drm_atomic_helper_check_plane_state() 2018-03-05 20:48:25 +02:00
drm_plane.c Merge airlied/drm-next into drm-misc-next 2018-03-21 09:40:55 -04:00
drm_prime.c drm/prime: make the pages array optional for drm_prime_sg_to_page_addr_arrays 2018-03-06 12:24:52 -05:00
drm_print.c drm: Reduce object size of DRM_DEV_<LEVEL> uses 2018-03-19 15:15:42 +01:00
drm_probe_helper.c Linux 4.16-rc7 2018-03-28 14:30:41 +10:00
drm_property.c Revert "drm: Use a flexible array member for blob property data" 2018-03-16 15:44:01 +02:00
drm_rect.c
drm_scatter.c
drm_scdc_helper.c Merge tag 'drm-misc-next-2017-09-20' of git://anongit.freedesktop.org/git/drm-misc into drm-next 2017-09-28 05:46:15 +10:00
drm_simple_kms_helper.c drm: Don't pass clip to drm_atomic_helper_check_plane_state() 2018-03-05 20:48:25 +02:00
drm_syncobj.c drm: Use idr_init_base(1) when using id==0 for invalid 2018-02-19 12:21:24 +00:00
drm_sysfs.c
drm_trace_points.c
drm_trace.h main drm pull request for v4.15 2017-11-15 20:42:10 -08:00
drm_vblank.c Merge tag 'drm-intel-next-2018-03-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-next 2018-03-14 14:53:01 +10:00
drm_vm.c
drm_vma_manager.c drm/drm_vma_manager.c: Remove useless goto statement 2017-11-02 10:44:08 +01:00
Kconfig Fixes for 4.16: 2018-01-25 11:42:25 +10:00
Makefile drm: fix gpu scheduler link order 2018-01-24 15:49:04 -05:00