linux_dsm_epyc7002/drivers/gpu/drm
Daniel Vetter 122f46bada drm/i915: fix gpu hang vs. flip stall deadlocks
Since we've started to clean up pending flips when the gpu hangs in

commit 96a02917a0
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Mon Feb 18 19:08:49 2013 +0200

    drm/i915: Finish page flips and update primary planes after a GPU reset

the gpu reset work now also grabs modeset locks. But since work items
on our private work queue are not allowed to do that due to the
flush_workqueue from the pageflip code this results in a neat
deadlock:

INFO: task kms_flip:14676 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kms_flip        D ffff88019283a5c0     0 14676  13344 0x00000004
 ffff88018e62dbf8 0000000000000046 ffff88013bdb12e0 ffff88018e62dfd8
 ffff88018e62dfd8 00000000001d3b00 ffff88019283a5c0 ffff88018ec21000
 ffff88018f693f00 ffff88018eece000 ffff88018e62dd60 ffff88018eece898
Call Trace:
 [<ffffffff8138ee7b>] schedule+0x60/0x62
 [<ffffffffa046c0dd>] intel_crtc_wait_for_pending_flips+0xb2/0x114 [i915]
 [<ffffffff81050ff4>] ? finish_wait+0x60/0x60
 [<ffffffffa0478041>] intel_crtc_set_config+0x7f3/0x81e [i915]
 [<ffffffffa031780a>] drm_mode_set_config_internal+0x4f/0xc6 [drm]
 [<ffffffffa0319cf3>] drm_mode_setcrtc+0x44d/0x4f9 [drm]
 [<ffffffff810e44da>] ? might_fault+0x38/0x86
 [<ffffffffa030d51f>] drm_ioctl+0x2f9/0x447 [drm]
 [<ffffffff8107a722>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffffa03198a6>] ? drm_mode_setplane+0x343/0x343 [drm]
 [<ffffffff8112222f>] ? mntput_no_expire+0x3e/0x13d
 [<ffffffff81117f33>] vfs_ioctl+0x18/0x34
 [<ffffffff81118776>] do_vfs_ioctl+0x396/0x454
 [<ffffffff81396b37>] ? sysret_check+0x1b/0x56
 [<ffffffff81118886>] SyS_ioctl+0x52/0x7d
 [<ffffffff81396b12>] system_call_fastpath+0x16/0x1b
2 locks held by kms_flip/14676:
 #0:  (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa0316545>] drm_modeset_lock_all+0x22/0x59 [drm]
 #1:  (&crtc->mutex){+.+.+.}, at: [<ffffffffa031656b>] drm_modeset_lock_all+0x48/0x59 [drm]
INFO: task kworker/u8:4:175 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u8:4    D ffff88018de9a5c0     0   175      2 0x00000000
Workqueue: i915 i915_error_work_func [i915]
 ffff88018e37dc30 0000000000000046 ffff8801938ab8a0 ffff88018e37dfd8
 ffff88018e37dfd8 00000000001d3b00 ffff88018de9a5c0 ffff88018ec21018
 0000000000000246 ffff88018e37dca0 000000005a865a86 ffff88018de9a5c0
Call Trace:
 [<ffffffff8138ee7b>] schedule+0x60/0x62
 [<ffffffff8138f23d>] schedule_preempt_disabled+0x9/0xb
 [<ffffffff8138d0cd>] mutex_lock_nested+0x205/0x3b1
 [<ffffffffa0477094>] ? intel_display_handle_reset+0x7e/0xbd [i915]
 [<ffffffffa0477094>] ? intel_display_handle_reset+0x7e/0xbd [i915]
 [<ffffffffa0477094>] intel_display_handle_reset+0x7e/0xbd [i915]
 [<ffffffffa044e0a2>] i915_error_work_func+0x128/0x147 [i915]
 [<ffffffff8104a89a>] process_one_work+0x1d4/0x35a
 [<ffffffff8104a821>] ? process_one_work+0x15b/0x35a
 [<ffffffff8104b4a5>] worker_thread+0x144/0x1f0
 [<ffffffff8104b361>] ? rescuer_thread+0x275/0x275
 [<ffffffff8105076d>] kthread+0xac/0xb4
 [<ffffffff81059d30>] ? finish_task_switch+0x3b/0xc0
 [<ffffffff810506c1>] ? __kthread_parkme+0x60/0x60
 [<ffffffff81396a6c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810506c1>] ? __kthread_parkme+0x60/0x60
3 locks held by kworker/u8:4/175:
 #0:  (i915){.+.+.+}, at: [<ffffffff8104a821>] process_one_work+0x15b/0x35a
 #1:  ((&dev_priv->gpu_error.work)){+.+.+.}, at: [<ffffffff8104a821>] process_one_work+0x15b/0x35a
 #2:  (&crtc->mutex){+.+.+.}, at: [<ffffffffa0477094>] intel_display_handle_reset+0x7e/0xbd [i915]

This blew up while running kms_flip/flip-vs-panning-vs-hang-interruptible
on one of my older machines.

Unfortunately (despite the proper lockdep annotations for
flush_workqueue) lockdep still doesn't detect this correctly, so we
need to rely on chance to discover these bugs.

Apply the usual bugfix and schedule the reset work on the system
workqueue to keep our own driver workqueue free of any modeset lock
grabbing.

Note that this is not a terribly serious regression since before the
offending commit we'd simply have stalled userspace forever due to
failing to abort all outstanding pageflips.

v2: Add a comment as requested by Chris.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-09-05 14:48:04 +02:00
..
ast Merge branch 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux into drm-next 2013-09-02 09:31:40 +10:00
cirrus Merge branch 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux into drm-next 2013-09-02 09:31:40 +10:00
exynos Merge branch 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux into drm-next 2013-09-02 09:31:40 +10:00
gma500 Merge branch 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux into drm-next 2013-09-02 09:31:40 +10:00
i2c drm/i2c: tda998x: prepare for broken sync workaround 2013-08-19 09:10:48 +10:00
i810 drm: rip out drm_core_has_MTRR checks 2013-08-19 14:11:44 +10:00
i915 drm/i915: fix gpu hang vs. flip stall deadlocks 2013-09-05 14:48:04 +02:00
mga drm: rip out drm_core_has_MTRR checks 2013-08-19 14:11:44 +10:00
mgag200 Merge branch 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux into drm-next 2013-09-02 09:31:40 +10:00
msm drm/msm: convert to drm_bridge 2013-09-02 10:23:35 +10:00
nouveau drm/nouveau: Support render nodes 2013-09-02 10:51:47 +10:00
omapdrm drm: Pass page flip ioctl flags to driver 2013-08-30 09:24:54 +10:00
qxl drm: verify vma access in TTM+GEM drivers 2013-08-27 11:54:58 +10:00
r128 drm: rip out drm_core_has_MTRR checks 2013-08-19 14:11:44 +10:00
radeon drm/radeon: support render nodes 2013-09-02 10:51:53 +10:00
rcar-du drm: Pass page flip ioctl flags to driver 2013-08-30 09:24:54 +10:00
savage drm: rip out drm_core_has_MTRR checks 2013-08-19 14:11:44 +10:00
shmobile drm: Pass page flip ioctl flags to driver 2013-08-30 09:24:54 +10:00
sis drm: rip out drm_core_has_MTRR checks 2013-08-19 14:11:44 +10:00
tdfx drm: rip out drm_core_has_MTRR checks 2013-08-19 14:11:44 +10:00
tilcdc drm: Pass page flip ioctl flags to driver 2013-08-30 09:24:54 +10:00
ttm drm/ttm: kill unused functions 2013-08-19 09:36:12 +10:00
udl drm/udl: use gem get/put page helpers 2013-08-19 10:36:12 +10:00
via drm: rip out drm_core_has_MTRR checks 2013-08-19 14:11:44 +10:00
vmwgfx Merge branch 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux into drm-next 2013-09-02 09:31:40 +10:00
ati_pcigart.c
drm_agpsupport.c drm/agp: move AGP cleanup paths to drm_agpsupport.c 2013-08-07 10:14:24 +10:00
drm_auth.c
drm_buffer.c
drm_bufs.c drm: remove the dma_ioctl special-case 2013-08-19 14:15:50 +10:00
drm_cache.c lib/scatterlist: sg_page_iter: support sg lists w/o backing pages 2013-03-27 17:13:44 +01:00
drm_context.c drm: mark context support as a legacy subsystem 2013-08-19 10:04:48 +10:00
drm_crtc_helper.c drm: Add drm_bridge 2013-09-02 10:23:26 +10:00
drm_crtc.c drm: fix DRM_IOCTL_MODE_GETFB handle-leak 2013-09-02 10:51:36 +10:00
drm_debugfs.c
drm_dma.c drm: mark dma setup/teardown as legacy systems 2013-08-19 10:04:21 +10:00
drm_dp_helper.c
drm_drv.c drm: implement experimental render nodes 2013-08-30 08:43:57 +10:00
drm_edid_load.c drm: avoid warning in drm_load_edid_firmware() 2013-07-10 14:21:46 -07:00
drm_edid.c Merge branch 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux into drm-next 2013-09-02 09:31:40 +10:00
drm_encoder_slave.c drm: refactor call to request_module 2013-05-10 14:46:03 +10:00
drm_fb_cma_helper.c drm: Make drm_fb_cma_describe() static 2013-08-21 12:47:41 +10:00
drm_fb_helper.c drm/fb-helper: Make load_lut and gamma_set/gamma_get hooks optional 2013-06-17 19:42:47 +10:00
drm_flip_work.c drm: add flip-work helper 2013-08-19 10:32:26 +10:00
drm_fops.c drm: implement experimental render nodes 2013-08-30 08:43:57 +10:00
drm_gem_cma_helper.c drm/gem: create drm_gem_dumb_destroy 2013-08-07 09:59:24 +10:00
drm_gem.c drm/prime: Remove PRIME handles only if supported 2013-08-30 09:11:59 +10:00
drm_global.c
drm_hashtab.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
drm_info.c drm/gem: switch dev->object_name_lock to a mutex 2013-08-21 12:58:01 +10:00
drm_ioc32.c
drm_ioctl.c drm: Advertise async page flip ability through GETCAP ioctl 2013-08-30 09:25:13 +10:00
drm_irq.c drm: Don't pass negative delta to ktime_sub_ns() 2013-08-08 09:50:25 +10:00
drm_lock.c
drm_memory.c drm/memory: don't export agp helpers 2013-08-19 10:05:53 +10:00
drm_mm.c Merge tag 'drm-intel-next-2013-08-23' of git://people.freedesktop.org/~danvet/drm-intel into drm-next 2013-08-30 09:47:41 +10:00
drm_modes.c drm: Remove drm_mode_list_concat() 2013-08-21 12:47:24 +10:00
drm_pci.c drm: implement experimental render nodes 2013-08-30 08:43:57 +10:00
drm_platform.c drm: implement experimental render nodes 2013-08-30 08:43:57 +10:00
drm_prime.c drm/prime: double lock typo 2013-08-30 08:58:32 +10:00
drm_rect.c drm: Add drm_rect_debug_print() 2013-04-30 22:20:00 +02:00
drm_scatter.c drm: disallow legacy sg ioctls for modesetting drivers 2013-08-19 10:04:06 +10:00
drm_stub.c drm: implement experimental render nodes 2013-08-30 08:43:57 +10:00
drm_sysfs.c drm: Convert drm class driver from legacy pm ops to dev_pm_ops 2013-07-04 10:50:26 +10:00
drm_trace_points.c
drm_trace.h drm: fix print format of sequence in trace point 2013-07-04 10:55:27 +10:00
drm_usb.c drm: implement experimental render nodes 2013-08-30 08:43:57 +10:00
drm_vm.c drm: rip out drm_core_has_MTRR checks 2013-08-19 14:11:44 +10:00
drm_vma_manager.c drm/vma: add access management helpers 2013-08-27 11:54:54 +10:00
Kconfig Merge tag 'drm-intel-next-2013-08-23' of git://people.freedesktop.org/~danvet/drm-intel into drm-next 2013-08-30 09:47:41 +10:00
Makefile drm/msm: basic KMS driver for snapdragon 2013-08-24 14:57:07 -04:00
README.drm

************************************************************
* For the very latest on DRI development, please see:      *
*     http://dri.freedesktop.org/                          *
************************************************************

The Direct Rendering Manager (drm) is a device-independent kernel-level
device driver that provides support for the XFree86 Direct Rendering
Infrastructure (DRI).

The DRM supports the Direct Rendering Infrastructure (DRI) in four major
ways:

    1. The DRM provides synchronized access to the graphics hardware via
       the use of an optimized two-tiered lock.

    2. The DRM enforces the DRI security policy for access to the graphics
       hardware by only allowing authenticated X11 clients access to
       restricted regions of memory.

    3. The DRM provides a generic DMA engine, complete with multiple
       queues and the ability to detect the need for an OpenGL context
       switch.

    4. The DRM is extensible via the use of small device-specific modules
       that rely extensively on the API exported by the DRM module.


Documentation on the DRI is available from:
    http://dri.freedesktop.org/wiki/Documentation
    http://sourceforge.net/project/showfiles.php?group_id=387
    http://dri.sourceforge.net/doc/

For specific information about kernel-level support, see:

    The Direct Rendering Manager, Kernel Support for the Direct Rendering
    Infrastructure
    http://dri.sourceforge.net/doc/drm_low_level.html

    Hardware Locking for the Direct Rendering Infrastructure
    http://dri.sourceforge.net/doc/hardware_locking_low_level.html

    A Security Analysis of the Direct Rendering Infrastructure
    http://dri.sourceforge.net/doc/security_low_level.html