linux_dsm_epyc7002/drivers
Harry Ciao d519c8d9cc edac: fix edac core deadlock when removing a device
When deleting an edac device, we have to wait for its edac_dev.work to be
completed before deleting the whole edac_dev structure.  Since we have no
idea which work in current edac_poller's workqueue is the work we are
conerned about, we wait for all work in the edac_poller's workqueue to be
proceseed.  This is done via flush_cpu_workqueue() which inserts a
wq_barrier into the tail of the workqueue and then sleeping on the
completion of this wq_barrier.  The edac_poller will wake up sleepers when
it is found.

EDAC core creates only one kernel worker thread, edac_poller, to run the
works of all current edac devices.  They share the same callback function
of edac_device_workq_function(), which would grab the mutex of
device_ctls_mutex first before it checks the device.  This is exactly
where edac_poller and rmmod would have a great chance to deadlock.

In below call trace of rmmod > ... >
edac_device_del_device >
edac_device_workq_teardown > flush_workqueue > flush_cpu_workqueue,

device_ctls_mutex would have already been grabbed by
edac_device_del_device().  So, on one hand rmmod would sleep on the
completion of a wq_barrier, holding device_ctls_mutex; on the other hand
edac_poller would be blocked on the same mutex when it's running any one
of works of existing edac evices(Note, this edac_dev.work is likely to be
totally irrelevant to the one that is being removed right now)and never
would have a chance to run the work of above wq_barrier to wake rmmod up.

edac_device_workq_teardown() should not be called within the critical
region of device_ctls_mutex.  Just like is done in edac_pci_del_device()
and edac_mc_del_mc(), where edac_pci_workq_teardown() and
edac_mc_workq_teardown() are called after related mutex are released.

Moreover, an edac_dev.work should check first if it is being removed.  If
this is the case, then it should bail out immediately.  Since not all of
existing edac devices are to be removed, this "shutting flag" should be
contained to edac device being removed.  The current edac_dev.op_state can
be used to serve this purpose.

The original deadlock problem and the solution have been witnessed and
tested on actual hardware.  Without the solution, rmmod an edac driver
would result in below deadlock:

root@localhost:/root> rmmod mv64x60_edac
EDAC DEBUG: mv64x60_dma_err_remove()
EDAC DEBUG: edac_device_del_device()
EDAC DEBUG: find_edac_device_by_dev()

(hang for a moment)

INFO: task edac-poller:2030 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
edac-poller   D 00000000     0  2030      2
Call Trace:
[df159dc0] [c0071e3c] free_hot_cold_page+0x17c/0x304 (unreliable)
[df159e80] [c000a024] __switch_to+0x6c/0xa0
[df159ea0] [c03587d8] schedule+0x2f4/0x4d8
[df159f00] [c03598a8] __mutex_lock_slowpath+0xa0/0x174
[df159f40] [e1030434] edac_device_workq_function+0x28/0xd8 [edac_core]
[df159f60] [c003beb4] run_workqueue+0x114/0x218
[df159f90] [c003c674] worker_thread+0x5c/0xc8
[df159fd0] [c004106c] kthread+0x5c/0xa0
[df159ff0] [c0013538] original_kernel_thread+0x44/0x60
INFO: task rmmod:2062 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rmmod         D 0ff2c9fc     0  2062   1839
Call Trace:
[df119c00] [c0437a74] 0xc0437a74 (unreliable)
[df119cc0] [c000a024] __switch_to+0x6c/0xa0
[df119ce0] [c03587d8] schedule+0x2f4/0x4d8
[df119d40] [c03591dc] schedule_timeout+0xb0/0xf4

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-23 15:58:21 -08:00
..
accessibility
acpi ACPI: fix 2.6.28 acpi.debug_level regression 2008-12-19 04:38:32 -05:00
amba
ata pata_hpt366: no ATAPI DMA 2008-12-16 05:40:34 -05:00
atm ATM: horizon, fix hrz_probe fail path 2008-11-29 20:42:28 -08:00
auxdisplay
base sysfs: Fix return values for sysdev_store_{ulong,int} 2008-10-29 15:03:49 -07:00
block cciss: fix problem that deleting multiple logical drives could cause a panic 2008-12-19 08:14:07 +01:00
bluetooth bpa10x: free sk_buff with kfree_skb 2008-10-31 00:40:19 -07:00
cdrom Commands needing to be retried require a complete re-initialization. 2008-12-12 16:04:26 +01:00
char xilinx_hwicap: remove improper wording in license statement 2008-12-17 11:23:07 -08:00
clocksource Merge branches 'timers/clocksource', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/debug' into v28-timers-for-linus 2008-10-20 13:14:06 +02:00
connector
cpufreq
cpuidle regression: disable timer peek-ahead for 2.6.28 2008-11-09 16:28:42 -08:00
crypto fix talitos 2008-11-30 10:03:36 -08:00
dca [4/4] dca: fixup initialization dependency 2008-11-10 15:01:03 -08:00
dio
dma async_xor: dma_map destination DMA_BIDIRECTIONAL 2008-12-08 13:46:00 -07:00
edac edac: fix edac core deadlock when removing a device 2008-12-23 15:58:21 -08:00
eisa
firewire firewire: fw-ohci: fix IOMMU resource exhaustion 2008-12-10 12:45:34 +01:00
firmware trivial: dmi_scan typo 2008-11-07 08:25:43 -08:00
gpio gpiolib: extend gpio label column width in debugfs file 2008-11-19 18:49:57 -08:00
gpu drm/i915: GEM on PAE has problems - disable it for now. 2008-12-19 15:38:34 +10:00
hid HID: Apple ALU wireless keyboards are bluetooth devices 2008-11-28 15:09:26 +01:00
hwmon hwmon: applesmc: make applesmc load automatically on startup 2008-12-01 19:55:24 -08:00
i2c i2c-s3c2410: fix check for being in suspend. 2008-12-16 20:19:53 +00:00
ide drivers/ide/{cs5530.c,sc1200.c}: Move a dereference below a NULL test 2008-12-22 23:05:06 +01:00
idle i7300_idle: Kconfig, show menu only on x86_64 2008-10-28 00:14:47 -04:00
ieee1394 ieee1394: add quirk fix for Freecom HDD 2008-12-14 01:13:13 +01:00
infiniband Merge branches 'ehca' and 'mlx4' into for-linus 2008-12-01 10:11:50 -08:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2008-11-30 11:05:21 -08:00
isdn hysdn: fix writing outside the field on 64 bits 2008-12-03 21:01:28 -08:00
leds remove unused #include <version.h>'s 2008-11-01 09:50:12 -07:00
lguest
macintosh rackmeter section fixes 2008-11-30 10:03:37 -08:00
mca
md md: Don't read past end of bitmap when reading bitmap. 2008-12-19 16:25:01 +11:00
media em28xx: remove backward compat macro added on a previous fix 2008-12-01 18:04:14 -02:00
memstick [PATCH] switch memstick 2008-10-21 07:48:33 -04:00
message Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6 2008-12-19 11:37:23 -08:00
mfd mfd: Correct WM8350 I2C return code usage 2008-11-16 19:58:47 +01:00
misc [IA64] Fix GRU compile error w/o CONFIG_HUGETLB_PAGE 2008-12-09 10:06:43 -08:00
mmc mmc: struct device - replace bus_id with dev_name(), dev_set_name() 2008-11-08 21:37:46 +01:00
mtd Merge git://git.infradead.org/mtd-2.6 2008-12-09 08:28:36 -08:00
net ppp: fix segfaults introduced by netdev_priv changes 2008-12-18 19:41:42 -08:00
nubus
of OF-device: Don't overwrite numa_node in device registration 2008-10-31 16:12:01 +11:00
oprofile oprofile: fix memory ordering 2008-10-27 19:15:41 +01:00
parisc [PATCH] introduce fmode_t, do annotations 2008-10-21 07:47:06 -04:00
parport parport_serial: fix array overflow 2008-12-01 19:55:24 -08:00
pci PCI hotplug: ibmphp: Fix module ref count underflow 2008-12-17 16:07:47 -08:00
pcmcia pcmcia: blackfin: fix bug - add missing ; to MODULE macro 2008-12-15 16:27:06 -08:00
pnp drivers: remove duplicated #include 2008-11-04 08:18:19 -08:00
power Merge git://git.infradead.org/battery-2.6 2008-10-20 09:44:30 -07:00
ps3 powerpc/ps3: Fix compile error in ps3-lpm.c 2008-11-05 19:59:08 +11:00
rapidio rapidio section noise 2008-11-30 10:03:37 -08:00
regulator regulator: Use menuconfig in Kconfig 2008-11-09 14:49:23 +00:00
rtc rtc: rtc-isl1208: reject invalid dates 2008-12-23 15:58:21 -08:00
s390 [SCSI] zfcp: prevent double decrement on host_busy while being busy 2008-12-01 10:18:20 -06:00
sbus Revert "of_platform_driver noise on sparce" 2008-12-01 07:55:14 -08:00
scsi Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6 2008-12-19 11:37:23 -08:00
serial Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 2008-12-10 10:04:25 -08:00
sh sh: maple: Do not pass SLAB_POISON to kmem_cache_create() 2008-12-16 16:40:32 +09:00
sn
spi spi: fix spi_s3c24xx_gpio num_chipselect 2008-12-01 19:55:24 -08:00
ssb SSB: hide empty sub menu 2008-11-10 13:50:17 -08:00
staging STAGING: Move staging drivers back to staging-specific menu 2008-12-17 11:23:07 -08:00
tc
telephony telephony: trivial: fix up email address 2008-11-11 09:30:23 -08:00
thermal
uio saner FASYNC handling on file close 2008-11-01 09:49:46 -07:00
usb USB: pl2303: add id for Hewlett-Packard LD220-HP POS pole display 2008-12-17 10:49:15 -08:00
uwb uwb: wrong sizeof argument in mac address compare 2008-10-20 14:37:53 +01:00
video Revert "radeonfb: accelerate imageblit and other improvements" 2008-12-10 16:53:32 -08:00
virtio
w1 w1: fix slave selection on big-endian systems 2008-12-23 15:58:21 -08:00
watchdog iTCO_wdt: fix typo when setting TCO_EN bit 2008-12-03 16:20:19 -08:00
xen xen: fix scrub_page() 2008-11-17 19:11:26 +01:00
zorro
Kconfig regulator: Build on non-ARM platforms 2008-10-28 21:47:17 +00:00
Makefile Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb 2008-10-26 16:35:46 -07:00