linux_dsm_epyc7002/drivers
Sagi Grimberg b9156daeb1 nvme: fix a possible deadlock when passthru commands sent to a multipath device
When the user issues a command with side effects, we will end up freezing
the namespace request queue when updating disk info (and the same for
the corresponding mpath disk node).

However, we are not freezing the mpath node request queue,
which means that mpath I/O can still come in and block on blk_queue_enter
(called from nvme_ns_head_make_request -> direct_make_request).

This is a deadlock, because blk_queue_enter will block until the inner
namespace request queue is unfroze, but that process is blocked because
the namespace revalidation is trying to update the mpath disk info
and freeze its request queue (which will never complete because
of the I/O that is blocked on blk_queue_enter).

Fix this by freezing all the subsystem nsheads request queues before
executing the passthru command. Given that these commands are infrequent
we should not worry about this temporary I/O freeze to keep things sane.

Here is the matching hang traces:
--
[ 374.465002] INFO: task systemd-udevd:17994 blocked for more than 122 seconds.
[ 374.472975] Not tainted 5.2.0-rc3-mpdebug+ #42
[ 374.478522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 374.487274] systemd-udevd D 0 17994 1 0x00000000
[ 374.493407] Call Trace:
[ 374.496145] __schedule+0x2ef/0x620
[ 374.500047] schedule+0x38/0xa0
[ 374.503569] blk_queue_enter+0x139/0x220
[ 374.507959] ? remove_wait_queue+0x60/0x60
[ 374.512540] direct_make_request+0x60/0x130
[ 374.517219] nvme_ns_head_make_request+0x11d/0x420 [nvme_core]
[ 374.523740] ? generic_make_request_checks+0x307/0x6f0
[ 374.529484] generic_make_request+0x10d/0x2e0
[ 374.534356] submit_bio+0x75/0x140
[ 374.538163] ? guard_bio_eod+0x32/0xe0
[ 374.542361] submit_bh_wbc+0x171/0x1b0
[ 374.546553] block_read_full_page+0x1ed/0x330
[ 374.551426] ? check_disk_change+0x70/0x70
[ 374.556008] ? scan_shadow_nodes+0x30/0x30
[ 374.560588] blkdev_readpage+0x18/0x20
[ 374.564783] do_read_cache_page+0x301/0x860
[ 374.569463] ? blkdev_writepages+0x10/0x10
[ 374.574037] ? prep_new_page+0x88/0x130
[ 374.578329] ? get_page_from_freelist+0xa2f/0x1280
[ 374.583688] ? __alloc_pages_nodemask+0x179/0x320
[ 374.588947] read_cache_page+0x12/0x20
[ 374.593142] read_dev_sector+0x2d/0xd0
[ 374.597337] read_lba+0x104/0x1f0
[ 374.601046] find_valid_gpt+0xfa/0x720
[ 374.605243] ? string_nocheck+0x58/0x70
[ 374.609534] ? find_valid_gpt+0x720/0x720
[ 374.614016] efi_partition+0x89/0x430
[ 374.618113] ? string+0x48/0x60
[ 374.621632] ? snprintf+0x49/0x70
[ 374.625339] ? find_valid_gpt+0x720/0x720
[ 374.629828] check_partition+0x116/0x210
[ 374.634214] rescan_partitions+0xb6/0x360
[ 374.638699] __blkdev_reread_part+0x64/0x70
[ 374.643377] blkdev_reread_part+0x23/0x40
[ 374.647860] blkdev_ioctl+0x48c/0x990
[ 374.651956] block_ioctl+0x41/0x50
[ 374.655766] do_vfs_ioctl+0xa7/0x600
[ 374.659766] ? locks_lock_inode_wait+0xb1/0x150
[ 374.664832] ksys_ioctl+0x67/0x90
[ 374.668539] __x64_sys_ioctl+0x1a/0x20
[ 374.672732] do_syscall_64+0x5a/0x1c0
[ 374.676828] entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 374.738474] INFO: task nvmeadm:49141 blocked for more than 123 seconds.
[ 374.745871] Not tainted 5.2.0-rc3-mpdebug+ #42
[ 374.751419] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 374.760170] nvmeadm D 0 49141 36333 0x00004080
[ 374.766301] Call Trace:
[ 374.769038] __schedule+0x2ef/0x620
[ 374.772939] schedule+0x38/0xa0
[ 374.776452] blk_mq_freeze_queue_wait+0x59/0x100
[ 374.781614] ? remove_wait_queue+0x60/0x60
[ 374.786192] blk_mq_freeze_queue+0x1a/0x20
[ 374.790773] nvme_update_disk_info.isra.57+0x5f/0x350 [nvme_core]
[ 374.797582] ? nvme_identify_ns.isra.50+0x71/0xc0 [nvme_core]
[ 374.804006] __nvme_revalidate_disk+0xe5/0x110 [nvme_core]
[ 374.810139] nvme_revalidate_disk+0xa6/0x120 [nvme_core]
[ 374.816078] ? nvme_submit_user_cmd+0x11e/0x320 [nvme_core]
[ 374.822299] nvme_user_cmd+0x264/0x370 [nvme_core]
[ 374.827661] nvme_dev_ioctl+0x112/0x1d0 [nvme_core]
[ 374.833114] do_vfs_ioctl+0xa7/0x600
[ 374.837117] ? __audit_syscall_entry+0xdd/0x130
[ 374.842184] ksys_ioctl+0x67/0x90
[ 374.845891] __x64_sys_ioctl+0x1a/0x20
[ 374.850082] do_syscall_64+0x5a/0x1c0
[ 374.854178] entry_SYSCALL_64_after_hwframe+0x44/0xa9
--

Reported-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Tested-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-07-31 17:59:01 -07:00
..
accessibility
acpi libnvdimm fixes v5.3-rc2 2019-07-27 08:25:51 -07:00
amba Driver Core and debugfs changes for 5.3-rc1 2019-07-12 12:24:03 -07:00
android binder: prevent transactions to context manager from its own process. 2019-07-24 11:02:28 +02:00
ata libata: zpodd: Fix small read overflow in zpodd_get_mech_type() 2019-07-29 16:00:14 -06:00
atm atm: idt77252: Remove call to memset after dma_alloc_coherent 2019-07-15 11:06:27 -07:00
auxdisplay It's been a relatively busy cycle for docs: 2019-07-09 12:34:26 -07:00
base Char/Misc driver fixes for 5.3-rc2 2019-07-28 10:26:10 -07:00
bcma
block ataflop: Mark expected switch fall-through 2019-07-29 15:24:58 -06:00
bluetooth Bluetooth: btusb: Add protocol support for MediaTek MT7663U USB devices 2019-07-06 21:44:25 +02:00
bus ARM: SoC-related driver updates 2019-07-19 17:13:56 -07:00
cdrom
char hpet: Fix division by zero in hpet_time_div() 2019-07-25 14:39:51 +02:00
clk This round of clk driver and framework updates is heavy on the driver update 2019-07-17 10:07:48 -07:00
clocksource clocksource/drivers/npcm: Fix misuse of GENMASK macro 2019-07-10 11:05:26 +02:00
connector connector: remove redundant input callback from cn_dev 2019-07-21 13:31:14 -07:00
counter Staging / IIO driver update for 5.3-rc1 2019-07-11 15:36:02 -07:00
cpufreq cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init() 2019-07-23 09:49:10 +02:00
cpuidle Merge branch 'pm-cpufreq' 2019-07-18 09:49:30 +02:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2019-07-19 12:23:37 -07:00
dax Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
dca
devfreq
dio
dma dmaengine updates for v5.3-rc1 2019-07-17 09:55:43 -07:00
dma-buf Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
edac
eisa
extcon
firewire firewire: mark expected switch fall-throughs 2019-07-25 20:09:37 -05:00
firmware Merge branch 'for-linus-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft 2019-07-26 09:43:43 -07:00
fpga fpga-manager: altera-ps-spi: Fix build error 2019-07-24 11:29:41 +02:00
fsi
gnss
gpio GPIO fixes for the v5.3 merge window: 2019-07-17 13:05:21 -07:00
gpu Wimplicit-fallthrough patches for 5.3-rc2 2019-07-27 11:04:18 -07:00
hid Linux 5.2 2019-07-15 09:42:32 -07:00
hsi
hv proc/sysctl: add shared variables for range check 2019-07-18 17:08:07 -07:00
hwmon hwmon: (nct6775) Fix register address and added missed tolerance for nct6106 2019-07-21 19:18:45 -07:00
hwspinlock
hwtracing coresight: Make the coresight_device_fwnode_match declaration's fwnode parameter const 2019-07-12 14:42:05 -07:00
i2c docs: fix broken doc references due to renames 2019-07-17 06:57:51 -03:00
i3c * Drop support for 10-bit I2C addresses 2019-07-09 09:04:31 -07:00
ide It's been a relatively busy cycle for docs: 2019-07-09 12:34:26 -07:00
idle
iio Driver Core and debugfs changes for 5.3-rc1 2019-07-12 12:24:03 -07:00
infiniband SCSI fixes on 20190720 2019-07-20 10:04:58 -07:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2019-07-20 12:22:30 -07:00
interconnect
iommu iommu/amd: Add support for X2APIC IOMMU interrupts 2019-07-23 17:41:52 +02:00
ipack TTY / Serial driver updates for 5.3-rc1 2019-07-11 15:38:21 -07:00
irqchip irqchip/gic-v3-its: Fix misuse of GENMASK macro 2019-07-10 11:04:17 +02:00
isdn ISDN: hfcsusb: checking idx of ep configuration 2019-07-15 11:10:31 -07:00
leds LED updates for 5.3-rc1 2019-07-09 08:59:39 -07:00
lightnvm
macintosh powerpc updates for 5.3 2019-07-13 16:08:36 -07:00
mailbox - stm32: race fix by adding a spinlock 2019-07-14 16:36:51 -07:00
mcb
md for-linus-20190726 2019-07-26 10:32:12 -07:00
media media updates for v5.3-rc1 2019-07-22 09:01:47 -07:00
memory Kbuild updates for v5.3 (2nd) 2019-07-20 09:34:55 -07:00
memstick MMC core: 2019-07-11 18:11:21 -07:00
message SCSI misc on 20190709 2019-07-11 15:14:01 -07:00
mfd - Core Frameworks 2019-07-15 20:18:40 -07:00
misc eeprom: make older eeprom drivers select NVMEM_SYSFS 2019-07-25 14:39:51 +02:00
mmc MMC core: 2019-07-11 18:11:21 -07:00
mtd mtd: onenand_base: Mark expected switch fall-through 2019-07-25 20:09:54 -05:00
mux
net Wimplicit-fallthrough patches for 5.3-rc2 2019-07-27 11:04:18 -07:00
nfc nfc: st-nci: remove redundant assignment to variable r 2019-07-02 12:00:50 -07:00
ntb New feature to add support for NTB virtual MSI interrupts, the ability 2019-07-21 09:46:59 -07:00
nubus
nvdimm libnvdimm fixes v5.3-rc2 2019-07-27 08:25:51 -07:00
nvme nvme: fix a possible deadlock when passthru commands sent to a multipath device 2019-07-31 17:59:01 -07:00
nvmem Driver Core and debugfs changes for 5.3-rc1 2019-07-12 12:24:03 -07:00
of virtio, vhost: fixes, features, performance 2019-07-17 11:26:09 -07:00
opp pci-v5.3-changes 2019-07-15 20:44:49 -07:00
oprofile vfs: Convert oprofilefs to use the new mount API 2019-07-04 22:01:59 -04:00
parisc
parport It's been a relatively busy cycle for docs: 2019-07-09 12:34:26 -07:00
pci New feature to add support for NTB virtual MSI interrupts, the ability 2019-07-21 09:46:59 -07:00
pcmcia It's been a relatively busy cycle for docs: 2019-07-09 12:34:26 -07:00
perf docs: perf: move to the admin-guide 2019-07-15 09:20:27 -03:00
phy phy: for 5.3 2019-07-01 15:04:59 +02:00
pinctrl This is the bulk of pin control changes for the v5.3 kernel 2019-07-13 15:02:27 -07:00
platform platform/x86: asus: Rename "fan mode" to "fan boost mode" 2019-07-17 19:07:58 +03:00
pnp docs: driver-api: add a series of orphaned documents 2019-07-15 11:03:02 -03:00
power power supply and reset changes for the v5.3 series 2019-07-15 21:06:15 -07:00
powercap powercap: Invoke powercap_init() and rapl_init() earlier 2019-07-22 11:23:00 +02:00
pps drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl 2019-07-16 19:23:24 -07:00
ps3
ptp
pwm pwm: Changes for v5.3-rc1 2019-07-09 08:57:45 -07:00
rapidio Merge branch 'akpm' (patches from Andrew) 2019-07-17 08:58:04 -07:00
ras
regulator - Core Frameworks 2019-07-15 20:18:40 -07:00
remoteproc remoteproc updates for v5.3 2019-07-17 11:44:41 -07:00
reset ARM: SoC-related driver updates 2019-07-19 17:13:56 -07:00
rpmsg
rtc RTC for 5.3 2019-07-17 10:03:50 -07:00
s390 virtio/s390: fix race on airq_areas[] 2019-07-26 13:36:18 +02:00
sbus
scsi SCSI fixes on 20190726 2019-07-26 14:20:28 -07:00
sfi
sh
siox
slimbus
sn
soc Merge branch 'pdf_fixes_v1' of https://git.linuxtv.org/mchehab/experimental into mauro 2019-07-22 13:51:20 -06:00
soundwire soundwire updates for v5.3-rc1 2019-07-05 08:15:08 +02:00
spi Driver Core and debugfs changes for 5.3-rc1 2019-07-12 12:24:03 -07:00
spmi
ssb
staging docs conversion for v5.3-rc1 2019-07-16 12:21:41 -07:00
target scsi: target: cxgbit: add support for IEEE_8021QAZ_APP_SEL_STREAM selector 2019-07-22 17:04:20 -04:00
tc
tee
thermal int340X/processor_thermal_device: Fix proc_thermal_rapl_remove() 2019-07-23 09:36:07 +02:00
thunderbolt Driver Core and debugfs changes for 5.3-rc1 2019-07-12 12:24:03 -07:00
tty TTY fixes for 5.3-rc2 2019-07-28 10:18:33 -07:00
uio
usb xhci: Fix crash if scatter gather is used with Immediate Data Transfer (IDT). 2019-07-25 11:26:42 +02:00
uwb
vfio VFIO updates for v5.3-rc1 2019-07-17 11:23:13 -07:00
vhost virtio, vhost: fixes, features, performance 2019-07-17 11:26:09 -07:00
video - New Functionality 2019-07-16 09:25:04 -07:00
virt
virtio Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
visorbus
vlynq
vme
w1 docs: driver-api: add a series of orphaned documents 2019-07-15 11:03:02 -03:00
watchdog watchdog: digicolor_wdt: Remove unused variable in dc_wdt_probe 2019-07-15 08:49:11 +02:00
xen xen: fixes and features for 5.3-rc1 2019-07-19 11:41:26 -07:00
zorro
Kconfig
Makefile