linux_dsm_epyc7002/drivers
Shaohua Li ed3b98c71c MD: add rdev reference for super write
Xiao Ni reported the crash below:
[26396.335146] BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
[26396.342990] IP: [<ffffffffa0425b00>] super_written+0x20/0x80 [md_mod]
[26396.349449] PGD 0
[26396.351468] Oops: 0002 [#1] SMP
[26396.354898] Modules linked in: ext4 mbcache jbd2 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_td
[26396.408404] CPU: 5 PID: 3261 Comm: loop0 Not tainted 4.5.0 #1
[26396.414140] Hardware name: Dell Inc. PowerEdge R715/0G2DP3, BIOS 3.2.2 09/15/2014
[26396.421608] task: ffff8808339be680 ti: ffff8808365f4000 task.ti: ffff8808365f4000
[26396.429074] RIP: 0010:[<ffffffffa0425b00>]  [<ffffffffa0425b00>] super_written+0x20/0x80 [md_mod]
[26396.437952] RSP: 0018:ffff8808365f7c38  EFLAGS: 00010046
[26396.443252] RAX: ffffffffa0425ae0 RBX: ffff8804336a7900 RCX: ffffe8f9f7b41198
[26396.450371] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8804336a7900
[26396.457489] RBP: ffff8808365f7c50 R08: 0000000000000005 R09: 00001801e02ce3d7
[26396.464608] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[26396.471728] R13: ffff8808338d9a00 R14: 0000000000000000 R15: ffff880833f9fe00
[26396.478849] FS:  00007f9e5066d740(0000) GS:ffff880237b40000(0000) knlGS:0000000000000000
[26396.486922] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[26396.492656] CR2: 00000000000002a8 CR3: 00000000019ea000 CR4: 00000000000006e0
[26396.499775] Stack:
[26396.501781]  ffff8804336a7900 0000000000000000 0000000000000000 ffff8808365f7c68
[26396.509199]  ffffffff81308cd0 ffff8804336a7900 ffff8808365f7ca8 ffffffff81310637
[26396.516618]  00000000a0233a00 ffff880833f9fe00 0000000000000000 ffff880833fb0000
[26396.524038] Call Trace:
[26396.526485]  [<ffffffff81308cd0>] bio_endio+0x40/0x60
[26396.531529]  [<ffffffff81310637>] blk_update_request+0x87/0x320
[26396.537439]  [<ffffffff8131a20a>] blk_mq_end_request+0x1a/0x70
[26396.543261]  [<ffffffff81313889>] blk_flush_complete_seq+0xd9/0x2a0
[26396.549517]  [<ffffffff81313ccf>] flush_end_io+0x15f/0x240
[26396.554993]  [<ffffffff8131a22a>] blk_mq_end_request+0x3a/0x70
[26396.560815]  [<ffffffff8131a314>] __blk_mq_complete_request+0xb4/0xe0
[26396.567246]  [<ffffffff8131a35c>] blk_mq_complete_request+0x1c/0x20
[26396.573506]  [<ffffffffa04182df>] loop_queue_work+0x6f/0x72c [loop]
[26396.579764]  [<ffffffff81697844>] ? __schedule+0x2b4/0x8f0
[26396.585242]  [<ffffffff810a7812>] kthread_worker_fn+0x52/0x170
[26396.591065]  [<ffffffff810a77c0>] ? kthread_create_on_node+0x1a0/0x1a0
[26396.597582]  [<ffffffff810a7238>] kthread+0xd8/0xf0
[26396.602453]  [<ffffffff810a7160>] ? kthread_park+0x60/0x60
[26396.607929]  [<ffffffff8169bdcf>] ret_from_fork+0x3f/0x70
[26396.613319]  [<ffffffff810a7160>] ? kthread_park+0x60/0x60

md_super_write() and the corresponding md_super_wait() are generally called
with reconfig_mutex held, which prevents the disk from disappearing. There is
one case where this rule is broken: write_sb_page() in bitmap.c doesn't hold
the mutex. next_active_rdev() does increase the rdev reference, but it drops
the reference too early (e.g., before the IO finishes), so the disk can
disappear in that window. We unconditionally increase the rdev reference in
md_super_write() to avoid the race.
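
The pattern described above (take a reference at submission, drop it only in
the completion handler) can be sketched in user-space C. This is a minimal
single-threaded model, not the kernel code: struct disk, struct write_req,
super_write_submit(), super_write_done() and disk_try_remove() are hypothetical
names invented for illustration, and the sketch assumes removal is refused
while any reference is outstanding.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an md member disk (not the kernel's rdev). */
struct disk {
    atomic_int nr_pending;  /* number of in-flight writes pinning the disk */
    bool removed;           /* set once the disk has been torn down */
};

/* Hypothetical stand-in for a bio: carries the disk to the completion side. */
struct write_req {
    struct disk *disk;
};

/* Submission side: pin the disk before the write goes in flight. */
static void super_write_submit(struct write_req *req, struct disk *d)
{
    atomic_fetch_add(&d->nr_pending, 1);
    req->disk = d;
}

/* Completion side: drop the reference only after the I/O has finished. */
static void super_write_done(struct write_req *req)
{
    int old = atomic_fetch_sub(&req->disk->nr_pending, 1);

    assert(old > 0);  /* every completion must pair with a submission */
    req->disk = NULL;
}

/*
 * Removal path: refuse while any write still holds a reference.
 * Simplification: a real implementation must also serialize this check
 * against concurrent submissions.
 */
static bool disk_try_remove(struct disk *d)
{
    if (atomic_load(&d->nr_pending) != 0)
        return false;
    d->removed = true;
    return true;
}
```

In this model, a removal attempted between super_write_submit() and
super_write_done() is refused, so the completion handler never sees a disk
that has already been torn down.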

Reported-and-tested-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-03-31 10:04:18 -07:00
accessibility
acpi Power management and ACPI material for v4.6-rc1, part 2 2016-03-24 22:59:58 -07:00
amba
android
ata Merge branch 'for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2016-03-18 20:06:46 -07:00
atm Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2016-03-17 21:38:27 -07:00
auxdisplay
base Power management and ACPI material for v4.6-rc1, part 2 2016-03-24 22:59:58 -07:00
bcma
block Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2016-03-26 15:53:16 -07:00
bluetooth
bus arm[64] perf updates for 4.6: 2016-03-21 13:14:16 -07:00
cdrom
char Revert "ppdev: use new parport device model" 2016-03-25 09:02:13 -07:00
clk The clk changes for this release cycle are mostly dominated by 2016-03-23 06:06:45 -07:00
clocksource Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-03-24 10:32:42 -07:00
connector
cpufreq Power management and ACPI material for v4.6-rc1, part 2 2016-03-24 22:59:58 -07:00
cpuidle cpuidle: menu: Fall back to polling if next timer event is near 2016-03-21 15:50:28 +01:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2016-03-23 06:12:39 -07:00
dca
devfreq PM / devfreq: Spelling s/frequnecy/frequency/ 2016-03-17 02:30:16 +01:00
dio
dma asm-generic changes for 4.6 2016-03-24 23:13:48 -07:00
dma-buf dma-buf: Update docs for SYNC ioctl 2016-03-21 09:26:45 +01:00
edac EDAC queue for 4.6 2016-03-16 08:36:55 -07:00
eisa
extcon
firewire IEEE 1394 subsystem patch: 2016-03-25 08:52:25 -07:00
firmware kernel: add kcov code coverage 2016-03-22 15:36:02 -07:00
fmc
fpga
gpio gpio: xgene: Prevent NULL pointer dereference 2016-03-30 10:39:39 +02:00
gpu Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2016-03-25 08:48:31 -07:00
hid drivers/hid/uhid.c: check write() bitness using in_compat_syscall 2016-03-22 15:36:02 -07:00
hsi
hv Char/Misc patches for 4.6-rc1 2016-03-17 13:47:50 -07:00
hwmon hwmon: (max1111) Return -ENODEV from max1111_read_channel if not instantiated 2016-03-27 10:37:48 -07:00
hwspinlock
hwtracing
i2c Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2016-03-22 12:47:40 -07:00
ide ide: palm_bk3710: test clock rate to avoid division by 0 2016-03-20 16:59:27 -04:00
idle intel_idle: Support for Intel Xeon Phi Processor x200 Product Family 2016-03-23 16:19:38 -04:00
iio - New Drivers 2016-03-18 10:15:11 -07:00
infiniband Merge branch 'for-next-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2016-03-23 15:57:39 -07:00
input Merge branch 'akpm' (patches from Andrew) 2016-03-25 16:59:11 -07:00
iommu IOMMU Updates for Linux v4.6 2016-03-22 11:57:43 -07:00
ipack
irqchip irqchip/mbigen: Make CONFIG_HISILICON_IRQ_MBIGEN a hidden option 2016-03-23 12:02:29 +01:00
isdn isdn: Use ktime_t instead of 'struct timeval' 2016-03-20 16:47:13 -04:00
leds platform-drivers-x86 for 4.6-1 2016-03-23 17:20:59 -07:00
lguest
lightnvm lightnvm: do not load L2P table if not supported 2016-03-18 18:10:38 -07:00
macintosh
mailbox Merge branch 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration 2016-03-23 06:09:15 -07:00
mcb
md MD: add rdev reference for super write 2016-03-31 10:04:18 -07:00
media [media] vsp1: use proper dma alloc/free functions 2016-03-21 13:49:01 -07:00
memory MTD updates for v4.6 2016-03-24 19:57:15 -07:00
memstick drivers/memstick/host/r592.c: avoid gcc-6 warning 2016-03-25 16:37:42 -07:00
message
mfd - New Drivers 2016-03-18 10:15:11 -07:00
misc Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-03-20 19:08:56 -07:00
mmc MMC core: 2016-03-21 14:35:52 -07:00
mtd MTD updates for v4.6 2016-03-24 19:57:15 -07:00
net asm-generic changes for 4.6 2016-03-24 23:13:48 -07:00
nfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
ntb NTB: Remove _addr functions from ntb_hw_amd 2016-03-26 11:44:33 -04:00
nubus
nvdimm x86, pmem: use memcpy_mcsafe() for memcpy_from_pmem() 2016-03-28 17:19:31 -07:00
nvme nvme: avoid cqe corruption when update at the same time as read 2016-03-22 10:27:29 -06:00
nvmem
of DeviceTree updates for 4.6: 2016-03-19 15:15:07 -07:00
oprofile
parisc PCI changes for the v4.6 merge window: 2016-03-16 14:45:55 -07:00
parport
pci Revert "PCI: dra7xx: Mark driver as broken" 2016-03-22 07:50:11 -05:00
pcmcia
perf drivers/perf: arm_pmu: avoid NULL dereference when not using devicetree 2016-03-21 11:36:17 +00:00
phy
pinctrl Merge branch 'akpm' (patches from Andrew) 2016-03-18 19:26:54 -07:00
platform platform-drivers-x86 for 4.6-1 2016-03-23 17:20:59 -07:00
pnp
power Power management and ACPI material for v4.6-rc1, part 2 2016-03-25 16:55:37 -07:00
powercap
pps
ps3
ptp Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-03-15 12:13:56 -07:00
pwm pwm: omap-dmtimer: Add debug message for effective period and duty cycle 2016-03-23 17:11:48 +01:00
rapidio rapidio: add mport char device driver 2016-03-22 15:36:02 -07:00
ras
regulator - New Drivers 2016-03-18 10:15:11 -07:00
remoteproc
reset
rpmsg
rtc RTC for 4.6 #2 2016-03-24 22:49:08 -07:00
s390 virtio/vhost: new features, performance improvements, cleanups 2016-03-20 13:28:18 -07:00
sbus
scsi SCSI misc on 20160326 2016-03-26 11:31:01 -07:00
sfi
sh
sn
soc ARM: SoC driver updates for v4.6 2016-03-20 15:40:32 -07:00
spi dmaengine updates for 4.6 2016-03-17 12:34:54 -07:00
spmi
ssb
staging Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2016-03-25 08:48:31 -07:00
target Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2016-03-22 12:41:14 -07:00
tc
thermal Thermal: Ignore invalid trip points 2016-03-18 14:10:57 +08:00
thunderbolt
tty xen: features and fixes for 4.6-rc0 2016-03-22 12:55:17 -07:00
uio
usb The clk changes for this release cycle are mostly dominated by 2016-03-23 06:06:45 -07:00
uwb
vfio VFIO updates for v4.6-rc1 2016-03-17 13:05:09 -07:00
vhost Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2016-03-22 12:41:14 -07:00
video The clk changes for this release cycle are mostly dominated by 2016-03-23 06:06:45 -07:00
virt
virtio virtio/vhost: new features, performance improvements, cleanups 2016-03-20 13:28:18 -07:00
vlynq
vme
w1
watchdog hpwdt: use nmi_panic() when kernel panics in NMI handler 2016-03-22 15:36:02 -07:00
xen xen: features and fixes for 4.6-rc0 2016-03-22 12:55:17 -07:00
zorro
Kconfig
Makefile