linux_dsm_epyc7002/drivers/block
Tetsuo Handa 1e047eaab3 block/loop: fix deadlock after loop_set_status
syzbot is reporting deadlocks at __blkdev_get() [1].

----------------------------------------
[   92.493919] systemd-udevd   D12696   525      1 0x00000000
[   92.495891] Call Trace:
[   92.501560]  schedule+0x23/0x80
[   92.502923]  schedule_preempt_disabled+0x5/0x10
[   92.504645]  __mutex_lock+0x416/0x9e0
[   92.510760]  __blkdev_get+0x73/0x4f0
[   92.512220]  blkdev_get+0x12e/0x390
[   92.518151]  do_dentry_open+0x1c3/0x2f0
[   92.519815]  path_openat+0x5d9/0xdc0
[   92.521437]  do_filp_open+0x7d/0xf0
[   92.527365]  do_sys_open+0x1b8/0x250
[   92.528831]  do_syscall_64+0x6e/0x270
[   92.530341]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

[   92.931922] 1 lock held by systemd-udevd/525:
[   92.933642]  #0: 00000000a2849e25 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x73/0x4f0
----------------------------------------

The reason of deadlock turned out that wait_event_interruptible() in
blk_queue_enter() got stuck with bdev->bd_mutex held at __blkdev_put()
due to q->mq_freeze_depth == 1.

----------------------------------------
[   92.787172] a.out           S12584   634    633 0x80000002
[   92.789120] Call Trace:
[   92.796693]  schedule+0x23/0x80
[   92.797994]  blk_queue_enter+0x3cb/0x540
[   92.803272]  generic_make_request+0xf0/0x3d0
[   92.807970]  submit_bio+0x67/0x130
[   92.810928]  submit_bh_wbc+0x15e/0x190
[   92.812461]  __block_write_full_page+0x218/0x460
[   92.815792]  __writepage+0x11/0x50
[   92.817209]  write_cache_pages+0x1ae/0x3d0
[   92.825585]  generic_writepages+0x5a/0x90
[   92.831865]  do_writepages+0x43/0xd0
[   92.836972]  __filemap_fdatawrite_range+0xc1/0x100
[   92.838788]  filemap_write_and_wait+0x24/0x70
[   92.840491]  __blkdev_put+0x69/0x1e0
[   92.841949]  blkdev_close+0x16/0x20
[   92.843418]  __fput+0xda/0x1f0
[   92.844740]  task_work_run+0x87/0xb0
[   92.846215]  do_exit+0x2f5/0xba0
[   92.850528]  do_group_exit+0x34/0xb0
[   92.852018]  SyS_exit_group+0xb/0x10
[   92.853449]  do_syscall_64+0x6e/0x270
[   92.854944]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

[   92.943530] 1 lock held by a.out/634:
[   92.945105]  #0: 00000000a2849e25 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x3c/0x1e0
----------------------------------------

The reason of q->mq_freeze_depth == 1 turned out that loop_set_status()
forgot to call blk_mq_unfreeze_queue() at error paths for
info->lo_encrypt_type != NULL case.

----------------------------------------
[   37.509497] CPU: 2 PID: 634 Comm: a.out Tainted: G        W        4.16.0+ #457
[   37.513608] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[   37.518832] RIP: 0010:blk_freeze_queue_start+0x17/0x40
[   37.521778] RSP: 0018:ffffb0c2013e7c60 EFLAGS: 00010246
[   37.524078] RAX: 0000000000000000 RBX: ffff8b07b1519798 RCX: 0000000000000000
[   37.527015] RDX: 0000000000000002 RSI: ffffb0c2013e7cc0 RDI: ffff8b07b1519798
[   37.529934] RBP: ffffb0c2013e7cc0 R08: 0000000000000008 R09: 47a189966239b898
[   37.532684] R10: dad78b99b278552f R11: 9332dca72259d5ef R12: ffff8b07acd73678
[   37.535452] R13: 0000000000004c04 R14: 0000000000000000 R15: ffff8b07b841e940
[   37.538186] FS:  00007fede33b9740(0000) GS:ffff8b07b8e80000(0000) knlGS:0000000000000000
[   37.541168] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   37.543590] CR2: 00000000206fdf18 CR3: 0000000130b30006 CR4: 00000000000606e0
[   37.546410] Call Trace:
[   37.547902]  blk_freeze_queue+0x9/0x30
[   37.549968]  loop_set_status+0x67/0x3c0 [loop]
[   37.549975]  loop_set_status64+0x3b/0x70 [loop]
[   37.549986]  lo_ioctl+0x223/0x810 [loop]
[   37.549995]  blkdev_ioctl+0x572/0x980
[   37.550003]  block_ioctl+0x34/0x40
[   37.550006]  do_vfs_ioctl+0xa7/0x6d0
[   37.550017]  ksys_ioctl+0x6b/0x80
[   37.573076]  SyS_ioctl+0x5/0x10
[   37.574831]  do_syscall_64+0x6e/0x270
[   37.576769]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
----------------------------------------

[1] https://syzkaller.appspot.com/bug?id=cd662bc3f6022c0979d01a262c318fab2ee9b56f

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <bot+48594378e9851eab70bcd6f99327c7db58c5a28a@syzkaller.appspotmail.com>
Fixes: ecdd09597a ("block/loop: fix race between I/O and set_status")
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: stable <stable@vger.kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-10 08:38:46 -06:00
..
aoe aoe: use ktime_t instead of timeval 2018-01-17 08:41:07 -07:00
drbd block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() 2018-03-08 14:13:48 -07:00
mtip32xx mtip32xx: Use the blk_queue_flag_*() functions 2018-03-08 14:13:48 -07:00
paride cdrom: do not call check_disk_change() inside cdrom_open() 2018-03-09 08:06:35 -07:00
rsxx block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() 2018-03-08 14:13:48 -07:00
xen-blkback Merge branch 'for-4.14/block' of git://git.kernel.dk/linux-block 2017-09-07 11:59:42 -07:00
zram block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> 2018-03-17 14:45:23 -06:00
amiflop.c genhd: Rename get_disk() to get_disk_and_module() 2018-02-26 09:48:42 -07:00
ataflop.c genhd: Rename get_disk() to get_disk_and_module() 2018-02-26 09:48:42 -07:00
brd.c block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> 2018-03-17 14:45:23 -06:00
cryptoloop.c block: cryptoloop - Fix build warning 2017-09-26 07:41:22 -06:00
DAC960.c pci-v4.16-changes 2018-02-06 09:59:40 -08:00
DAC960.h block: DAC960: Replace PCI pool old API 2018-01-02 16:09:50 -06:00
floppy.c genhd: Rename get_disk() to get_disk_and_module() 2018-02-26 09:48:42 -07:00
Kconfig null_blk: remove explicit 'select FAULT_INJECTION' 2018-01-11 07:58:31 -07:00
loop.c block/loop: fix deadlock after loop_set_status 2018-04-10 08:38:46 -06:00
loop.h block/loop: make loop cgroup aware 2017-09-26 07:41:22 -06:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
nbd.c block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() 2018-03-08 14:13:48 -07:00
null_blk.c block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> 2018-03-17 14:45:23 -06:00
pktcdvd.c block: fix a typo 2018-03-01 08:41:27 -07:00
ps3disk.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00
ps3vram.c block/ps3vram: Check return of ps3vram_cache_init 2017-08-17 23:03:44 +10:00
rbd_types.h rbd: RBD_V{1,2}_DATA_FORMAT macros 2017-02-20 12:16:15 +01:00
rbd.c block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> 2018-03-17 14:45:23 -06:00
skd_main.c block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() 2018-03-08 14:13:48 -07:00
skd_s1120.h skd: Use __packed only when needed 2017-08-18 08:45:29 -06:00
sunvdc.c treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
swim3.c treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
swim_asm.S
swim.c genhd: Rename get_disk() to get_disk_and_module() 2018-02-26 09:48:42 -07:00
sx8.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00
umem.c block: Fix a race between the cgroup code and request queue initialization 2018-02-28 12:23:35 -07:00
umem.h
virtio_blk.c virtio, vhost: fixes, cleanups, features 2018-02-08 10:41:00 -08:00
xen-blkfront.c for-4.17/block-20180402 2018-04-05 14:27:02 -07:00
xsysace.c treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
z2ram.c genhd: Rename get_disk() to get_disk_and_module() 2018-02-26 09:48:42 -07:00