linux_dsm_epyc7002/drivers/md
Shaohua Li 61eb2b43b9 md/raid1/10: fix potential deadlock
Neil Brown pointed out a potential deadlock in raid 10 code with
bio_split/chain. The raid1 code could have the same issue, but recent
barrier rework makes it less likely to happen. The deadlock happens in
below sequence:

1. generic_make_request(bio), this will set current->bio_list
2. raid10_make_request will split bio to bio1 and bio2
3. __make_request(bio1), wait_barrer, add underlayer disk bio to
current->bio_list
4. __make_request(bio2), wait_barrer

If raise_barrier happens between 3 & 4, since wait_barrier runs at 3,
raise_barrier waits for IO completion from 3. And since raise_barrier
sets barrier, 4 waits for raise_barrier. But IO from 3 can't be
dispatched because raid10_make_request() doesn't finished yet.

The solution is to adjust the IO ordering. Quotes from Neil:
"
It is much safer to:

    if (need to split) {
        split = bio_split(bio, ...)
        bio_chain(...)
        make_request_fn(split);
        generic_make_request(bio);
   } else
        make_request_fn(mddev, bio);

This way we first process the initial section of the bio (in 'split')
which will queue some requests to the underlying devices.  These
requests will be queued in generic_make_request.
Then we queue the remainder of the bio, which will be added to the end
of the generic_make_request queue.
Then we return.
generic_make_request() will pop the lower-level device requests off the
queue and handle them first.  Then it will process the remainder
of the original bio once the first section has been fully processed.
"

Note, this only happens in read path. In write path, the bio is flushed to
underlaying disks either by blk flush (from schedule) or offladed to raid1/10d.
It's queued in current->bio_list.

Cc: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org (v3.14+, only the raid10 part)
Suggested-by: NeilBrown <neilb@suse.com>
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-03-09 09:02:42 -08:00
..
bcache sched/headers: Prepare to use <linux/rcuupdate.h> instead of <linux/rculist.h> in <linux/sched.h> 2017-03-02 08:42:38 +01:00
persistent-data sched/headers: Prepare to move the get_task_struct()/put_task_struct() and related APIs from <linux/sched.h> to <linux/sched/task.h> 2017-03-02 08:42:40 +01:00
bitmap.c md: separate flags for superblock changes 2016-12-08 22:01:47 -08:00
bitmap.h
dm-bio-prison.c
dm-bio-prison.h
dm-bio-record.h
dm-bufio.c sched/headers: Prepare to move the memalloc_noio_*() APIs to <linux/sched/mm.h> 2017-03-02 08:42:33 +01:00
dm-bufio.h
dm-builtin.c
dm-cache-block-types.h linux: drop __bitwise__ everywhere 2016-12-16 00:13:41 +02:00
dm-cache-metadata.c dm cache metadata: use cursor api in blocks_are_clean_separate_dirty() 2017-02-16 13:12:51 -05:00
dm-cache-metadata.h dm cache metadata: add "metadata2" feature 2017-02-16 13:12:47 -05:00
dm-cache-policy-cleaner.c
dm-cache-policy-internal.h
dm-cache-policy-smq.c dm cache policy smq: use hash_32() instead of hash_32_generic() 2016-12-08 19:42:37 -05:00
dm-cache-policy.c
dm-cache-policy.h
dm-cache-target.c - Fix dm-raid transient device failure processing and other smaller 2017-02-21 12:11:41 -08:00
dm-core.h dm: always defer request allocation to the owner of the request_queue 2017-01-27 15:08:35 -07:00
dm-crypt.c KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload() 2017-03-02 10:09:00 +11:00
dm-delay.c
dm-era-target.c block: Use pointer to backing_dev_info from request_queue 2017-02-02 08:20:48 -07:00
dm-exception-store.c
dm-exception-store.h
dm-flakey.c dm flakey: introduce "error_writes" feature 2016-12-13 15:01:31 -05:00
dm-io.c
dm-ioctl.c sched/headers: Prepare to move the memalloc_noio_*() APIs to <linux/sched/mm.h> 2017-03-02 08:42:33 +01:00
dm-kcopyd.c
dm-linear.c
dm-log-userspace-base.c
dm-log-userspace-transfer.c
dm-log-userspace-transfer.h
dm-log-writes.c
dm-log.c
dm-mpath.c Merge branch 'for-4.11/next' into for-4.11/linus-merge 2017-02-17 14:08:19 -07:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h
dm-queue-length.c
dm-raid1.c Merge branch 'for-4.10/block' of git://git.kernel.dk/linux-block 2016-12-13 10:19:16 -08:00
dm-raid.c dm raid: bump the target version 2017-02-28 16:47:52 -05:00
dm-region-hash.c
dm-round-robin.c dm round robin: revert "use percpu 'repeat_count' and 'current_path'" 2017-02-17 00:54:09 -05:00
dm-rq.c dm-rq: don't dereference request payload after ending request 2017-02-24 13:19:32 -07:00
dm-rq.h dm: always defer request allocation to the owner of the request_queue 2017-01-27 15:08:35 -07:00
dm-service-time.c
dm-snap-persistent.c
dm-snap-transient.c
dm-snap.c
dm-stats.c dm stats: fix a leaked s->histogram_boundaries array 2017-02-16 14:17:07 -05:00
dm-stats.h
dm-stripe.c
dm-switch.c
dm-sysfs.c
dm-table.c block: Use pointer to backing_dev_info from request_queue 2017-02-02 08:20:48 -07:00
dm-target.c dm: always defer request allocation to the owner of the request_queue 2017-01-27 15:08:35 -07:00
dm-thin-metadata.c
dm-thin-metadata.h
dm-thin.c block: Use pointer to backing_dev_info from request_queue 2017-02-02 08:20:48 -07:00
dm-uevent.c
dm-uevent.h
dm-verity-fec.c
dm-verity-fec.h
dm-verity-target.c
dm-verity.h
dm-zero.c
dm.c sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> 2017-03-02 08:42:32 +01:00
dm.h dm: always defer request allocation to the owner of the request_queue 2017-01-27 15:08:35 -07:00
faulty.c md: fast clone bio in bio_clone_mddev() 2017-02-15 11:24:54 -08:00
Kconfig
linear.c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-02-24 14:42:19 -08:00
linear.h md linear: fix a race between linear_add() and linear_congested() 2017-02-13 09:17:50 -08:00
Makefile
md-cluster.c md-cluster: remove useless memset from gather_all_resync_info 2017-03-09 09:02:06 -08:00
md-cluster.h
md.c md: don't impose the MD_SB_DISKS limit on arrays without metadata. 2017-03-09 09:02:30 -08:00
md.h md: delete dead code 2017-03-09 09:01:29 -08:00
multipath.c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-02-24 14:42:19 -08:00
multipath.h
raid0.c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-02-24 14:42:19 -08:00
raid0.h
raid1.c md/raid1/10: fix potential deadlock 2017-03-09 09:02:42 -08:00
raid1.h RAID1: avoid unnecessary spin locks in I/O barrier code 2017-02-19 22:04:25 -08:00
raid5-cache.c md/raid5-cache: exclude reclaiming stripes in reclaim check 2017-02-13 09:20:05 -08:00
raid5.c md: move funcs from pers->resize to update_size 2017-03-09 09:02:18 -08:00
raid5.h md/raid5-cache: exclude reclaiming stripes in reclaim check 2017-02-13 09:20:05 -08:00
raid10.c md/raid1/10: fix potential deadlock 2017-03-09 09:02:42 -08:00
raid10.h