linux_dsm_epyc7002/drivers/md
Cong Wang 5b1f5bc332 md: use a mutex to protect a global list
We saw a list corruption in the list all_detected_devices:

 WARNING: CPU: 16 PID: 226 at lib/list_debug.c:29 __list_add+0x3c/0xa9()
 list_add corruption. next->prev should be prev (ffff880859d58320), but was ffff880859ce74c0. (next=ffffffff81abfdb0).
 Modules linked in: ahci libahci libata sd_mod scsi_mod
 CPU: 16 PID: 226 Comm: kworker/u241:4 Not tainted 4.1.20 #1
 Hardware name: Dell Inc. PowerEdge C6220/04GD66, BIOS 2.2.3 11/07/2013
 Workqueue: events_unbound async_run_entry_fn
  0000000000000000 ffff880859a5baf8 ffffffff81502872 ffff880859a5bb48
  0000000000000009 ffff880859a5bb38 ffffffff810692a5 ffff880859ee8828
  ffffffff812ad02c ffff880859d58320 ffffffff81abfdb0 ffff880859eb90c0
 Call Trace:
  [<ffffffff81502872>] dump_stack+0x4d/0x63
  [<ffffffff810692a5>] warn_slowpath_common+0xa1/0xbb
  [<ffffffff812ad02c>] ? __list_add+0x3c/0xa9
  [<ffffffff81069305>] warn_slowpath_fmt+0x46/0x48
  [<ffffffff812ad02c>] __list_add+0x3c/0xa9
  [<ffffffff81406f28>] md_autodetect_dev+0x41/0x62
  [<ffffffff81285862>] rescan_partitions+0x25f/0x29d
  [<ffffffff81506372>] ? mutex_lock+0x13/0x31
  [<ffffffff811a090f>] __blkdev_get+0x1aa/0x3cd
  [<ffffffff811a0b91>] blkdev_get+0x5f/0x294
  [<ffffffff81377ceb>] ? put_device+0x17/0x19
  [<ffffffff8128227c>] ? disk_put_part+0x12/0x14
  [<ffffffff812836f3>] add_disk+0x29d/0x407
  [<ffffffff81384345>] ? __pm_runtime_use_autosuspend+0x5c/0x64
  [<ffffffffa004a724>] sd_probe_async+0x115/0x1af [sd_mod]
  [<ffffffff81083177>] async_run_entry_fn+0x72/0x12c
  [<ffffffff8107c44c>] process_one_work+0x198/0x2ce
  [<ffffffff8107cac7>] worker_thread+0x1dd/0x2bb
  [<ffffffff8107c8ea>] ? cancel_delayed_work_sync+0x15/0x15
  [<ffffffff8107c8ea>] ? cancel_delayed_work_sync+0x15/0x15
  [<ffffffff81080d9c>] kthread+0xae/0xb6
  [<ffffffff81080000>] ? param_array_set+0x40/0xfa
  [<ffffffff81080cee>] ? __kthread_parkme+0x61/0x61
  [<ffffffff81508152>] ret_from_fork+0x42/0x70
  [<ffffffff81080cee>] ? __kthread_parkme+0x61/0x61

I suspect this is because no lock protects this global list:
autostart_arrays() is called in the ioctl() path, where no
lock is held.

Cc: Shaohua Li <shli@kernel.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-06-09 09:37:23 -07:00
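
The race described in the commit message above can be illustrated with a small userspace sketch. This is not the kernel patch itself: it assumes POSIX threads in place of the kernel mutex API, and every name in it (`detected_dev`, `detected_lock`, `detect_dev`, `drain_detected`, `run_demo`) is hypothetical. It only shows the general technique of taking one mutex around both the insertion path and the drain path of a shared global list.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a global list of detected devices. */
struct detected_dev {
    struct detected_dev *next;
    int dev;
};

static struct detected_dev *all_detected = NULL;
/* One mutex guards every access to all_detected. */
static pthread_mutex_t detected_lock = PTHREAD_MUTEX_INITIALIZER;

/* Insertion path: allocate outside the lock, link the node under the lock. */
static void detect_dev(int dev)
{
    struct detected_dev *node = malloc(sizeof(*node));
    assert(node != NULL);
    node->dev = dev;

    pthread_mutex_lock(&detected_lock);
    node->next = all_detected;
    all_detected = node;
    pthread_mutex_unlock(&detected_lock);
}

/* Drain path: empty the list under the same lock; returns entries removed. */
static int drain_detected(void)
{
    int n = 0;

    pthread_mutex_lock(&detected_lock);
    while (all_detected) {
        struct detected_dev *node = all_detected;
        all_detected = node->next;
        free(node);
        n++;
    }
    pthread_mutex_unlock(&detected_lock);
    return n;
}

#define NTHREADS   4
#define PER_THREAD 1000

/* Each producer thread inserts PER_THREAD entries concurrently. */
static void *producer(void *arg)
{
    int base = *(int *)arg;

    for (int i = 0; i < PER_THREAD; i++)
        detect_dev(base + i);
    return NULL;
}

/* Runs the producers in parallel, then drains; returns the drained count. */
int run_demo(void)
{
    pthread_t tid[NTHREADS];
    int base[NTHREADS];

    for (int i = 0; i < NTHREADS; i++) {
        base[i] = i * PER_THREAD;
        int rc = pthread_create(&tid[i], NULL, producer, &base[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);
    return drain_detected();
}
```

Calling `run_demo()` from a `main()` and building with `-pthread` should drain exactly `NTHREADS * PER_THREAD` entries. Without the lock in `detect_dev()`, concurrent insertions can lose nodes or corrupt the links, which is the same class of failure as the `__list_add` warning in the trace.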
bcache bcache: switch to using blk_queue_write_cache() 2016-04-12 16:00:39 -06:00
persistent-data dm space map metadata: remove unused variable in brb_pop() 2015-12-14 09:26:01 -05:00
bitmap.c md-cluster: gather resync infos and enable recv_thread after bitmap is ready 2016-05-09 09:24:03 -07:00
bitmap.h md-cluster: sync bitmap when node received RESYNCING msg 2016-05-04 12:39:35 -07:00
dm-bio-prison.c
dm-bio-prison.h
dm-bio-record.h
dm-bufio.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 17:11:47 -08:00
dm-bufio.h
dm-builtin.c
dm-cache-block-types.h
dm-cache-metadata.c dm cache metadata: fix cmd_read_lock() acquiring write lock 2016-04-17 11:24:46 -04:00
dm-cache-metadata.h dm cache: make sure every metadata function checks fail_io 2016-03-10 17:12:12 -05:00
dm-cache-policy-cleaner.c - Revert a dm-multipath change that caused a regression for unprivledged 2015-11-04 21:19:53 -08:00
dm-cache-policy-internal.h
dm-cache-policy-smq.c dm cache policy smq: clarify that mq registration failure was for 'mq' 2016-03-10 17:12:11 -05:00
dm-cache-policy.c
dm-cache-policy.h
dm-cache-target.c dm cache: bump the target version 2016-03-10 17:12:12 -05:00
dm-crypt.c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2016-03-17 11:22:54 -07:00
dm-delay.c dm: rename target's per_bio_data_size to per_io_data_size 2016-02-22 22:34:37 -05:00
dm-era-target.c dm persistent data: eliminate unnecessary return values 2015-10-31 19:06:02 -04:00
dm-exception-store.c - Revert a dm-multipath change that caused a regression for unprivledged 2015-11-04 21:19:53 -08:00
dm-exception-store.h dm snapshot: fix hung bios when copy error occurs 2016-01-08 20:03:05 -05:00
dm-flakey.c dm: rename target's per_bio_data_size to per_io_data_size 2016-02-22 22:34:37 -05:00
dm-io.c md: more open-coded offset_in_page() 2016-01-04 10:29:12 -05:00
dm-ioctl.c dm ioctl: drop use of __GFP_REPEAT in copy_params()'s __vmalloc() call 2016-05-05 15:25:55 -04:00
dm-kcopyd.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
dm-linear.c dm linear: remove redundant target name from error messages 2015-10-31 19:06:03 -04:00
dm-log-userspace-base.c dm: drop NULL test before kmem_cache_destroy() and mempool_destroy() 2015-10-31 19:06:00 -04:00
dm-log-userspace-transfer.c
dm-log-userspace-transfer.h
dm-log-writes.c dm: rename target's per_bio_data_size to per_io_data_size 2016-02-22 22:34:37 -05:00
dm-log.c
dm-mpath.c dm mpath: eliminate use of spinlock in IO fast-paths 2016-05-05 15:25:52 -04:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h dm path selector: remove 'repeat_count' return from .select_path hook 2016-02-22 22:34:42 -05:00
dm-queue-length.c dm path selector: remove 'repeat_count' return from .select_path hook 2016-02-22 22:34:42 -05:00
dm-raid1.c dm: rename target's per_bio_data_size to per_io_data_size 2016-02-22 22:34:37 -05:00
dm-raid.c dm raid: make sure no feature flags are set in metadata 2016-05-13 09:03:51 -04:00
dm-region-hash.c dm: convert ffs to __ffs 2015-10-31 19:06:01 -04:00
dm-round-robin.c dm round robin: use percpu 'repeat_count' and 'current_path' 2016-02-22 22:34:42 -05:00
dm-service-time.c dm path selector: remove 'repeat_count' return from .select_path hook 2016-02-22 22:34:42 -05:00
dm-snap-persistent.c dm snapshot: fix hung bios when copy error occurs 2016-01-08 20:03:05 -05:00
dm-snap-transient.c dm snapshot: fix hung bios when copy error occurs 2016-01-08 20:03:05 -05:00
dm-snap.c dm snapshot: disallow the COW and origin devices from being identical 2016-03-10 17:12:09 -05:00
dm-stats.c
dm-stats.h
dm-stripe.c
dm-switch.c dm switch: simplify conditional in alloc_region_table() 2015-10-31 19:06:06 -04:00
dm-sysfs.c
dm-table.c block: kill off q->flush_flags 2016-04-13 13:33:19 -06:00
dm-target.c dm: set DM_TARGET_WILDCARD feature on "error" target 2016-02-22 11:06:21 -05:00
dm-thin-metadata.c dm thin metadata: don't issue prefetches if a transaction abort has failed 2016-03-10 17:12:09 -05:00
dm-thin-metadata.h
dm-thin.c dm thin: unroll issue_discard() to create longer discard bio chains 2016-05-13 09:04:20 -04:00
dm-uevent.c
dm-uevent.h
dm-verity-fec.c dm: rename target's per_bio_data_size to per_io_data_size 2016-02-22 22:34:37 -05:00
dm-verity-fec.h dm verity: add support for forward error correction 2015-12-10 10:39:03 -05:00
dm-verity-target.c dm: rename target's per_bio_data_size to per_io_data_size 2016-02-22 22:34:37 -05:00
dm-verity.h dm verity: add ignore_zero_blocks feature 2015-12-10 10:39:03 -05:00
dm-zero.c
dm.c dm: remove unused mapped_device argument from free_tio() 2016-05-05 15:25:49 -04:00
dm.h dm: allow immutable request-based targets to use blk-mq pdu 2016-02-22 22:34:37 -05:00
faulty.c MD: rename some functions 2016-01-20 13:52:20 -08:00
Kconfig dm: add missing newline between DM_DEBUG_BLOCK_STACK_TRACING and DM_BUFIO 2016-03-10 17:12:11 -05:00
linear.c
linear.h
Makefile dm cache: make the 'mq' policy an alias for 'smq' 2016-03-10 17:12:08 -05:00
md-cluster.c md-cluster: check the return value of process_recvd_msg 2016-05-09 09:24:04 -07:00
md-cluster.h md-cluster: gather resync infos and enable recv_thread after bitmap is ready 2016-05-09 09:24:03 -07:00
md.c md: use a mutex to protect a global list 2016-06-09 09:37:23 -07:00
md.h md-cluster: fix deadlock issue when add disk to an recoverying array 2016-06-03 16:22:59 -07:00
multipath.c md: multipath: don't hardcopy bio in .make_request path 2016-03-14 11:32:26 -07:00
multipath.h
raid0.c md/raid0: remove empty line printk from dump_zones 2016-04-25 08:43:58 -07:00
raid0.h
raid1.c md: set MD_CHANGE_PENDING in a atomic region 2016-05-09 09:24:02 -07:00
raid1.h
raid5-cache.c Merge tag 'md/4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2016-05-19 17:25:13 -07:00
raid5.c right meaning of PARITY_ENABLE_RMW and PARITY_PREFER_RMW 2016-05-25 21:26:07 -07:00
raid5.h RAID5: revert e9e4c377e2 to fix a livelock 2016-02-26 09:44:56 -08:00
raid10.c md: set MD_CHANGE_PENDING in a atomic region 2016-05-09 09:24:02 -07:00
raid10.h