linux_dsm_epyc7002/block
Joseph Qi 4c6994806f blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()
We've triggered a WARNING in blk_throtl_bio() when throttling writeback
io, which complains blkg->refcnt is already 0 when calling blkg_get(),
and then kernel crashes with invalid page request.
After investigating this issue, we've found it is caused by a race
between blkcg_bio_issue_check() and cgroup_rmdir(), which is described
below:

writeback kworker               cgroup_rmdir
                                  cgroup_destroy_locked
                                    kill_css
                                      css_killed_ref_fn
                                        css_killed_work_fn
                                          offline_css
                                            blkcg_css_offline
  blkcg_bio_issue_check
    rcu_read_lock
    blkg_lookup
                                              spin_trylock(q->queue_lock)
                                              blkg_destroy
                                              spin_unlock(q->queue_lock)
    blk_throtl_bio
    spin_lock_irq(q->queue_lock)
    ...
    spin_unlock_irq(q->queue_lock)
  rcu_read_unlock

Since rcu can only prevent blkg from releasing when it is being used,
the blkg->refcnt can be decreased to 0 during blkg_destroy() and schedule
blkg release.
Then trying to blkg_get() in blk_throtl_bio() will complains the WARNING.
And then the corresponding blkg_put() will schedule blkg release again,
which result in double free.
This race is introduced by commit ae11889636 ("blkcg: consolidate blkg
creation in blkcg_bio_issue_check()"). Before this commit, it will
lookup first and then try to lookup/create again with queue_lock. Since
revive this logic is a bit drastic, so fix it by only offlining pd during
blkcg_css_offline(), and move the rest destruction (especially
blkg_put()) into blkcg_css_free(), which should be the right way as
discussed.

Fixes: ae11889636 ("blkcg: consolidate blkg creation in blkcg_bio_issue_check()")
Reported-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-16 10:35:12 -06:00
..
partitions partitions/msdos: Unable to mount UFS 44bsd partitions 2018-01-10 09:12:16 -07:00
badblocks.c badblocks: fix wrong return value in badblocks_set if badblocks are disabled 2017-11-03 11:29:50 -07:00
bfq-cgroup.c block, bfq: put async queues for root bfq groups too 2018-01-09 08:45:25 -07:00
bfq-iosched.c block, bfq: add requeue-request hook 2018-02-07 15:17:46 -07:00
bfq-iosched.h block, bfq: limit sectors served with interactive weight raising 2018-01-18 08:21:37 -07:00
bfq-wf2q.c block, bfq: limit sectors served with interactive weight raising 2018-01-18 08:21:37 -07:00
bio-integrity.c block: Fix __bio_integrity_endio() documentation 2018-01-17 09:59:33 -07:00
bio.c Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-block 2018-01-29 11:51:49 -08:00
blk-cgroup.c blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir() 2018-03-16 10:35:12 -06:00
blk-core.c block: Introduce blk_queue_flag_{set,clear,test_and_{set,clear}}() 2018-03-08 14:13:48 -07:00
blk-exec.c blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request 2018-01-17 09:49:21 -07:00
blk-flush.c blk-mq: don't allocate driver tag upfront for flush rq 2017-11-04 12:40:13 -06:00
blk-integrity.c block: switch bios to blk_status_t 2017-06-09 09:27:32 -06:00
blk-ioc.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-lib.c block: add bdev_read_only() checks to common helpers 2018-01-18 12:57:19 -07:00
blk-map.c Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-block 2018-01-29 11:51:49 -08:00
blk-merge.c blk-mq: fix discard merge with scheduler attached 2018-02-01 14:01:02 -07:00
blk-mq-cpumap.c blk-mq: map queues to all present CPUs 2017-07-24 10:01:31 -06:00
blk-mq-debugfs.c blk-mq-debugfs: Show zone locking information 2018-02-28 12:23:35 -07:00
blk-mq-debugfs.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-mq-pci.c blk-mq-pci: add a fallback when pci_irq_get_affinity returns NULL 2017-08-18 08:08:14 -06:00
blk-mq-rdma.c block: Add rdma affinity based queue mapping helper 2017-08-08 14:58:03 -04:00
blk-mq-sched.c blk-mq-sched: Enable merging discard bio into request 2018-02-01 14:45:11 -07:00
blk-mq-sched.h blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request 2018-01-17 09:49:21 -07:00
blk-mq-sysfs.c block: properly protect the 'queue' kobj in blk_unregister_queue 2018-01-15 08:41:38 -07:00
blk-mq-tag.c blk-mq: improve heavily contended tag case 2017-12-22 11:09:37 -07:00
blk-mq-tag.h Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block 2017-11-14 15:32:19 -08:00
blk-mq-virtio.c blk-mq: provide a default queue mapping for virtio device 2017-02-27 20:54:05 +02:00
blk-mq.c block: Protect queue flag changes with the queue lock 2018-03-08 14:13:48 -07:00
blk-mq.h blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly() 2018-01-19 12:51:59 -07:00
blk-settings.c block: Introduce blk_queue_flag_{set,clear,test_and_{set,clear}}() 2018-03-08 14:13:48 -07:00
blk-softirq.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-stat.c block: Protect queue flag changes with the queue lock 2018-03-08 14:13:48 -07:00
blk-stat.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-sysfs.c block: Introduce blk_queue_flag_{set,clear,test_and_{set,clear}}() 2018-03-08 14:13:48 -07:00
blk-tag.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-throttle.c Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-block 2018-01-29 11:51:49 -08:00
blk-timeout.c block: Introduce blk_queue_flag_{set,clear,test_and_{set,clear}}() 2018-03-08 14:13:48 -07:00
blk-wbt.c blk-wbt: account flush requests correctly 2018-02-06 14:14:03 -07:00
blk-wbt.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-zoned.c block: Suppress kernel-doc warnings triggered by blk-zoned.c 2018-03-09 08:23:32 -07:00
blk.h block: Move the queue_flag_*() functions from a public into a private header file 2018-03-08 14:13:48 -07:00
bounce.c Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-block 2018-01-29 11:51:49 -08:00
bsg-lib.c bsg: split handling of SCSI CDBs vs transport requeues 2018-03-13 11:40:24 -06:00
bsg.c bsg: split handling of SCSI CDBs vs transport requeues 2018-03-13 11:40:24 -06:00
cfq-iosched.c block/cfq: cache rightmost rb_node 2017-09-08 18:26:49 -07:00
cmdline-parser.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compat_ioctl.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
deadline-iosched.c deadline-iosched: Introduce zone locking support 2018-01-05 09:22:17 -07:00
elevator.c block: Document scheduler modification locking requirements 2018-01-18 12:54:42 -07:00
genhd.c genhd: Fix BUG in blkdev_open() 2018-02-26 09:48:42 -07:00
ioctl.c block: pass inclusive 'lend' parameter to truncate_inode_pages_range 2018-02-23 15:20:19 -07:00
ioprio.c block: Add fallthrough markers to switch statements 2017-06-21 11:46:07 -06:00
Kconfig License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Kconfig.iosched License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
kyber-iosched.c block: kyber: fix domain token leak during requeue 2018-02-24 15:55:54 -07:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mq-deadline.c mq-deadline: make it clear that __dd_dispatch_request() works on all hw queues 2018-01-06 09:23:11 -07:00
noop-iosched.c block: move existing elevator ops to union 2017-01-17 10:03:33 -07:00
opal_proto.h block: sed-opal: Set MBRDone on S3 resume path if TPER is MBREnabled 2017-09-11 09:45:52 -06:00
partition-generic.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
scsi_ioctl.c block: silently forbid sending any ioctl to a partition 2018-01-10 12:30:37 -07:00
sed-opal.c block: sed-opal: fix u64 short atom length 2018-03-16 10:15:27 -06:00
t10-pi.c t10-pi: Move opencoded contants to common header 2017-07-03 16:56:25 -06:00