linux_dsm_epyc7002/fs/btrfs
Filipe David Borba Manana 3fe81ce206 Btrfs: fix deadlock when iterating inode refs and running delayed inodes
While running btrfs/004 from xfstests, after 503 iterations, dmesg reported
a deadlock between tasks iterating inode refs and tasks running delayed inodes
(during a transaction commit).

It turns out that iterating inode refs implies doing one tree search and
release all nodes in the path except the leaf node, and then passing that
leaf node to btrfs_ref_to_path(), which in turn does another tree search
without releasing the lock on the leaf node it received as parameter.

This is a problem when other task wants to write to the btree as well and
ends up updating the leaf that is read locked - the writer task locks the
parent of the leaf and then blocks waiting for the leaf's lock to be
released - at the same time, the task executing btrfs_ref_to_path()
does a second tree search, without releasing the lock on the first leaf,
and wants to access a leaf (the same or another one) that is a child of
the same parent, resulting in a deadlock.

The trace reported by lockdep follows.

[84314.936373] INFO: task fsstress:11930 blocked for more than 120 seconds.
[84314.936381]       Tainted: G        W  O 3.12.0-fdm-btrfs-next-16+ #70
[84314.936383] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[84314.936386] fsstress        D ffff8806e1bf8000     0 11930  11926 0x00000000
[84314.936393]  ffff8804d6d89b78 0000000000000046 ffff8804d6d89b18 ffffffff810bd8bd
[84314.936399]  ffff8806e1bf8000 ffff8804d6d89fd8 ffff8804d6d89fd8 ffff8804d6d89fd8
[84314.936405]  ffff880806308000 ffff8806e1bf8000 ffff8804d6d89c08 ffff8804deb8f190
[84314.936410] Call Trace:
[84314.936421]  [<ffffffff810bd8bd>] ? trace_hardirqs_on+0xd/0x10
[84314.936428]  [<ffffffff81774269>] schedule+0x29/0x70
[84314.936451]  [<ffffffffa0715bf5>] btrfs_tree_lock+0x75/0x270 [btrfs]
[84314.936457]  [<ffffffff810715c0>] ? __init_waitqueue_head+0x60/0x60
[84314.936470]  [<ffffffffa06ba231>] btrfs_search_slot+0x7f1/0x930 [btrfs]
[84314.936489]  [<ffffffffa0731c2a>] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs]
[84314.936504]  [<ffffffffa06d2e1f>] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
[84314.936510]  [<ffffffff810bd6ef>] ? trace_hardirqs_on_caller+0x1f/0x1e0
[84314.936528]  [<ffffffffa073173c>] __btrfs_update_delayed_inode+0x4c/0x1d0 [btrfs]
[84314.936543]  [<ffffffffa0731c2a>] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs]
[84314.936558]  [<ffffffffa0731c2a>] ? __btrfs_run_delayed_items+0x13a/0x1e0 [btrfs]
[84314.936573]  [<ffffffffa0731c82>] __btrfs_run_delayed_items+0x192/0x1e0 [btrfs]
[84314.936589]  [<ffffffffa0731d03>] btrfs_run_delayed_items+0x13/0x20 [btrfs]
[84314.936604]  [<ffffffffa06dbcd4>] btrfs_flush_all_pending_stuffs+0x24/0x80 [btrfs]
[84314.936620]  [<ffffffffa06ddc13>] btrfs_commit_transaction+0x223/0xa20 [btrfs]
[84314.936630]  [<ffffffffa06ae5ae>] btrfs_sync_fs+0x6e/0x110 [btrfs]
[84314.936635]  [<ffffffff811d0b50>] ? __sync_filesystem+0x60/0x60
[84314.936639]  [<ffffffff811d0b50>] ? __sync_filesystem+0x60/0x60
[84314.936643]  [<ffffffff811d0b70>] sync_fs_one_sb+0x20/0x30
[84314.936648]  [<ffffffff811a3541>] iterate_supers+0xf1/0x100
[84314.936652]  [<ffffffff811d0c45>] sys_sync+0x55/0x90
[84314.936658]  [<ffffffff8177ef12>] system_call_fastpath+0x16/0x1b
[84314.936660] INFO: lockdep is turned off.
[84314.936663] INFO: task btrfs:11955 blocked for more than 120 seconds.
[84314.936666]       Tainted: G        W  O 3.12.0-fdm-btrfs-next-16+ #70
[84314.936668] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[84314.936670] btrfs           D ffff880541729a88     0 11955  11608 0x00000000
[84314.936674]  ffff880541729a38 0000000000000046 ffff8805417299d8 ffffffff810bd8bd
[84314.936680]  ffff88075430c8a0 ffff880541729fd8 ffff880541729fd8 ffff880541729fd8
[84314.936685]  ffffffff81c104e0 ffff88075430c8a0 ffff8804de8b00b8 ffff8804de8b0000
[84314.936690] Call Trace:
[84314.936695]  [<ffffffff810bd8bd>] ? trace_hardirqs_on+0xd/0x10
[84314.936700]  [<ffffffff81774269>] schedule+0x29/0x70
[84314.936717]  [<ffffffffa0715815>] btrfs_tree_read_lock+0xd5/0x140 [btrfs]
[84314.936721]  [<ffffffff810715c0>] ? __init_waitqueue_head+0x60/0x60
[84314.936733]  [<ffffffffa06ba201>] btrfs_search_slot+0x7c1/0x930 [btrfs]
[84314.936746]  [<ffffffffa06bd505>] btrfs_find_item+0x55/0x160 [btrfs]
[84314.936763]  [<ffffffffa06ff689>] ? free_extent_buffer+0x49/0xc0 [btrfs]
[84314.936780]  [<ffffffffa073c9ca>] btrfs_ref_to_path+0xba/0x1e0 [btrfs]
[84314.936797]  [<ffffffffa06f9719>] ? release_extent_buffer+0xb9/0xe0 [btrfs]
[84314.936813]  [<ffffffffa06ff689>] ? free_extent_buffer+0x49/0xc0 [btrfs]
[84314.936830]  [<ffffffffa073cb50>] inode_to_path+0x60/0xd0 [btrfs]
[84314.936846]  [<ffffffffa073d365>] paths_from_inode+0x115/0x3c0 [btrfs]
[84314.936851]  [<ffffffff8118dd44>] ? kmem_cache_alloc_trace+0x114/0x200
[84314.936868]  [<ffffffffa0714494>] btrfs_ioctl+0xf14/0x2030 [btrfs]
[84314.936873]  [<ffffffff817762db>] ? _raw_spin_unlock+0x2b/0x50
[84314.936877]  [<ffffffff8116598f>] ? handle_mm_fault+0x34f/0xb00
[84314.936882]  [<ffffffff81075563>] ? up_read+0x23/0x40
[84314.936886]  [<ffffffff8177a41c>] ? __do_page_fault+0x20c/0x5a0
[84314.936892]  [<ffffffff811b2946>] do_vfs_ioctl+0x96/0x570
[84314.936896]  [<ffffffff81776e23>] ? error_sti+0x5/0x6
[84314.936901]  [<ffffffff810b71e8>] ? trace_hardirqs_off_caller+0x28/0xd0
[84314.936906]  [<ffffffff81776a09>] ? retint_swapgs+0xe/0x13
[84314.936910]  [<ffffffff811b2eb1>] SyS_ioctl+0x91/0xb0
[84314.936915]  [<ffffffff813eecde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[84314.936920]  [<ffffffff8177ef12>] system_call_fastpath+0x16/0x1b
[84314.936922] INFO: lockdep is turned off.
[84434.866873] INFO: task btrfs-transacti:11921 blocked for more than 120 seconds.
[84434.866881]       Tainted: G        W  O 3.12.0-fdm-btrfs-next-16+ #70
[84434.866883] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[84434.866886] btrfs-transacti D ffff880755b6a478     0 11921      2 0x00000000
[84434.866893]  ffff8800735b9ce8 0000000000000046 ffff8800735b9c88 ffffffff810bd8bd
[84434.866899]  ffff8805a1b848a0 ffff8800735b9fd8 ffff8800735b9fd8 ffff8800735b9fd8
[84434.866904]  ffffffff81c104e0 ffff8805a1b848a0 ffff880755b6a478 ffff8804cece78f0
[84434.866910] Call Trace:
[84434.866920]  [<ffffffff810bd8bd>] ? trace_hardirqs_on+0xd/0x10
[84434.866927]  [<ffffffff81774269>] schedule+0x29/0x70
[84434.866948]  [<ffffffffa06dd2ef>] wait_current_trans.isra.33+0xbf/0x120 [btrfs]
[84434.866954]  [<ffffffff810715c0>] ? __init_waitqueue_head+0x60/0x60
[84434.866970]  [<ffffffffa06dec18>] start_transaction+0x388/0x5a0 [btrfs]
[84434.866985]  [<ffffffffa06db9b5>] ? transaction_kthread+0xb5/0x280 [btrfs]
[84434.866999]  [<ffffffffa06dee97>] btrfs_attach_transaction+0x17/0x20 [btrfs]
[84434.867012]  [<ffffffffa06dba9e>] transaction_kthread+0x19e/0x280 [btrfs]
[84434.867026]  [<ffffffffa06db900>] ? open_ctree+0x2260/0x2260 [btrfs]
[84434.867030]  [<ffffffff81070dad>] kthread+0xed/0x100
[84434.867035]  [<ffffffff81070cc0>] ? flush_kthread_worker+0x190/0x190
[84434.867040]  [<ffffffff8177ee6c>] ret_from_fork+0x7c/0xb0
[84434.867044]  [<ffffffff81070cc0>] ? flush_kthread_worker+0x190/0x190

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-28 13:19:59 -08:00
..
tests Btrfs: add a sanity test for a vacant extent at the front of a file 2013-11-11 21:58:19 -05:00
acl.c btrfs: remove dead code 2014-01-28 13:19:50 -08:00
async-thread.c Btrfs: fix __btrfs_start_workers retval 2013-11-20 20:42:11 -05:00
async-thread.h Btrfs: eliminate races in worker stopping code 2013-10-04 16:02:13 -04:00
backref.c Btrfs: fix deadlock when iterating inode refs and running delayed inodes 2014-01-28 13:19:59 -08:00
backref.h Btrfs: allocate prelim_ref with a slab allocater 2013-09-01 08:16:27 -04:00
btrfs_inode.h Btrfs: improve inode hash function/inode lookup 2013-11-11 21:55:19 -05:00
check-integrity.c Btrfs: fix check-integrity to look at the referenced data properly 2014-01-28 13:19:21 -08:00
check-integrity.h block: submit_bio_wait() conversions 2013-11-24 16:33:41 -07:00
compression.c btrfs: Fix checkpatch.pl warning of spacing issues 2013-11-11 22:12:31 -05:00
compression.h btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
ctree.c Btrfs: return immediately if tree log mod is not necessary 2014-01-28 13:19:56 -08:00
ctree.h Btrfs: remove field tree_mod_seq_elem from btrfs_fs_info struct 2014-01-28 13:19:58 -08:00
delayed-inode.c btrfs: Replace kmalloc with kmalloc_array 2013-11-11 22:12:22 -05:00
delayed-inode.h [readdir] convert btrfs 2013-06-29 12:57:00 +04:00
delayed-ref.c Btrfs: skip merge part for delayed data refs 2014-01-28 13:19:23 -08:00
delayed-ref.h Btrfs: introduce a head ref rbtree 2014-01-28 13:19:22 -08:00
dev-replace.c btrfs: fix typo in the log message 2013-11-20 20:44:47 -05:00
dev-replace.h Btrfs: add new sources for device replace code 2012-12-12 17:15:41 -05:00
dir-item.c Btrfs: fix max dir item size calculation 2014-01-28 13:19:49 -08:00
disk-io.c Btrfs: move the extent buffer radix tree into the fs_info 2014-01-28 13:19:55 -08:00
disk-io.h Btrfs: add a sanity test for btrfs_split_item 2013-11-11 21:51:02 -05:00
export.c btrfs: remove fs/btrfs/compat.h 2013-11-11 22:03:19 -05:00
export.h NFS support for btrfs - v3 2008-09-25 11:04:06 -04:00
extent_io.c Btrfs: move the extent buffer radix tree into the fs_info 2014-01-28 13:19:55 -08:00
extent_io.h Btrfs: move the extent buffer radix tree into the fs_info 2014-01-28 13:19:55 -08:00
extent_map.c Btrfs: fix extent_map block_len after merging 2014-01-28 13:19:51 -08:00
extent_map.h btrfs: Enclose macros with complex values within parenthesis 2013-11-11 22:12:06 -05:00
extent-tree.c btrfs: fix static checker warnings 2014-01-28 13:19:44 -08:00
file-item.c btrfs: Use WARN_ON()'s return value in place of WARN_ON(1) 2013-11-11 22:11:53 -05:00
file.c Btrfs: fix use of uninitialized err variable 2014-01-28 13:19:57 -08:00
free-space-cache.c btrfs: remove unused variable from setup_cluster_no_bitmap 2014-01-28 13:19:33 -08:00
free-space-cache.h Btrfs: remove path arg from btrfs_truncate_free_space_cache 2013-11-11 21:51:33 -05:00
hash.h btrfs: extended inode refs 2012-10-09 09:14:45 -04:00
inode-item.c btrfs: cleanup: removed unused 'btrfs_get_inode_ref_index' 2014-01-28 13:19:39 -08:00
inode-map.c btrfs: Use WARN_ON()'s return value in place of WARN_ON(1) 2013-11-11 22:11:53 -05:00
inode-map.h Btrfs: Support reading/writing on disk free ino cache 2011-04-25 16:46:11 +08:00
inode.c Btrfs: remove dead comments for read_csums() 2014-01-28 13:19:59 -08:00
ioctl.c Btrfs: fix error check of btrfs_lookup_dentry() 2014-01-28 13:19:56 -08:00
Kconfig Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2013-11-22 08:38:55 -08:00
locking.c btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
locking.h Btrfs: remove btrfs_try_spin_lock 2013-03-14 14:57:10 -04:00
lzo.c Btrfs: return -1 when lzo compression makes data bigger 2013-09-01 07:57:19 -04:00
Makefile Btrfs: add tests for btrfs_get_extent 2013-11-11 21:57:30 -05:00
math.h Btrfs: cleanup duplicated division functions 2012-12-11 13:31:30 -05:00
ordered-data.c Btrfs: avoid unnecessary ordered extent cache resets 2014-01-28 13:19:46 -08:00
ordered-data.h Btrfs: don't wait for the completion of all the ordered extents 2013-11-11 22:13:44 -05:00
orphan.c btrfs: expand btrfs_find_item() to include find_orphan_item functionality 2014-01-28 13:19:37 -08:00
print-tree.c btrfs: drop unused parameter from btrfs_item_nr 2013-11-11 21:50:48 -05:00
print-tree.h btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
qgroup.c btrfs: fix unused variables in qgroup.c 2014-01-28 13:19:35 -08:00
raid56.c btrfs: remove fs/btrfs/compat.h 2013-11-11 22:03:19 -05:00
raid56.h Btrfs: RAID5 and RAID6 2013-02-01 14:24:23 -05:00
rcu-string.h Btrfs: use rcu to protect device->name 2012-06-14 21:29:16 -04:00
reada.c Btrfs: fix reada debug code compilation 2013-05-06 15:54:55 -04:00
relocation.c Btrfs: remove unnecessary filemap writting and waiting after block group relocation 2014-01-28 13:19:57 -08:00
root-tree.c btrfs: expand btrfs_find_item() to include find_root_ref functionality 2014-01-28 13:19:36 -08:00
scrub.c Btrfs: wrap repeated code into scrub_blocked_if_needed() 2014-01-28 13:19:53 -08:00
send.c Btrfs: fix pass of transid with wrong endianness in send.c 2014-01-28 13:19:51 -08:00
send.h btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
struct-funcs.c Btrfs: rewrite BTRFS_SETGET_FUNCS 2012-07-23 16:28:06 -04:00
super.c Btrfs: don't clear the default compression type 2013-12-12 07:11:19 -08:00
sysfs.c btrfs: fix leaks during sysfs teardown 2014-01-28 13:19:45 -08:00
sysfs.h btrfs: publish allocation data in sysfs 2014-01-28 13:19:29 -08:00
transaction.c Btrfs: wrap repeated code into scrub_blocked_if_needed() 2014-01-28 13:19:53 -08:00
transaction.h Btrfs: fix BUG_ON() casued by the reserved space migration 2013-11-11 21:54:28 -05:00
tree-defrag.c Btrfs: cleanup dead code of defragment 2013-11-11 21:59:45 -05:00
tree-log.c btrfs: expand btrfs_find_item() to include find_orphan_item functionality 2014-01-28 13:19:37 -08:00
tree-log.h btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
ulist.c Btrfs: fix crash regarding to ulist_add_merge 2013-07-02 11:50:59 -04:00
ulist.h Btrfs: add a rb_tree to improve performance of ulist search 2013-05-06 15:54:44 -04:00
uuid-tree.c Btrfs: remove unused max_key arg from btrfs_search_forward 2013-11-11 21:54:57 -05:00
volumes.c btrfs: fix bio_size_ok() for max_sectors > 0xffff 2013-11-20 20:48:44 -05:00
volumes.h btrfs: Pack struct btrfs_device 2013-11-11 22:11:26 -05:00
xattr.c btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
xattr.h fs/vfs/security: pass last path component to LSM on inode creation 2011-02-01 11:12:29 -05:00
zlib.c btrfs: fix message printing 2012-10-09 09:19:57 -04:00