linux_dsm_epyc7002/fs/btrfs
Filipe Manana 34172f601a btrfs: send: fix invalid path for unlink operations after parent orphanization
commit d8ac76cdd1755b21e8c008c28d0b7251c0b14986 upstream.

During an incremental send operation, when processing the new references
for the current inode, we might send an unlink operation for another inode
that has a conflicting path and has more than one hard link. However this
path was computed and cached before we processed previous new references
for the current inode. We may have orphanized a directory of that path
while processing a previous new reference, in which case the path will
be invalid and cause the receiver process to fail.

The following reproducer triggers the problem and explains how/why it
happens in its comments:

  $ cat test-send-unlink.sh
  #!/bin/bash

  DEV=/dev/sdi
  MNT=/mnt/sdi

  mkfs.btrfs -f $DEV >/dev/null
  mount $DEV $MNT

  # Create our test files and directory. Inode 259 (file3) has two hard
  # links.
  touch $MNT/file1
  touch $MNT/file2
  touch $MNT/file3

  mkdir $MNT/A
  ln $MNT/file3 $MNT/A/hard_link

  # Filesystem looks like:
  #
  # .                                     (ino 256)
  # |----- file1                          (ino 257)
  # |----- file2                          (ino 258)
  # |----- file3                          (ino 259)
  # |----- A/                             (ino 260)
  #        |---- hard_link                (ino 259)
  #

  # Now create the base snapshot, which is going to be the parent snapshot
  # for a later incremental send.
  btrfs subvolume snapshot -r $MNT $MNT/snap1
  btrfs send -f /tmp/snap1.send $MNT/snap1

  # Move inode 257 into directory inode 260. This results in computing the
  # path for inode 260 as "/A" and caching it.
  mv $MNT/file1 $MNT/A/file1

  # Move inode 258 (file2) into directory inode 260, with a name of
  # "hard_link", moving first inode 259 away since it currently has that
  # location and name.
  mv $MNT/A/hard_link $MNT/tmp
  mv $MNT/file2 $MNT/A/hard_link

  # Now rename inode 260 to something else (B for example) and then create
  # a hard link for inode 258 that has the old name and location of inode
  # 260 ("/A").
  mv $MNT/A $MNT/B
  ln $MNT/B/hard_link $MNT/A

  # Filesystem now looks like:
  #
  # .                                     (ino 256)
  # |----- tmp                            (ino 259)
  # |----- file3                          (ino 259)
  # |----- B/                             (ino 260)
  # |      |---- file1                    (ino 257)
  # |      |---- hard_link                (ino 258)
  # |
  # |----- A                              (ino 258)

  # Create another snapshot of our subvolume and use it for an incremental
  # send.
  btrfs subvolume snapshot -r $MNT $MNT/snap2
  btrfs send -f /tmp/snap2.send -p $MNT/snap1 $MNT/snap2

  # Now unmount the filesystem, create a new one, mount it and try to
  # apply both send streams to recreate both snapshots.
  umount $DEV

  mkfs.btrfs -f $DEV >/dev/null

  mount $DEV $MNT

  # First add the first snapshot to the new filesystem by applying the
  # first send stream.
  btrfs receive -f /tmp/snap1.send $MNT

  # The incremental receive operation below used to fail with the
  # following error:
  #
  #    ERROR: unlink A/hard_link failed: No such file or directory
  #
  # This is because when send is processing inode 257, it generates the
  # path for inode 260 as "/A", since that inode is its parent in the send
  # snapshot, and caches that path.
  #
  # Later when processing inode 258, it first processes its new reference
  # that has the path of "/A", which results in orphanizing inode 260
  # because there is a a path collision. This results in issuing a rename
  # operation from "/A" to "/o260-6-0".
  #
  # Finally when processing the new reference "B/hard_link" for inode 258,
  # it notices that it collides with inode 259 (not yet processed, because
  # it has a higher inode number), since that inode has the name
  # "hard_link" under the directory inode 260. It also checks that inode
  # 259 has two hardlinks, so it decides to issue a unlink operation for
  # the name "hard_link" for inode 259. However the path passed to the
  # unlink operation is "/A/hard_link", which is incorrect since currently
  # "/A" does not exists, due to the orphanization of inode 260 mentioned
  # before. The path is incorrect because it was computed and cached
  # before the orphanization. This results in the receiver to fail with
  # the above error.
  btrfs receive -f /tmp/snap2.send $MNT

  umount $MNT

When running the test, it fails like this:

  $ ./test-send-unlink.sh
  Create a readonly snapshot of '/mnt/sdi' in '/mnt/sdi/snap1'
  At subvol /mnt/sdi/snap1
  Create a readonly snapshot of '/mnt/sdi' in '/mnt/sdi/snap2'
  At subvol /mnt/sdi/snap2
  At subvol snap1
  At snapshot snap2
  ERROR: unlink A/hard_link failed: No such file or directory

Fix this by recomputing a path before issuing an unlink operation when
processing the new references for the current inode if we previously
have orphanized a directory.

A test case for fstests will follow soon.

CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-14 16:55:40 +02:00
..
tests btrfs: fix missing delalloc new bit for new delalloc ranges 2020-11-13 22:15:59 +01:00
acl.c
async-thread.c
async-thread.h
backref.c btrfs: do not warn if we can't find the reloc root when looking up backref 2021-03-04 11:38:29 +01:00
backref.h btrfs: add asserts for deleting backref cache nodes 2021-03-04 11:38:29 +01:00
block-group.c btrfs: fix race between writes to swap files and scrub 2021-03-09 11:11:11 +01:00
block-group.h btrfs: fix race between writes to swap files and scrub 2021-03-09 11:11:11 +01:00
block-rsv.c btrfs: print the block rsv type when we fail our reservation 2020-11-05 13:02:05 +01:00
block-rsv.h
btrfs_inode.h btrfs: fix deadlock when cloning inline extent and low on free metadata space 2021-01-17 14:16:54 +01:00
check-integrity.c
check-integrity.h
compression.c btrfs: handle remount to no compress during compression 2021-05-11 14:47:15 +02:00
compression.h
ctree.c btrfs: fix race when picking most recent mod log operation for an old root 2021-05-11 14:47:33 +02:00
ctree.h btrfs: fix race between writes to swap files and scrub 2021-03-09 11:11:11 +01:00
delalloc-space.c
delalloc-space.h
delayed-inode.c btrfs: don't flush from btrfs_delayed_inode_reserve_metadata 2021-03-11 14:17:22 +01:00
delayed-inode.h
delayed-ref.c btrfs: account for new extents being deleted in total_bytes_pinned 2021-03-04 11:38:30 +01:00
delayed-ref.h btrfs: handle space_info::total_bytes_pinned inside the delayed ref itself 2021-03-04 11:38:30 +01:00
dev-replace.c btrfs: fix deadlock when cloning inline extent and low on free metadata space 2021-01-17 14:16:54 +01:00
dev-replace.h
dir-item.c
discard.c btrfs: merge critical sections of discard lock in workfn 2021-01-19 18:27:24 +01:00
discard.h
disk-io.c btrfs: promote debugging asserts to full-fledged checks in validate_super 2021-06-16 12:01:40 +02:00
disk-io.h btrfs: add a helper to read the tree_root commit root for backref lookup 2020-10-26 15:04:57 +01:00
export.c
export.h
extent_io.c btrfs: return whole extents in fiemap 2021-06-03 09:00:43 +02:00
extent_io.h btrfs: remove struct extent_io_ops 2020-10-07 12:13:25 +02:00
extent_map.c
extent_map.h
extent-io-tree.h btrfs: remove struct extent_io_ops 2020-10-07 12:13:25 +02:00
extent-tree.c btrfs: fix unmountable seed device after fstrim 2021-06-10 13:39:28 +02:00
file-item.c btrfs: fix error handling in btrfs_del_csums 2021-06-10 13:39:27 +02:00
file.c btrfs: return value from btrfs_mark_extent_written() in case of error 2021-06-16 12:01:40 +02:00
free-space-cache.c btrfs: fix race between extent freeing/allocation when using bitmaps 2021-03-09 11:11:11 +01:00
free-space-cache.h
free-space-tree.c btrfs: fix possible free space tree corruption with online conversion 2021-02-03 23:28:40 +01:00
free-space-tree.h
inode-item.c
inode-map.c
inode-map.h
inode.c btrfs: abort in rename_exchange if we fail to insert the second ref 2021-06-10 13:39:28 +02:00
ioctl.c btrfs: fix metadata extent leak after failure to create subvolume 2021-05-11 14:47:15 +02:00
Kconfig
locking.c btrfs: add nesting tags to the locking helpers 2020-10-07 12:12:16 +02:00
locking.h btrfs: introduce BTRFS_NESTING_NEW_ROOT for adding new roots 2020-10-07 12:12:17 +02:00
lzo.c btrfs: compression: inline free_workspace 2019-11-18 12:46:59 +01:00
Makefile
misc.h
ordered-data.c btrfs: remove inode argument from btrfs_start_ordered_extent 2020-10-07 12:13:22 +02:00
ordered-data.h btrfs: remove inode argument from btrfs_start_ordered_extent 2020-10-07 12:13:22 +02:00
orphan.c
print-tree.c btrfs: print the actual offset in btrfs_root_name 2021-01-27 11:55:06 +01:00
print-tree.h btrfs: print the actual offset in btrfs_root_name 2021-01-27 11:55:06 +01:00
props.c
props.h
qgroup.c btrfs: fix sleep while in non-sleep context during qgroup removal 2021-03-30 14:31:53 +02:00
qgroup.h btrfs: export and rename qgroup_reserve_meta 2021-03-11 14:17:22 +01:00
raid56.c btrfs: fix raid6 qstripe kmap 2021-03-09 11:11:10 +01:00
raid56.h
rcu-string.h
reada.c btrfs: fix readahead hang and use-after-free after removing a device 2020-10-26 15:03:59 +01:00
ref-verify.c btrfs: ref-verify: fix memory leak in btrfs_ref_tree_mod 2020-11-05 13:03:39 +01:00
ref-verify.h
reflink.c btrfs: fix deadlock when cloning inline extents and low on available space 2021-06-10 13:39:28 +02:00
reflink.h
relocation.c btrfs: convert logic BUG_ON()'s in replace_path to ASSERT()'s 2021-05-11 14:47:22 +02:00
root-tree.c btrfs: qgroup: fix qgroup meta rsv leak for subvolume operations 2020-10-07 12:12:13 +02:00
scrub.c btrfs: fix race between writes to swap files and scrub 2021-03-09 11:11:11 +01:00
send.c btrfs: send: fix invalid path for unlink operations after parent orphanization 2021-07-14 16:55:40 +02:00
send.h btrfs: send: avoid copying file data 2020-10-07 12:13:17 +02:00
space-info.c btrfs: shrink delalloc pages instead of full inodes 2021-01-17 14:16:54 +01:00
space-info.h btrfs: handle space_info::total_bytes_pinned inside the delayed ref itself 2021-03-04 11:38:30 +01:00
struct-funcs.c btrfs: use unaligned helpers for stack and header set/get helpers 2020-10-07 12:13:23 +02:00
super.c btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan 2021-01-19 18:27:24 +01:00
sysfs.c btrfs: do not create raid sysfs entries under any locks 2020-10-07 12:13:19 +02:00
sysfs.h btrfs: split and refactor btrfs_sysfs_remove_devices_dir 2020-10-07 12:12:21 +02:00
transaction.c btrfs: fix race between transaction aborts and fsyncs leading to use-after-free 2021-05-11 14:47:15 +02:00
transaction.h
tree-checker.c btrfs: tree-checker: do not error out if extent ref hash doesn't match 2021-06-10 13:39:12 +02:00
tree-checker.h
tree-defrag.c
tree-log.c btrfs: fixup error handling in fixup_inode_link_counts 2021-06-10 13:39:28 +02:00
tree-log.h
ulist.c
ulist.h
uuid-tree.c
volumes.c btrfs: fix lockdep warning due to seqcount_mutex on 32bit arch 2021-02-03 23:28:40 +01:00
volumes.h btrfs: fix lockdep warning due to seqcount_mutex on 32bit arch 2021-02-03 23:28:40 +01:00
xattr.c btrfs: fix warning when creating a directory with smack enabled 2021-03-09 11:11:12 +01:00
xattr.h
zlib.c
zstd.c