linux_dsm_epyc7002/fs
Robbie Ko 801bec365e Btrfs: send, fix failure to move directories with the same name around
When doing an incremental send we can end up not moving directories that
have the same name. This happens when the same parent directory has
different child directories with the same name in the parent and send
snapshots.

For example, consider the following scenario:

  Parent snapshot:

  .                   (ino 256)
  |---- d/            (ino 257)
  |     |--- p1/      (ino 258)
  |
  |---- p1/           (ino 259)

  Send snapshot:

  .                    (ino 256)
  |--- d/              (ino 257)
       |--- p1/        (ino 259)
             |--- p1/  (ino 258)

The directory named "d" (inode 257) has in both snapshots an entry with
the name "p1" but it refers to different inodes in both snapshots (inode
258 in the parent snapshot and inode 259 in the send snapshot). When
attempting to move inode 258, the operation is delayed because its new
parent, inode 259, was not yet moved/renamed (as the stream is currently
processing inode 258). Then when processing inode 259, we also end up
delaying its move/rename operation so that it happens after inode 258 is
moved/renamed. This decision to delay the move/rename rename operation
of inode 259 is due to the fact that the new parent inode (257) still
has inode 258 as its child, which has the same name has inode 259. So
we end up with inode 258 move/rename operation waiting for inode's 259
move/rename operation, which in turn it waiting for inode's 258
move/rename. This results in ending the send stream without issuing
move/rename operations for inodes 258 and 259 and generating the
following warnings in syslog/dmesg:

[148402.979747] ------------[ cut here ]------------
[148402.980588] WARNING: CPU: 14 PID: 4117 at fs/btrfs/send.c:6177 btrfs_ioctl_send+0xe03/0xe51 [btrfs]
[148402.981928] Modules linked in: btrfs crc32c_generic xor raid6_pq acpi_cpufreq tpm_tis ppdev tpm parport_pc psmouse parport sg pcspkr i2c_piix4 i2c_core evdev processor serio_raw button loop autofs4 ext4 crc16 jbd2 mbcache sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring virtio e1000 scsi_mod floppy [last unloaded: btrfs]
[148402.986999] CPU: 14 PID: 4117 Comm: btrfs Tainted: G        W       4.6.0-rc7-btrfs-next-31+ #1
[148402.988136] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
[148402.988136]  0000000000000000 ffff88022139fca8 ffffffff8126b42c 0000000000000000
[148402.988136]  0000000000000000 ffff88022139fce8 ffffffff81052b14 000018212139fac8
[148402.988136]  ffff88022b0db400 0000000000000000 0000000000000001 0000000000000000
[148402.988136] Call Trace:
[148402.988136]  [<ffffffff8126b42c>] dump_stack+0x67/0x90
[148402.988136]  [<ffffffff81052b14>] __warn+0xc2/0xdd
[148402.988136]  [<ffffffff81052beb>] warn_slowpath_null+0x1d/0x1f
[148402.988136]  [<ffffffffa04bc831>] btrfs_ioctl_send+0xe03/0xe51 [btrfs]
[148402.988136]  [<ffffffffa048b358>] btrfs_ioctl+0x14f/0x1f81 [btrfs]
[148402.988136]  [<ffffffff8108e456>] ? arch_local_irq_save+0x9/0xc
[148402.988136]  [<ffffffff8108eb51>] ? __lock_is_held+0x3c/0x57
[148402.988136]  [<ffffffff8118da05>] vfs_ioctl+0x18/0x34
[148402.988136]  [<ffffffff8118e00c>] do_vfs_ioctl+0x550/0x5be
[148402.988136]  [<ffffffff81196f0c>] ? __fget+0x6b/0x77
[148402.988136]  [<ffffffff81196fa1>] ? __fget_light+0x62/0x71
[148402.988136]  [<ffffffff8118e0d1>] SyS_ioctl+0x57/0x79
[148402.988136]  [<ffffffff8149e025>] entry_SYSCALL_64_fastpath+0x18/0xa8
[148402.988136]  [<ffffffff8108e89d>] ? trace_hardirqs_off_caller+0x3f/0xaa
[148403.011373] ---[ end trace a4539270c8056f8b ]---
[148403.012296] ------------[ cut here ]------------
[148403.013071] WARNING: CPU: 14 PID: 4117 at fs/btrfs/send.c:6194 btrfs_ioctl_send+0xe19/0xe51 [btrfs]
[148403.014447] Modules linked in: btrfs crc32c_generic xor raid6_pq acpi_cpufreq tpm_tis ppdev tpm parport_pc psmouse parport sg pcspkr i2c_piix4 i2c_core evdev processor serio_raw button loop autofs4 ext4 crc16 jbd2 mbcache sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring virtio e1000 scsi_mod floppy [last unloaded: btrfs]
[148403.019708] CPU: 14 PID: 4117 Comm: btrfs Tainted: G        W       4.6.0-rc7-btrfs-next-31+ #1
[148403.020104] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
[148403.020104]  0000000000000000 ffff88022139fca8 ffffffff8126b42c 0000000000000000
[148403.020104]  0000000000000000 ffff88022139fce8 ffffffff81052b14 000018322139fac8
[148403.020104]  ffff88022b0db400 0000000000000000 0000000000000001 0000000000000000
[148403.020104] Call Trace:
[148403.020104]  [<ffffffff8126b42c>] dump_stack+0x67/0x90
[148403.020104]  [<ffffffff81052b14>] __warn+0xc2/0xdd
[148403.020104]  [<ffffffff81052beb>] warn_slowpath_null+0x1d/0x1f
[148403.020104]  [<ffffffffa04bc847>] btrfs_ioctl_send+0xe19/0xe51 [btrfs]
[148403.020104]  [<ffffffffa048b358>] btrfs_ioctl+0x14f/0x1f81 [btrfs]
[148403.020104]  [<ffffffff8108e456>] ? arch_local_irq_save+0x9/0xc
[148403.020104]  [<ffffffff8108eb51>] ? __lock_is_held+0x3c/0x57
[148403.020104]  [<ffffffff8118da05>] vfs_ioctl+0x18/0x34
[148403.020104]  [<ffffffff8118e00c>] do_vfs_ioctl+0x550/0x5be
[148403.020104]  [<ffffffff81196f0c>] ? __fget+0x6b/0x77
[148403.020104]  [<ffffffff81196fa1>] ? __fget_light+0x62/0x71
[148403.020104]  [<ffffffff8118e0d1>] SyS_ioctl+0x57/0x79
[148403.020104]  [<ffffffff8149e025>] entry_SYSCALL_64_fastpath+0x18/0xa8
[148403.020104]  [<ffffffff8108e89d>] ? trace_hardirqs_off_caller+0x3f/0xaa
[148403.038981] ---[ end trace a4539270c8056f8c ]---

There's another issue caused by similar (but more complex) changes in the
directory hierarchy that makes move/rename operations fail, described with
the following example:

  Parent snapshot:

  .
  |---- a/                                                   (ino 262)
  |     |---- c/                                             (ino 268)
  |
  |---- d/                                                   (ino 263)
        |---- ance/                                          (ino 267)
                |---- e/                                     (ino 264)
                |---- f/                                     (ino 265)
                |---- ance/                                  (ino 266)

  Send snapshot:

  .
  |---- a/                                                   (ino 262)
  |---- c/                                                   (ino 268)
  |     |---- ance/                                          (ino 267)
  |
  |---- d/                                                   (ino 263)
  |     |---- ance/                                          (ino 266)
  |
  |---- f/                                                   (ino 265)
        |---- e/                                             (ino 264)

When the inode 265 is processed, the path for inode 267 is computed, which
at that time corresponds to "d/ance", and it's stored in the names cache.
Later on when processing inode 266, we end up orphanizing (renaming to a
name matching the pattern o<ino>-<gen>-<seq>) inode 267 because it has
the same name as inode 266 and it's currently a child of the new parent
directory (inode 263) for inode 266. After the orphanization and while we
are still processing inode 266, a rename operation for inode 266 is
generated. However the source path for that rename operation is incorrect
because it ends up using the old, pre-orphanization, name of inode 267.
The no longer valid name for inode 267 was previously cached when
processing inode 265 and it remains usable and considered valid until
the inode currently being processed has a number greater than 267.
This resulted in the receiving side failing with the following error:

  ERROR: rename d/ance/ance -> d/ance failed: No such file or directory

So fix these issues by detecting such circular dependencies for rename
operations and by clearing the cached name of an inode once the inode
is orphanized.

A test case for fstests will follow soon.

Signed-off-by: Robbie Ko <robbieko@synology.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
[Rewrote change log to be more detailed and organized, and improved
 comments]

Signed-off-by: Filipe Manana <fdmanana@suse.com>
2016-08-01 07:23:10 +01:00
..
9p 9p: use file_dentry() 2016-06-30 23:28:09 -04:00
adfs fs/adfs/adfs.h: tidy up comments 2016-01-20 17:09:18 -08:00
affs affs: fix remount failure when there are no options changed 2016-05-28 16:50:24 -07:00
afs remove lots of IS_ERR_VALUE abuses 2016-05-27 15:26:11 -07:00
autofs4 autofs: don't get stuck in a loop if vfs_write() returns an error 2016-06-24 17:23:52 -07:00
befs fs/befs/io.c:befs_bread(): remove unneeded initialization to NULL 2016-05-23 17:04:14 -07:00
bfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
btrfs Btrfs: send, fix failure to move directories with the same name around 2016-08-01 07:23:10 +01:00
cachefiles FS-Cache: make check_consistency callback return int 2016-06-01 10:29:39 +02:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-07-01 15:20:11 -07:00
cifs File names with trailing period or space need special case conversion 2016-06-24 12:05:52 -05:00
coda introduce a parallel variant of ->iterate() 2016-05-02 19:49:29 -04:00
configfs configfs_readdir(): make safe under shared lock 2016-05-09 11:41:13 -04:00
cramfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
crypto fscrypto/f2fs: allow fs-specific key prefix for fs encryption 2016-05-07 10:32:33 -07:00
debugfs debugfs: open_proxy_open(): avoid double fops release 2016-06-15 04:56:35 -07:00
devpts devpts: Make each mount of devpts an independent filesystem. 2016-06-05 10:36:01 -07:00
dlm mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
ecryptfs Merge branch 'stacking-fixes' (vfs stacking fixes from Jann) 2016-06-10 12:10:02 -07:00
efivarfs fs/efivarfs/inode.c: use generic UUID library 2016-05-20 17:58:30 -07:00
efs fs/efs/super.c: fix return value 2016-05-20 17:58:30 -07:00
exofs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2016-05-17 17:05:30 -07:00
exportfs introduce a parallel variant of ->iterate() 2016-05-02 19:49:29 -04:00
ext2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-27 17:14:05 -07:00
ext4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-27 17:14:05 -07:00
f2fs switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
fat Merge branch 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-17 15:05:23 -07:00
freevxfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
fscache FS-Cache: wake write waiter after invalidating writes 2016-06-01 10:29:09 +02:00
fuse fuse: serialize dirops by default 2016-06-30 13:10:49 +02:00
gfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-27 17:14:05 -07:00
hfs switch ->setxattr() to passing dentry and inode separately 2016-05-27 20:09:16 -04:00
hfsplus switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
hostfs hostfs: switch to ->iterate_shared() 2016-05-12 19:49:30 -04:00
hpfs hpfs: implement the show_options method 2016-05-28 16:50:24 -07:00
hugetlbfs mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage 2016-04-04 10:41:08 -07:00
isofs Merge branch 'ovl-fixes' into for-linus 2016-05-11 00:00:29 -04:00
jbd2 jbd2: get rid of superfluous __GFP_REPEAT 2016-06-24 17:23:52 -07:00
jffs2 switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
jfs switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
kernfs switch ->setxattr() to passing dentry and inode separately 2016-05-27 20:09:16 -04:00
lockd lockd: unregister notifier blocks if the service fails to come up completely 2016-06-30 16:35:07 -04:00
logfs logfs: no need to lock directory in lseek 2016-05-09 11:42:19 -04:00
minix simple local filesystems: switch to ->iterate_shared() 2016-05-02 19:49:32 -04:00
ncpfs mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
nfs NFS: Fix another OPEN_DOWNGRADE bug 2016-06-28 16:55:34 -04:00
nfs_common
nfsd nfsd: check permissions when setting ACLs 2016-06-24 12:11:52 -04:00
nilfs2 fs/nilfs2: fix potential underflow in call to crc32_le 2016-06-24 17:23:52 -07:00
nls
notify fsnotify: avoid spurious EMFILE errors from inotify_init() 2016-05-19 19:12:14 -07:00
ntfs fs: simplify the generic_write_sync prototype 2016-05-01 19:58:39 -04:00
ocfs2 ocfs2: disable BUG assertions in reading blocks 2016-06-24 17:23:52 -07:00
omfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
openpromfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
orangefs switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
overlayfs ovl: warn instead of error if d_type is not supported 2016-07-03 09:39:31 +02:00
proc Merge branch 'stacking-fixes' (vfs stacking fixes from Jann) 2016-06-10 12:10:02 -07:00
pstore mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
qnx4 more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
qnx6 more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
quota fs/quota: use nla_put_u64_64bit() 2016-04-26 12:00:48 -04:00
ramfs tmpfs/ramfs: fix VM_MAYSHARE mappings for NOMMU 2016-05-20 17:58:30 -07:00
reiserfs Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2016-06-19 07:05:14 -10:00
romfs romfs, squashfs: switch to ->iterate_shared() 2016-05-09 11:41:15 -04:00
squashfs romfs, squashfs: switch to ->iterate_shared() 2016-05-09 11:41:15 -04:00
sysfs platform/chrome: Branch for v4.4 2015-11-13 21:53:18 -08:00
sysv simple local filesystems: switch to ->iterate_shared() 2016-05-02 19:49:32 -04:00
tracefs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
ubifs UBIFS: Implement ->migratepage() 2016-06-23 00:29:53 +02:00
udf udf: Use correct partition reference number for metadata 2016-05-19 13:00:35 +02:00
ufs simple local filesystems: switch to ->iterate_shared() 2016-05-02 19:49:32 -04:00
xfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-27 17:14:05 -07:00
aio.c aio: make aio_setup_ring killable 2016-05-23 17:04:14 -07:00
anon_inodes.c
attr.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
bad_inode.c switch ->setxattr() to passing dentry and inode separately 2016-05-27 20:09:16 -04:00
binfmt_aout.c fs: fix binfmt_aout.c build error 2016-05-28 16:34:59 -07:00
binfmt_elf_fdpic.c coredump: fix dumping through pipes 2016-06-07 22:07:09 -04:00
binfmt_elf.c coredump: fix dumping through pipes 2016-06-07 22:07:09 -04:00
binfmt_em86.c
binfmt_flat.c remove lots of IS_ERR_VALUE abuses 2016-05-27 15:26:11 -07:00
binfmt_misc.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
binfmt_script.c
block_dev.c DAX error handling for 4.7 2016-05-26 19:34:26 -07:00
buffer.c mm, page_alloc: avoid looking up the first zone in a zonelist twice 2016-05-19 19:12:14 -07:00
char_dev.c chrdev: emit a warning when we go below dynamic major range 2016-03-29 10:11:44 -07:00
compat_binfmt_elf.c
compat_ioctl.c Merge 4.5-rc4 into char-misc-next 2016-02-14 14:25:59 -08:00
compat.c Fix a number of bugs, most notably a potential stale data exposure 2016-05-24 12:55:26 -07:00
coredump.c coredump: fix dumping through pipes 2016-06-07 22:07:09 -04:00
dax.c dax: fix offset overflow in dax_io 2016-06-27 12:18:44 -07:00
dcache.c fix idiotic braino in d_alloc_parallel() 2016-06-20 10:07:42 -04:00
dcookies.c
direct-io.c direct-io: fix direct write stale data exposure from concurrent buffered read 2016-05-27 14:49:37 -07:00
drop_caches.c
eventfd.c eventfd: document lockless access in eventfd_poll 2016-03-22 15:36:02 -07:00
eventpoll.c fs: poll/select/recvmmsg: use timespec64 for timeout events 2016-05-19 19:12:14 -07:00
exec.c exec: make exec path waiting for mmap_sem killable 2016-05-23 17:04:14 -07:00
fcntl.c fcntl: allow to set O_DIRECT flag on pipe 2016-01-09 02:55:37 -05:00
fhandle.c fs/coredump: prevent fsuid=0 dumps into user-controlled directories 2016-03-22 15:36:02 -07:00
file_table.c
file.c give readdir(2)/getdents(2)/etc. uniform exclusion with lseek() 2016-05-02 19:49:28 -04:00
filesystems.c find_filesystem(): simplify comparison 2016-01-19 12:02:23 -05:00
fs_pin.c
fs_struct.c
fs-writeback.c mm,writeback: don't use memory reserves for wb_start_writeback 2016-05-20 17:58:30 -07:00
inode.c parallel lookups: actual switch to rwsem 2016-05-02 19:49:28 -04:00
internal.h much milder d_walk() race 2016-06-10 11:32:47 -04:00
ioctl.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
Kconfig dax: Make huge page handling depend of CONFIG_BROKEN 2016-05-19 15:13:17 -06:00
Kconfig.binfmt ELF/MIPS build fix 2016-05-23 17:04:14 -07:00
libfs.c lockless next_positive() 2016-06-20 17:11:29 -04:00
locks.c locks: use file_inode() 2016-07-01 10:24:18 -04:00
Makefile Merge tag 'ofs-pull-tag-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux 2016-03-26 12:59:04 -07:00
mbcache.c mbcache: add reusable flag to cache entries 2016-02-22 22:44:04 -05:00
mount.h
mpage.c mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage 2016-04-04 10:41:08 -07:00
namei.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-06-07 20:41:36 -07:00
namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-07-01 15:20:11 -07:00
no-block.c
nsfs.c
open.c Merge branch 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-17 14:41:03 -07:00
pipe.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
pnode.c propogate_mnt: Handle the first propogated copy being a slave 2016-05-05 09:54:45 -05:00
pnode.h
posix_acl.c posix_acl: Add set_posix_acl 2016-06-24 12:11:34 -04:00
proc_namespace.c vfs: show_vfsstat: do not ignore errors from show_devname method 2016-03-16 13:09:08 -04:00
read_write.c Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-18 11:46:23 -07:00
readdir.c restore killability of old mutex_lock_killable(&inode->i_mutex) users 2016-05-26 00:13:25 -04:00
select.c fs: poll/select/recvmmsg: use timespec64 for timeout events 2016-05-19 19:12:14 -07:00
seq_file.c Make file credentials available to the seqfile interfaces 2016-04-14 12:56:09 -07:00
signalfd.c
splice.c Merge branch 'ovl-fixes' into for-linus 2016-05-11 00:00:29 -04:00
stack.c
stat.c fs/stat.c: drop the last new_valid_dev check 2016-01-16 11:17:23 -08:00
statfs.c
super.c Merge branch 'master' into for-next 2016-04-18 11:18:55 +02:00
sync.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
timerfd.c timerfd: Handle relative timers with CONFIG_TIME_LOW_RES proper 2016-01-17 11:13:55 +01:00
userfaultfd.c userfaultfd: don't pin the user memory in userfaultfd_file_create() 2016-05-20 17:58:30 -07:00
utimes.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
xattr.c switch ->setxattr() to passing dentry and inode separately 2016-05-27 20:09:16 -04:00