linux_dsm_epyc7002/fs
David Howells f76e0829bb afs: Fix speculative status fetches
[ Upstream commit 22650f148126571be1098d34160eb4931fc77241 ]

The generic/464 xfstest causes kAFS to emit occasional warnings of the
form:

        kAFS: vnode modified {100055:8a} 30->31 YFS.StoreData64 (c=6015)

This indicates that the data version received back from the server did not
match the expected value (the DV should be incremented monotonically for
each individual modification op committed to a vnode).

What is happening is that a lookup call is doing a bulk status fetch
speculatively on a bunch of vnodes in a directory besides getting the
status of the vnode it's actually interested in.  This is racing with a
StoreData operation (though it could also occur with, say, a MakeDir op).

On the client, a modification operation locks the vnode, but the bulk
status fetch only locks the parent directory, so no ordering is imposed
there (thereby avoiding an avenue to deadlock).

On the server, the StoreData op handler doesn't lock the vnode until it's
received all the request data, and downgrades the lock after committing the
data until it has finished sending change notifications to other clients -
which allows the status fetch to occur before it has finished.

This means that:

 - a status fetch can access the target vnode either side of the exclusive
   section of the modification

 - the status fetch could start before the modification, yet finish after,
   and vice-versa.

 - the status fetch and the modification RPCs can complete in either order.

 - the status fetch can return either the before or the after DV from the
   modification.

 - the status fetch might regress the locally cached DV.

Some of these are handled by the previous fix[1], but that's not sufficient
because it checks the DV it received against the DV it cached at the start
of the op, but the DV might've been updated in the meantime by a locally
generated modification op.

Fix this by the following means:

 (1) Keep track of when we're performing a modification operation on a
     vnode.  This is done by marking vnode parameters with a 'modification'
     note that causes the AFS_VNODE_MODIFYING flag to be set on the vnode
     for the duration.

 (2) Alter the speculation race detection to ignore speculative status
     fetches if either the vnode is marked as being modified or the data
     version number is not what we expected.

Note that whilst the "vnode modified" warning does get recovered from as it
causes the client to refetch the status at the next opportunity, it will
also invalidate the pagecache, so changes might get lost.

Fixes: a9e5c87ca7 ("afs: Fix speculative status fetch going out of order wrt to modifications")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-and-reviewed-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/160605082531.252452.14708077925602709042.stgit@warthog.procyon.org.uk/ [1]
Link: https://lore.kernel.org/linux-fsdevel/161961335926.39335.2552653972195467566.stgit@warthog.procyon.org.uk/ # v1
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14 09:50:45 +02:00
..
9p
adfs
affs fs/affs: release old buffer head on error path 2021-03-04 11:38:37 +01:00
afs afs: Fix speculative status fetches 2021-05-14 09:50:45 +02:00
autofs
befs
bfs bfs: don't use WARNING: string when it's just info. 2021-01-06 14:56:52 +01:00
btrfs btrfs: fix race when picking most recent mod log operation for an old root 2021-05-11 14:47:33 +02:00
cachefiles fs/cachefiles: Remove wait_bit_key layout dependency 2021-03-30 14:32:07 +02:00
ceph ceph: fix flush_snap logic after putting caps 2021-03-04 11:38:08 +01:00
cifs smb3: do not attempt multichannel to server which does not support it 2021-05-11 14:47:37 +02:00
coda
configfs configfs: fix a use-after-free in __configfs_open_file 2021-03-17 17:06:34 +01:00
cramfs
crypto fscrypt: add fscrypt_is_nokey_name() 2020-12-26 16:02:43 +01:00
debugfs debugfs: do not attempt to create a new file before the filesystem is initalized 2021-03-04 11:37:17 +01:00
devpts
dlm
ecryptfs ecryptfs: fix kernel panic with null dev_name 2021-05-11 14:47:12 +02:00
efivarfs
efs
erofs erofs: add unsupported inode i_format check 2021-05-11 14:47:13 +02:00
exfat exfat: fix erroneous discard when clear cluster bit 2021-05-11 14:47:36 +02:00
exportfs
ext2
ext4 ext4: Fix occasional generic/418 failure 2021-05-11 14:47:38 +02:00
f2fs f2fs: fix to avoid out-of-bounds memory access 2021-05-11 14:47:34 +02:00
fat
freevxfs
fscache
fuse fuse: fix write deadlock 2021-05-11 14:47:36 +02:00
gfs2 gfs2: report "already frozen/thawed" errors 2021-04-16 11:43:20 +02:00
hfs
hfsplus
hostfs hostfs: fix memory handling in follow_link() 2021-04-14 08:42:06 +02:00
hpfs
hugetlbfs mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page 2021-02-10 09:29:20 +01:00
iomap iomap: Fix negative assignment to unsigned sis->pages in iomap_swapfile_activate 2021-04-07 15:00:04 +02:00
isofs isofs: release buffer head before return 2021-03-04 11:38:00 +01:00
jbd2 ext4: annotate data race in jbd2_journal_dirty_metadata() 2021-05-11 14:47:37 +02:00
jffs2 jffs2: check the validity of dstlen in jffs2_zlib_compress() 2021-05-11 14:47:36 +02:00
jfs JFS: more checks for invalid superblock 2021-03-07 12:34:04 +01:00
kernfs kernfs: wire up ->splice_read and ->splice_write 2021-01-27 11:55:29 +01:00
lockd lockd: don't use interval-based rebinding over TCP 2020-12-30 11:53:30 +01:00
minix
nfs NFSv4: Don't discard segments marked for return in _pnfs_return_layout() 2021-05-11 14:47:34 +02:00
nfs_common nfs_common: need lock during iterate through the list 2020-12-30 11:53:45 +01:00
nfsd NFSv4.2: fix copy stateid copying for the async copy 2021-05-14 09:50:13 +02:00
nilfs2 nilfs2: make splice write available again 2021-02-13 13:55:16 +01:00
nls
notify fanotify: Fix sys_fanotify_mark() on native x86-32 2021-01-17 14:16:59 +01:00
ntfs ntfs: check for valid standard information attribute 2021-02-26 10:13:00 +01:00
ocfs2 ocfs2: fix deadlock between setattr and dio_end_io_write 2021-04-14 08:41:58 +02:00
omfs
openpromfs
orangefs
overlayfs ovl: invalidate readdir cache on changes to dir with origin 2021-05-14 09:50:35 +02:00
proc seccomp: Fix CONFIG tests for Seccomp_filters 2021-05-14 09:50:24 +02:00
pstore pstore: Fix warning in pstore_kill_sb() 2021-03-25 09:04:08 +01:00
qnx4
qnx6
quota quota: Fix memory leak when handling corrupted quota file 2021-03-04 11:37:53 +01:00
ramfs
reiserfs reiserfs: update reiserfs_xattrs_initialized() condition 2021-04-07 15:00:10 +02:00
romfs
squashfs squashfs: fix xattr id and id lookup sanity checks 2021-03-30 14:31:54 +02:00
sysfs
sysv
tracefs
ubifs ubifs: Only check replay with inode type to judge if inode linked 2021-05-11 14:47:33 +02:00
udf udf: fix silent AED tagLocation corruption 2021-03-17 17:06:23 +01:00
ufs
unicode
vboxsf
verity
xfs xfs: fix return of uninitialized value in variable error 2021-05-14 09:50:34 +02:00
zonefs zonefs: fix to update .i_wr_refcnt correctly in zonefs_open_zone() 2021-03-25 09:04:05 +01:00
aio.c
anon_inodes.c
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-17 17:06:35 +01:00
binfmt_script.c
block_dev.c block: don't ignore REQ_NOWAIT for direct IO 2021-04-16 11:43:21 +02:00
buffer.c
char_dev.c
compat_binfmt_elf.c
coredump.c coredump: fix core_pattern parse error 2020-12-06 10:19:07 -08:00
d_path.c
dax.c mm: provide a saner PTE walking API for modules 2021-02-26 10:13:01 +01:00
dcache.c
dcookies.c
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-14 08:41:58 +02:00
drop_caches.c
eventfd.c
eventpoll.c fs/epoll: restore waking from ep_done_scan() 2021-05-11 14:47:12 +02:00
exec.c exec: Transform exec_update_mutex into a rw_semaphore 2021-01-09 13:46:24 +01:00
fcntl.c fcntl: Fix potential deadlock in send_sig{io, urg}() 2021-01-06 14:56:53 +01:00
fhandle.c
file_table.c
file.c kernel/io_uring: cancel io_uring before task works 2021-01-30 13:55:18 +01:00
filesystems.c
fs_context.c
fs_parser.c
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c fs: fix lazytime expiration handling in __writeback_single_inode() 2021-01-27 11:54:53 +01:00
fsopen.c
init.c
inode.c fs: Handle I_DONTCACHE in iput_final() instead of generic_drop_inode() 2020-12-30 11:53:49 +01:00
internal.h
io_uring.c io_uring: fix overflows checks in provide buffers 2021-05-14 09:50:28 +02:00
io-wq.c io_uring: always batch cancel in *cancel_files() 2021-02-13 13:54:56 +01:00
io-wq.h io_uring: always batch cancel in *cancel_files() 2021-02-13 13:54:56 +01:00
ioctl.c
Kconfig tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha 2021-02-17 11:02:21 +01:00
Kconfig.binfmt
kernel_read_file.c
libfs.c
locks.c Revert "nfsd4: a client's own opens needn't prevent delegations" 2021-03-20 10:43:44 +01:00
Makefile
mbcache.c
mount.h
mpage.c
namei.c LOOKUP_MOUNTPOINT: we are cleaning "jumped" flag too late 2021-04-14 08:41:58 +02:00
namespace.c umount(2): move the flag validity checks first 2021-01-19 18:27:32 +01:00
no-block.c
nsfs.c
open.c openat2: reject RESOLVE_BENEATH|RESOLVE_IN_ROOT 2020-12-30 11:54:24 +01:00
pipe.c fs/pipe: allow sendfile() to pipe again 2021-01-27 11:55:29 +01:00
pnode.c
pnode.h mount: fix mounting of detached mounts onto targets that reside on shared mounts 2021-03-17 17:06:13 +01:00
posix_acl.c
proc_namespace.c proc mountinfo: make splice available again 2020-12-30 11:54:02 +01:00
read_write.c
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-21 13:00:54 +02:00
remap_range.c
select.c kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() 2021-03-25 09:04:16 +01:00
seq_file.c
signalfd.c
splice.c
stack.c
stat.c fs: fix reporting supported extra file attributes for statx() 2021-05-11 14:47:33 +02:00
statfs.c
super.c
sync.c
timerfd.c
userfaultfd.c
utimes.c
xattr.c