[ Upstream commit 3cded66330591cfd2554a3fd5edca8859ea365a2 ]
Fix to return PTR_ERR() error code from the error handling case where
ubifs_hash_get_desc() failed instead of 0 in ubifs_init_authentication(),
as done elsewhere in this function.
Fixes: 49525e5eec ("ubifs: Add helper functions for authentication support")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 88149082bb8ef31b289673669e080ec6a00c2e59 ]
If generic_drop_inode() returns true, it means iput_final() can evict
this inode regardless of whether it is dirty or not. If we check
I_DONTCACHE in generic_drop_inode(), any inode with this bit set will be
evicted unconditionally. This is not the desired behavior because
I_DONTCACHE only means the inode shouldn't be cached on the LRU list.
As for whether we need to evict this inode, this is what
generic_drop_inode() should do. This patch corrects the usage of
I_DONTCACHE.
This patch was proposed in [1].
[1]: https://lore.kernel.org/linux-fsdevel/20200831003407.GE12096@dread.disaster.area/
Fixes: dae2f8ed79 ("fs: Lift XFS_IDONTCACHE to the VFS layer")
Signed-off-by: Hao Li <lihao2018.fnst@cn.fujitsu.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca9364dde50daba93eff711b4b945fd08beafcc2 ]
Since commit b4868b44c5 ("NFSv4: Wait for stateid updates after
CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5
seconds delay regardless of the size of the copy. The delay is from
nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
fails because the seqid in both nfs4_state and nfs4_stateid are 0.
Fix by modifying nfs4_init_cp_state to return the stateid with seqid 1
instead of 0. This is also to conform with section 4.8 of RFC 7862.
Here is the relevant paragraph from section 4.8 of RFC 7862:
A copy offload stateid's seqid MUST NOT be zero. In the context of a
copy offload operation, it is inappropriate to indicate "the most
recent copy offload operation" using a stateid with a seqid of zero
(see Section 8.2.2 of [RFC5661]). It is inappropriate because the
stateid refers to internal state in the server and there may be
several asynchronous COPY operations being performed in parallel on
the same file by the server. Therefore, a copy offload stateid with
a seqid of zero MUST be considered invalid.
Fixes: ce0887ac96 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4420440c57892779f265108f46f83832a88ca795 ]
The warning message from nfsd terminating normally
can confuse system adminstrators or monitoring software.
Though it's not exactly fair to pin-point a commit where it
originated, the current form in the current place started
to appear in:
Fixes: e096bbc648 ("knfsd: remove special handling for SIGHUP")
Signed-off-by: kazuo ito <kzpn200@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 89ff6005039a878afac87889fee748fa3f957c3a ]
In case of retrying fill_super with skip_recovery,
s_encoding for casefold would not be loaded again even though it's
already been freed because it's not NULL.
Set NULL after free to prevent double freeing when unmount.
Fixes: eca4873ee1 ("f2fs: Use generic casefolding support")
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bf701b765eaa82dd164d65edc5747ec7288bb5c3 ]
nfsiod is currently a concurrency-managed workqueue (CMWQ).
This means that workitems scheduled to nfsiod on a given CPU are queued
behind all other work items queued on any CMWQ on the same CPU. This
can introduce unexpected latency.
Occaionally nfsiod can even cause excessive latency. If the work item
to complete a CLOSE request calls the final iput() on an inode, the
address_space of that inode will be dismantled. This takes time
proportional to the number of in-memory pages, which on a large host
working on large files (e.g.. 5TB), can be a large number of pages
resulting in a noticable number of seconds.
We can avoid these latency problems by switching nfsiod to WQ_UNBOUND.
This causes each concurrent work item to gets a dedicated thread which
can be scheduled to an idle CPU.
There is precedent for this as several other filesystems use WQ_UNBOUND
workqueue for handling various async events.
Signed-off-by: NeilBrown <neilb@suse.de>
Fixes: ada609ee2a ("workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9b82d88d5976e5f2b8015d58913654856576ace5 ]
NLM uses an interval-based rebinding, i.e. it clears the transport's
binding under certain conditions if more than 60 seconds have elapsed
since the connection was last bound.
This rebinding is not necessary for an autobind RPC client over a
connection-oriented protocol like TCP.
It can also cause problems: it is possible for nlm_bind_host() to clear
XPRT_BOUND whilst a connection worker is in the middle of trying to
reconnect, after it had already been checked in xprt_connect().
When the connection worker notices that XPRT_BOUND has been cleared
under it, in xs_tcp_finish_connecting(), that results in:
xs_tcp_setup_socket: connect returned unhandled error -107
Worse, it's possible that the two can get into lockstep, resulting in
the same behaviour repeated indefinitely, with the above error every
300 seconds, without ever recovering, and the connection never being
established. This has been seen in practice, with a large number of NLM
client tasks, following a server restart.
The existing callers of nlm_bind_host & nlm_rebind_host should not need
to force the rebind, for TCP, so restrict the interval-based rebinding
to UDP only.
For TCP, we will still rebind when needed, e.g. on timeout, and connection
error (including closure), since connection-related errors on an existing
connection, ECONNREFUSED when trying to connect, and rpc_check_timeout(),
already unconditionally clear XPRT_BOUND.
To avoid having to add the fix, and explanation, to both nlm_bind_host()
and nlm_rebind_host(), remove the duplicate code from the former, and
have it call the latter.
Drop the dprintk, which adds no value over a trace.
Signed-off-by: Calum Mackay <calum.mackay@oracle.com>
Fixes: 35f5a422ce ("SUNRPC: new interface to force an RPC rebind")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 046e5ccb4198b990190e11fb52fd9cfd264402eb ]
We can fit the device_addr4 opaque data padding in the pages.
Fixes: cf500bac8f ("SUNRPC: Introduce rpc_prepare_reply_pages()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 05ad917561fca39a03338cb21fe9622f998b0f9c ]
Currently, the client will always ask for security_labels if the server
returns that it supports that feature regardless of any LSM modules
(such as Selinux) enforcing security policy. This adds performance
penalty to the READDIR operation.
Client adjusts superblock's support of the security_label based on
the server's support but also current client's configuration of the
LSM modules. Thus, prior to using the default bitmask in READDIR,
this patch checks the server's capabilities and then instructs
READDIR to remove FATTR4_WORD2_SECURITY_LABEL from the bitmask.
v5: fixing silly mistakes of the rushed v4
v4: simplifying logic
v3: changing label's initialization per Ondrej's comment
v2: dropping selinux hook and using the sb cap.
Suggested-by: Ondrej Mosnacek <omosnace@redhat.com>
Suggested-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Fixes: 2b0143b5c9 ("VFS: normal filesystems (and lustre): d_inode() annotations")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 66ab33bf6d4341574f88b511e856a73f6f2a921e ]
This can be triggered for example by adding the "-omand" mount option,
which will be rejected and virtio_fs_fill_super() will return an error.
In such a case the allocations for fuse_conn and fuse_mount will leak due
to s_root not yet being set and so ->put_super() not being called.
Fixes: a62a8ef9d9 ("virtio-fs: add virtiofs filesystem")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3acc4522d89e0a326db69e9d0afaad8cf763a54c ]
When running fault injection test, if we don't stop checkpoint, some stale
NAT entries were flushed which breaks consistency.
Fixes: 86f33603f8 ("f2fs: handle errors of f2fs_get_meta_page_nofail")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit e51d68e76d604c6d5d1eb13ae1d6da7f6c8c0dfc upstream.
When dquot_resume() was last updated, the argument that got passed
to vfs_cleanup_quota_inode was incorrectly set.
If type = -1 and dquot_load_quota_sb() returns a negative value,
then vfs_cleanup_quota_inode() gets called with -1 passed as an
argument, and this leads to an array-index-out-of-bounds bug.
Fix this issue by correctly passing the arguments.
Fixes: ae45f07d47 ("quota: Simplify dquot_resume()")
Link: https://lore.kernel.org/r/20201208194338.7064-1-anant.thazhemadam@gmail.com
Reported-by: syzbot+2643e825238d7aabb37f@syzkaller.appspotmail.com
Tested-by: syzbot+2643e825238d7aabb37f@syzkaller.appspotmail.com
CC: stable@vger.kernel.org
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bfc2b7e8518999003a61f91c1deb5e88ed77b07d upstream.
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.
Fix this bug on f2fs by rejecting no-key dentries in f2fs_add_link().
Note that the weird check for the current task in f2fs_do_add_link()
seems to make this bug difficult to reproduce on f2fs.
Fixes: 9ea97163c6 ("f2fs crypto: add filename encryption for f2fs_add_link")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75d18cd1868c2aee43553723872c35d7908f240f upstream.
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.
Fix this bug on ext4 by rejecting no-key dentries in ext4_add_entry().
Note that the duplicate check in ext4_find_dest_de() sometimes prevented
this bug. However in many cases it didn't, since ext4_find_dest_de()
doesn't examine every dentry.
Fixes: 4461471107 ("ext4 crypto: enable filename encryption")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76786a0f083473de31678bdb259a3d4167cf756d upstream.
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.
Fix this bug on ubifs by rejecting no-key dentries in ubifs_create(),
ubifs_mkdir(), ubifs_mknod(), and ubifs_symlink().
Note that ubifs doesn't actually report the duplicate filenames from
readdir, but rather it seems to replace the original dentry with a new
one (which is still wrong, just a different effect from ext4).
On ubifs, this fixes xfstest generic/595 as well as the new xfstest I
wrote specifically for this bug.
Fixes: f4f61d2cc6 ("ubifs: Implement encrypted filenames")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 159e1de201b6fca10bfec50405a3b53a561096a8 upstream.
It's possible to create a duplicate filename in an encrypted directory
by creating a file concurrently with adding the encryption key.
Specifically, sys_open(O_CREAT) (or sys_mkdir(), sys_mknod(), or
sys_symlink()) can lookup the target filename while the directory's
encryption key hasn't been added yet, resulting in a negative no-key
dentry. The VFS then calls ->create() (or ->mkdir(), ->mknod(), or
->symlink()) because the dentry is negative. Normally, ->create() would
return -ENOKEY due to the directory's key being unavailable. However,
if the key was added between the dentry lookup and ->create(), then the
filesystem will go ahead and try to create the file.
If the target filename happens to already exist as a normal name (not a
no-key name), a duplicate filename may be added to the directory.
In order to fix this, we need to fix the filesystems to prevent
->create(), ->mkdir(), ->mknod(), and ->symlink() on no-key names.
(->rename() and ->link() need it too, but those are already handled
correctly by fscrypt_prepare_rename() and fscrypt_prepare_link().)
In preparation for this, add a helper function fscrypt_is_nokey_name()
that filesystems can use to do this check. Use this helper function for
the existing checks that fs/crypto/ does for rename and link.
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ceb6543e9cf6ed87cc1fbc6f23ca2db903564cd upstream.
There isn't really any valid reason to use __FSCRYPT_MODE_MAX or
FSCRYPT_POLICY_FLAGS_VALID in a userspace program. These constants are
only meant to be used by the kernel internally, and they are defined in
the UAPI header next to the mode numbers and flags only so that kernel
developers don't forget to update them when adding new modes or flags.
In https://lkml.kernel.org/r/20201005074133.1958633-2-satyat@google.com
there was an example of someone wanting to use __FSCRYPT_MODE_MAX in a
user program, and it was wrong because the program would have broken if
__FSCRYPT_MODE_MAX were ever increased. So having this definition
available is harmful. FSCRYPT_POLICY_FLAGS_VALID has the same problem.
So, remove these definitions from the UAPI header. Replace
FSCRYPT_POLICY_FLAGS_VALID with just listing the valid flags explicitly
in the one kernel function that needs it. Move __FSCRYPT_MODE_MAX to
fscrypt_private.h, remove the double underscores (which were only
present to discourage use by userspace), and add a BUILD_BUG_ON() and
comments to (hopefully) ensure it is kept in sync.
Keep the old name FS_POLICY_FLAGS_VALID, since it's been around for
longer and there's a greater chance that removing it would break source
compatibility with some program. Indeed, mtd-utils is using it in
an #ifdef, and removing it would introduce compiler warnings (about
FS_POLICY_FLAGS_PAD_* being redefined) into the mtd-utils build.
However, reduce its value to 0x07 so that it only includes the flags
with old names (the ones present before Linux 5.4), and try to make it
clear that it's now "frozen" and no new flags should be added to it.
Fixes: 2336d0deb2 ("fscrypt: use FSCRYPT_ prefix for uapi constants")
Cc: <stable@vger.kernel.org> # v5.4+
Link: https://lore.kernel.org/r/20201024005132.495952-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5335bfc6eb688344bfcd4b4133c002c0ae0d0719 upstream.
section is dirty, but dirty_secmap may not set
Reported-by: Jia Yang <jiayang5@huawei.com>
Fixes: da52f8ade4 ("f2fs: get the right gc victim section when section has several segments")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A single patch in this pull request to fix a BIO and page reference
leak when writing sequential zone files.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCX9NY0gAKCRDdoc3SxdoY
dga9AQCq+hAOyA1FeCkWwBG0gywIIVhPscNphe1bH576JoaIZAEA86DsGB9BED7H
yaezfh5W1lr63ctxtSVdyo+x5172fgY=
=Te7s
-----END PGP SIGNATURE-----
Merge tag 'zonefs-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs fix from Damien Le Moal:
"A single patch in this pull request to fix a BIO and page reference
leak when writing sequential zone files"
* tag 'zonefs-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: fix page reference and BIO leak
When we try to visit the pagemap of a tagged userspace pointer, we find
that the start_vaddr is not correct because of the tag.
To fix it, we should untag the userspace pointers in pagemap_read().
I tested with 5.10-rc4 and the issue remains.
Explanation from Catalin in [1]:
"Arguably, that's a user-space bug since tagged file offsets were never
supported. In this case it's not even a tag at bit 56 as per the arm64
tagged address ABI but rather down to bit 47. You could say that the
problem is caused by the C library (malloc()) or whoever created the
tagged vaddr and passed it to this function. It's not a kernel
regression as we've never supported it.
Now, pagemap is a special case where the offset is usually not
generated as a classic file offset but rather derived by shifting a
user virtual address. I guess we can make a concession for pagemap
(only) and allow such offset with the tag at bit (56 - PAGE_SHIFT + 3)"
My test code is based on [2]:
A userspace pointer which has been tagged by 0xb4: 0xb400007662f541c8
userspace program:
uint64 OsLayer::VirtualToPhysical(void *vaddr) {
uint64 frame, paddr, pfnmask, pagemask;
int pagesize = sysconf(_SC_PAGESIZE);
off64_t off = ((uintptr_t)vaddr) / pagesize * 8; // off = 0xb400007662f541c8 / pagesize * 8 = 0x5a00003b317aa0
int fd = open(kPagemapPath, O_RDONLY);
...
if (lseek64(fd, off, SEEK_SET) != off || read(fd, &frame, 8) != 8) {
int err = errno;
string errtxt = ErrorString(err);
if (fd >= 0)
close(fd);
return 0;
}
...
}
kernel fs/proc/task_mmu.c:
static ssize_t pagemap_read(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
{
...
src = *ppos;
svpfn = src / PM_ENTRY_BYTES; // svpfn == 0xb400007662f54
start_vaddr = svpfn << PAGE_SHIFT; // start_vaddr == 0xb400007662f54000
end_vaddr = mm->task_size;
/* watch out for wraparound */
// svpfn == 0xb400007662f54
// (mm->task_size >> PAGE) == 0x8000000
if (svpfn > mm->task_size >> PAGE_SHIFT) // the condition is true because of the tag 0xb4
start_vaddr = end_vaddr;
ret = 0;
while (count && (start_vaddr < end_vaddr)) { // we cannot visit correct entry because start_vaddr is set to end_vaddr
int len;
unsigned long end;
...
}
...
}
[1] https://lore.kernel.org/patchwork/patch/1343258/
[2] https://github.com/stressapptest/stressapptest/blob/master/src/os.cc#L158
Link: https://lkml.kernel.org/r/20201204024347.8295-1-miles.chen@mediatek.com
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>
Cc: <stable@vger.kernel.org> [5.4-]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Bugfixes:
- Fix array overflow when flexfiles mirroring is enabled
- Fix rpcrdma_inline_fixup() crash with new LISTXATTRS
- Fix 5 second delay when doing inter-server copy
- Disable READ_PLUS by default
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAl/Sl70ACgkQ18tUv7Cl
QOuGdw/9GMmrLF26brO3q3lSTD9xdljJGQi0q1DEF2amNUL9cPgxB0xZB9lyUPqs
AbpLxsCgfJ+rN3sbOnDdz8Ae/wiWS1PkeJWw4cu0/kIIjyPD2Uu4jWuIwxpFvP1v
lFfBHeEdsNgqS5t9t9cF9u30PjPKt9SHG/64/8GY92zlrKDRUaJUyvW1zUKo4ne4
m3JxZ1rCxNkBYV+odrQIYE9gqFI6OBWpVsUeIXjBWWDZ/+zK9rGiO5fxZ8oe+yT2
cmrR/vOznoPgUGaH380HrCbXcIKwim2mn76t1MH9fScZQc1ocSWckrZRRdOVWlJ+
228ImDPACLFaTqpgi8ZhFmuD3Mwbz4AtPLce52lvZLuMAPVMoFhR8xrm/VpJFSim
pNjwsl/j9qlGSAe5XdGW+X/DzmHPVd6uhfzOFAd7ZErt7rU0gjyJTWzrmayoB3N1
GF5g4LaK1dYYj7SG8d8DrEV2CGyYle2UP4aDm8qJ/ZLQhd10RHeVt4BHfSKwY05l
03Gim7wlH/ercYfgU6wrgmEG2SWjuFJr+j1v//aqcmJVhA7FvTKygGpGyNK16pNL
9dZ81wUHKHzVt5C/ruG+3tARvnobvmB4ngQAQmtu0aXe9c+Kr2XlaXUAusHPK+Me
pVqtmaKBgddUtJwFms8JF6dePiSBRQgpwyvluqNi1D7ntRhS+Ik=
=EXrS
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-5.10-3' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:
"Here are a handful more bugfixes for 5.10.
Unfortunately, we found some problems with the new READ_PLUS operation
that aren't easy to fix. We've decided to disable this codepath
through a Kconfig option for now, but a series of patches going into
5.11 will clean up the code and fix the issues at the same time. This
seemed like the best way to go about it.
Summary:
- Fix array overflow when flexfiles mirroring is enabled
- Fix rpcrdma_inline_fixup() crash with new LISTXATTRS
- Fix 5 second delay when doing inter-server copy
- Disable READ_PLUS by default"
* tag 'nfs-for-5.10-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFS: Disable READ_PLUS by default
NFSv4.2: Fix 5 seconds delay when doing inter server copy
NFS: Fix rpcrdma_inline_fixup() crash with new LISTXATTRS operation
pNFS/flexfiles: Fix array overflow when flexfiles mirroring is enabled
We've been seeing failures with xfstests generic/091 and generic/263
when using READ_PLUS. I've made some progress on these issues, and the
tests fail later on but still don't pass. Let's disable READ_PLUS by
default until we can work out what is going on.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Since commit b4868b44c5 ("NFSv4: Wait for stateid updates after
CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5
seconds delay regardless of the size of the copy. The delay is from
nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential
fails because the seqid in both nfs4_state and nfs4_stateid are 0.
Fix __nfs42_ssc_open to delay setting of NFS_OPEN_STATE in nfs4_state,
until after the call to update_open_stateid, to indicate this is the 1st
open. This fix is part of a 2 patches, the other patch is the fix in the
source server to return the stateid for COPY_NOTIFY request with seqid 1
instead of 0.
Fixes: ce0887ac96 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
By switching to an XFS-backed export, I am able to reproduce the
ibcomp worker crash on my client with xfstests generic/013.
For the failing LISTXATTRS operation, xdr_inline_pages() is called
with page_len=12 and buflen=128.
- When ->send_request() is called, rpcrdma_marshal_req() does not
set up a Reply chunk because buflen is smaller than the inline
threshold. Thus rpcrdma_convert_iovs() does not get invoked at
all and the transport's XDRBUF_SPARSE_PAGES logic is not invoked
on the receive buffer.
- During reply processing, rpcrdma_inline_fixup() tries to copy
received data into rq_rcv_buf->pages because page_len is positive.
But there are no receive pages because rpcrdma_marshal_req() never
allocated them.
The result is that the ibcomp worker faults and dies. Sometimes that
causes a visible crash, and sometimes it results in a transport hang
without other symptoms.
RPC/RDMA's XDRBUF_SPARSE_PAGES support is not entirely correct, and
should eventually be fixed or replaced. However, my preference is
that upper-layer operations should explicitly allocate their receive
buffers (using GFP_KERNEL) when possible, rather than relying on
XDRBUF_SPARSE_PAGES.
Reported-by: Olga kornievskaia <kolga@netapp.com>
Suggested-by: Olga kornievskaia <kolga@netapp.com>
Fixes: c10a75145f ("NFSv4.2: add the extended attribute proc functions.")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Olga kornievskaia <kolga@netapp.com>
Reviewed-by: Frank van der Linden <fllinden@amazon.com>
Tested-by: Olga kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
In zonefs_file_dio_append(), the pages obtained using
bio_iov_iter_get_pages() are not released on completion of the
REQ_OP_APPEND BIO, nor when bio_iov_iter_get_pages() fails.
Furthermore, a call to bio_put() is missing when
bio_iov_iter_get_pages() fails.
Fix these resource leaks by adding BIO resource release code (bio_put()i
and bio_release_pages()) at the end of the function after the BIO
execution and add a jump to this resource cleanup code in case of
bio_iov_iter_get_pages() failure.
While at it, also fix the call to task_io_account_write() to be passed
the correct BIO size instead of bio_iov_iter_get_pages() return value.
Reported-by: Christoph Hellwig <hch@lst.de>
Fixes: 02ef12a663 ("zonefs: use REQ_OP_ZONE_APPEND for sync DIO")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
There's a memory leak in afs_parse_source() whereby multiple source=
parameters overwrite fc->source in the fs_context struct without freeing
the previously recorded source.
Fix this by only permitting a single source parameter and rejecting with
an error all subsequent ones.
This was caught by syzbot with the kernel memory leak detector, showing
something like the following trace:
unreferenced object 0xffff888114375440 (size 32):
comm "repro", pid 5168, jiffies 4294923723 (age 569.948s)
backtrace:
slab_post_alloc_hook+0x42/0x79
__kmalloc_track_caller+0x125/0x16a
kmemdup_nul+0x24/0x3c
vfs_parse_fs_string+0x5a/0xa1
generic_parse_monolithic+0x9d/0xc5
do_new_mount+0x10d/0x15a
do_mount+0x5f/0x8e
__do_sys_mount+0xff/0x127
do_syscall_64+0x2d/0x3a
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 13fcc68370 ("afs: Add fs_context support")
Reported-by: syzbot+86dc6632faaca40133ab@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull seq_file fix from Al Viro:
"This fixes a regression introduced in this cycle wrt iov_iter based
variant for reading a seq_file"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix return values of seq_read_iter()
After io_identity_cow() copies an work.identity it wants to copy creds
to the new just allocated id, not the old one. Otherwise it's
akin to req->work.identity->creds = req->work.identity->creds.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
'format_corename()' will splite 'core_pattern' on spaces when it is in
pipe mode, and take helper_argv[0] as the path to usermode executable.
It works fine in most cases.
However, if there is a space between '|' and '/file/path', such as
'| /usr/lib/systemd/systemd-coredump %P %u %g', then helper_argv[0] will
be parsed as '', and users will get a 'Core dump to | disabled'.
It is not friendly to users, as the pattern above was valid previously.
Fix this by ignoring the spaces between '|' and '/file/path'.
Fixes: 315c69261d ("coredump: split pipe command whitespace before expanding template")
Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Wise <pabs3@bonedaddy.net>
Cc: Jakub Wilk <jwilk@jwilk.net> [https://bugs.debian.org/924398]
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/5fb62870.1c69fb81.8ef5d.af76@mx.google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/L/o4QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpifFD/9tqDSwqcRIKRhSHqsD6OJRagZ0UB37cPvH
ZXmtHblK9BkeXc83OPy+NmlcPq1mpDmbMgy0rls8UOGnZs0RINN9En5hHGOeRVD4
Ma6Abh9ypzXhNKqY4HO+lcsN+elGqoaYNSnu/Nbcib7Z5yYzUSNDnpIeTQy7Onbw
dTGNTJalZYqrQ+6ytglfVeQuXSSKYPJfjFdcYd1It6Ks0aSKDgcQxJZgPeuF6xE+
mxEWADs+zpZG8UqZYyLnS6+9dxuJwhx+Cd47Ntky0VdaWmxCogQiI3X/Q3vspnVq
gz9VqnUIA3Qgabm0UJkyjCi87W2KNDf4r5OOaQw4dLc0kDhIVS6ii67iGVs9Dl7F
xCZMCE4pkEvs0pPYCyITUUNPh/fBONHgDHAWxxhztkE8ldvsAUmqNyV/LWMMvtSl
SUQli6OZMVyyUX9sroHOWKhi8rQYt1UZa0tm+UZE/+YrZLfp6qBCZHFC6hdNfQoS
k7PiuxecsH4i/OweJMPtggChJ7VWHZUorTljfHnl14oDuFuVD22AJvc3q+OSStI1
Ua5N27b2/pWXnf9P3dPGzikprOU/vXPQnlifmBxXNj+/TD29+GPUHhV2+kibc5/F
gMbrhX3TX4tstHIGxUCoeB8XepiBqOmn1eLkWNQQGi8RbqCKfE4j/5o07HgxST2P
Ns2GaEKb4A==
=CPrM
-----END PGP SIGNATURE-----
Merge tag 'io_uring-5.10-2020-12-05' of git://git.kernel.dk/linux-block
Pull io_uring fix from Jens Axboe:
"Just a small fix this time, for an issue with 32-bit compat apps and
buffer selection with recvmsg"
* tag 'io_uring-5.10-2020-12-05' of git://git.kernel.dk/linux-block:
io_uring: fix recvmsg setup with compat buf-select
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAl/K3U0ACgkQiiy9cAdy
T1Gewwv7Bjtq+c4tvc0LZj04oCfgsmcdu9s82Xff+e81lJlPyK+04zrp1kPiVQ8O
iV5+bQr96BPgA96f2Et/OdnO4ei4T1vp9yTfXkTHQCWHQhd57N98TNudtJ+JNKCv
ziaY+X6BFBBFnbCiPQxLZjvXsXenS3PFTq+wJB6ot4Qsj7z3kk88nXrvDDP7kiFL
3WqZzUWKPJaRHjJQIrUch9M77P1cGk1duKT8Ydj/XKQujocWFWYXEWVYMGKcFFpK
cXBEbN6vBUTogn6fA8lQKy3KTarRGT8qpl8c91UCZ61sH8t6RlHQ53rtPrLPG/9k
KH8xyN+Lu4+w9LDni5HqsIHAqi0+dnPvKRZWIkxMI/ZbZcV4Mwoi4kMZ/H85psp6
wEDgA6czaQSS9c3jZ0lvNgOHvpKkz1lz8sXwUuTYVehMaQrTcxNdEgh8TdlfZ9P6
St404ZXSsthXChac2N22G9JX4XcM/wocSWq/sfpoUvr9eKJMjFMdB7rP7ORPQofb
S/u8v5pD
=3b+C
-----END PGP SIGNATURE-----
Merge tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Three smb3 fixes (two for stable) fixing
- a null pointer issue in a DFS error path
- a problem with excessive padding when mounted with "idsfromsid"
causing owner fields to get corrupted
- a more recent problem with compounded reparse point query found in
testing to the Linux kernel server"
* tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
cifs: refactor create_sd_buf() and and avoid corrupting the buffer
cifs: add NULL check for ses->tcon_ipc
smb3: set COMPOUND_FID to FileID field of subsequent compound request
When mounting with "idsfromsid" mount option, Azure
corrupted the owner SIDs due to excessive padding
caused by placing the owner fields at the end of the
security descriptor on create. Placing owners at the
front of the security descriptor (rather than the end)
is also safer, as the number of ACEs (that follow it)
are variable.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Suggested-by: Rohith Surabattula <rohiths@microsoft.com>
CC: Stable <stable@vger.kernel.org> # v5.8
Signed-off-by: Steve French <stfrench@microsoft.com>
In some scenarios (DFS and BAD_NETWORK_NAME) set_root_set() can be
called with a NULL ses->tcon_ipc.
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
For an operation compounded with an SMB2 CREATE request, client must set
COMPOUND_FID(0xFFFFFFFFFFFFFFFF) to FileID field of smb2 ioctl.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Fixes: 2e4564b31b ("smb3: add support stat of WSL reparse points for special file types")
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAl/H/pUUHGFncnVlbmJh
QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTqxhRAAuTKHz1jMYHNAWm1NVVRiGVENhjlV
g5kRcRmf8m/VAWmI+HM0J34CbHpWc1IgJnjID/vZ/VB8QQPmoDoEFrD/ple9V6A4
xQJzQcG9uvLir5h43R2SWR6OSSijtEDjvpQpRQSK3B2e1otRV+W7mLjPBkg410rY
l9Cj01QhCwCv1dVO7VknSEB1Jpkhe0Vvr1z4DropeU5JEBSADTl5deTkFWbinDo1
9nygDw01AdWXhR1FNJTAy8XJYlcM3dx6xDuHD9xJH9HciWsocOr+EqtPfJEEsCSf
X6CrWY3mYZBr1kJFcvp2HD5rKYhw0xC0gZiHEnvmZx5XKTKoDKq2+2s2wQOwWk3B
QIvEfLoAzV6mKBHRC8mCDw2bImj8FDrvBkmmJnlyL6nJDrriBXimFAJPX7MryKTJ
jdvm0JFRyY0ja3LxLxBVwiKnKiw6yY6fPjpMWzi042P/PhshRSvazxgUf+pc69CB
Czo3W9MnNGWzP73ALIsuVxG81RW48CvAJdXOmoWIqjEeqnC0YZ3vgwInQcqBzton
fCI38XN6CG1UbwaVxoZGoPS5FIfBbrTL5v62e3U8JIv3J4ol5fWCEruUhImsByGI
w3UkiuNqWjOab/774m9ZS0eBXCEn6LAizvZSgf3pt+8VRoN20RxF3k1Jhvm7g7I3
IbOHK9kddrycH9Q=
=EHzo
-----END PGP SIGNATURE-----
Merge tag 'gfs2-v5.10-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 fixes from Andreas Gruenbacher:
"Various gfs2 fixes"
* tag 'gfs2-v5.10-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Fix deadlock between gfs2_{create_inode,inode_lookup} and delete_work_func
gfs2: Upgrade shared glocks for atime updates
gfs2: Don't freeze the file system during unmount
gfs2: check for empty rgrp tree in gfs2_ri_update
gfs2: set lockdep subclass for iopen glocks
gfs2: Fix deadlock dumping resource group glocks
The default splice operations got removed recently, add it back to 9p
with iter_file_splice_write like many other filesystems do.
Link: http://lkml.kernel.org/r/1606837496-21717-1-git-send-email-asmadeus@codewreck.org
Fixes: 36e2c7421f ("fs: don't allow splice read/write without explicit ops")
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The v9fs file operations were missing the splice_read operations, which
breaks sendfile() of files on such a filesystem. I discovered this while
trying to load an eBPF program using iproute2 inside a 'virtme' environment
which uses 9pfs for the virtual file system. iproute2 relies on sendfile()
with an AF_ALG socket to hash files, which was erroring out in the virtual
environment.
Since generic_file_splice_read() seems to just implement splice_read in
terms of the read_iter operation, I simply added the generic implementation
to the file operations, which fixed the error I was seeing. A quick grep
indicates that this is what most other file systems do as well.
Link: http://lkml.kernel.org/r/20201201135409.55510-1-toke@redhat.com
Fixes: 36e2c7421f ("fs: don't allow splice read/write without explicit ops")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
In gfs2_create_inode and gfs2_inode_lookup, make sure to cancel any pending
delete work before taking the inode glock. Otherwise, gfs2_cancel_delete_work
may block waiting for delete_work_func to complete, and delete_work_func may
block trying to acquire the inode glock in gfs2_inode_lookup.
Reported-by: Alexander Aring <aahringo@redhat.com>
Fixes: a0e3cc65fa ("gfs2: Turn gl_delete into a delayed work")
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
This patch fixes a potential use-after-free bug in
cifs_echo_request().
For instance,
thread 1
--------
cifs_demultiplex_thread()
clean_demultiplex_info()
kfree(server)
thread 2 (workqueue)
--------
apic_timer_interrupt()
smp_apic_timer_interrupt()
irq_exit()
__do_softirq()
run_timer_softirq()
call_timer_fn()
cifs_echo_request() <- use-after-free in server ptr
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
A customer has reported that several files in their multi-threaded app
were left with size of 0 because most of the read(2) calls returned
-EINTR and they assumed no bytes were read. Obviously, they could
have fixed it by simply retrying on -EINTR.
We noticed that most of the -EINTR on read(2) were due to real-time
signals sent by glibc to process wide credential changes (SIGRT_1),
and its signal handler had been established with SA_RESTART, in which
case those calls could have been automatically restarted by the
kernel.
Let the kernel decide to whether or not restart the syscalls when
there is a signal pending in __smb_send_rqst() by returning
-ERESTARTSYS. If it can't, it will return -EINTR anyway.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
__io_compat_recvmsg_copy_hdr() with REQ_F_BUFFER_SELECT reads out iov
len but never assigns it to iov/fast_iov, leaving sr->len with garbage.
Hopefully, following io_buffer_select() truncates it to the selected
buffer size, but the value is still may be under what was specified.
Cc: <stable@vger.kernel.org> # 5.7
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If the flexfiles mirroring is enabled, then the read code expects to be
able to set pgio->pg_mirror_idx to point to the data server that is
being used for this particular read. However it does not change the
pg_mirror_count because we only need to send a single read.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>