We had a report of a crash while stress testing the NFS client:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
IP: [<ffffffff8127b698>] locks_get_lock_context+0x8/0x90
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_filter ebtable_broute bridge stp llc ebtables ip6table_security ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw ip6table_filter ip6_tables iptable_security iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_raw coretemp crct10dif_pclmul ppdev crc32_pclmul crc32c_intel ghash_clmulni_intel vmw_balloon serio_raw vmw_vmci i2c_piix4 shpchp parport_pc acpi_cpufreq parport nfsd auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi scsi_transport_spi mptscsih mptbase e1000 ata_generic pata_acpi
CPU: 1 PID: 399 Comm: kworker/1:1H Not tainted 4.1.0-0.rc1.git0.1.fc23.x86_64 #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
Workqueue: rpciod rpc_async_schedule [sunrpc]
task: ffff880036aea7c0 ti: ffff8800791f4000 task.ti: ffff8800791f4000
RIP: 0010:[<ffffffff8127b698>] [<ffffffff8127b698>] locks_get_lock_context+0x8/0x90
RSP: 0018:ffff8800791f7c00 EFLAGS: 00010293
RAX: ffff8800791f7c40 RBX: ffff88001f2ad8c0 RCX: ffffe8ffffc80305
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff8800791f7c88 R08: ffff88007fc971d8 R09: 279656d600000000
R10: 0000034a01000000 R11: 279656d600000000 R12: ffff88001f2ad918
R13: ffff88001f2ad8c0 R14: 0000000000000000 R15: 0000000100e73040
FS: 0000000000000000(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000150 CR3: 0000000001c0b000 CR4: 00000000000407e0
Stack:
ffffffff8127c5b0 ffff8800791f7c18 ffffffffa0171e29 ffff8800791f7c58
ffffffffa0171ef8 ffff8800791f7c78 0000000000000246 ffff88001ea0ba00
ffff8800791f7c40 ffff8800791f7c40 00000000ff5d86a3 ffff8800791f7ca8
Call Trace:
[<ffffffff8127c5b0>] ? __posix_lock_file+0x40/0x760
[<ffffffffa0171e29>] ? rpc_make_runnable+0x99/0xa0 [sunrpc]
[<ffffffffa0171ef8>] ? rpc_wake_up_task_queue_locked.part.35+0xc8/0x250 [sunrpc]
[<ffffffff8127cd3a>] posix_lock_file_wait+0x4a/0x120
[<ffffffffa03e4f12>] ? nfs41_wake_and_assign_slot+0x32/0x40 [nfsv4]
[<ffffffffa03bf108>] ? nfs41_sequence_done+0xd8/0x2d0 [nfsv4]
[<ffffffffa03c116d>] do_vfs_lock+0x2d/0x30 [nfsv4]
[<ffffffffa03c251d>] nfs4_lock_done+0x1ad/0x210 [nfsv4]
[<ffffffffa0171a30>] ? __rpc_sleep_on_priority+0x390/0x390 [sunrpc]
[<ffffffffa0171a30>] ? __rpc_sleep_on_priority+0x390/0x390 [sunrpc]
[<ffffffffa0171a5c>] rpc_exit_task+0x2c/0xa0 [sunrpc]
[<ffffffffa0167450>] ? call_refreshresult+0x150/0x150 [sunrpc]
[<ffffffffa0172640>] __rpc_execute+0x90/0x460 [sunrpc]
[<ffffffffa0172a25>] rpc_async_schedule+0x15/0x20 [sunrpc]
[<ffffffff810baa1b>] process_one_work+0x1bb/0x410
[<ffffffff810bacc3>] worker_thread+0x53/0x480
[<ffffffff810bac70>] ? process_one_work+0x410/0x410
[<ffffffff810bac70>] ? process_one_work+0x410/0x410
[<ffffffff810c0b38>] kthread+0xd8/0xf0
[<ffffffff810c0a60>] ? kthread_worker_fn+0x180/0x180
[<ffffffff817a1aa2>] ret_from_fork+0x42/0x70
[<ffffffff810c0a60>] ? kthread_worker_fn+0x180/0x180
Jean says:
"Running locktests with a large number of iterations resulted in a
client crash. The test run took a while and hasn't finished after close
to 2 hours. The crash happened right after I gave up and killed the test
(after 107m) with Ctrl+C."
The crash happened because a NULL inode pointer got passed into
locks_get_lock_context. The call chain indicates that file_inode(filp)
returned NULL, which means that f_inode was NULL. Since that's zeroed
out in __fput, that suggests that this filp pointer outlived the last
reference.
Looking at the code, that seems possible. We copy the struct file_lock
that's passed in, but if the task is signalled at an inopportune time we
can end up trying to use that file_lock in rpciod context after the process
that requested it has already returned (and possibly put its filp
reference).
Fix this by taking an extra reference to the filp when we allocate the
lock info, and put it in nfs4_lock_release.
Reported-by: Jean Spector <jean@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Highlights include:
Stable patches:
- Fix a regression in /proc/self/mountstats
- Fix the pNFS flexfiles O_DIRECT support
- Fix high load average due to callback thread sleeping
Bugfixes:
- Various patches to fix the pNFS layoutcommit support
- Do not cache pNFS deviceids unless server notifications are enabled
- Fix a SUNRPC transport reconnection regression
- make debugfs file creation failure non-fatal in SUNRPC
- Another fix for circular directory warnings on NFSv4 "junctioned" mountpoints
- Fix locking around NFSv4.2 fallocate() support
- Truncating NFSv4 file opens should also sync O_DIRECT writes
- Prevent infinite loop in rpcrdma_ep_create()
Features:
- Various improvements to the RDMA transport code's handling of memory
registration
- Various code cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVOmT6AAoJEGcL54qWCgDyrhYQAMPKXB55jrdOR/7UVSF/xPML
7OjMGHvBnTn/y0pNIyLyS1PjTZZsD/WZjoW9EFGpTv727qQNVoFxFRLNUcgi3NoL
1YledCkLf7Q32aqod93SRRFPc9hzBoKhOZpOzBuWaAviyAB3KLi70DWAq9qRReYM
prXUQQjpW5FLU+B2ifaVc2RCnu/rZ2c02YdR2XdtkBaAJxuhB2vR8IY1evwjCv3R
5zyLDd9zSDDoArdpUzM97cxZPcYRSqbOwgTKvaaRnDDq/mKbKMZaqmEfjblwzNFt
b43FbveJzZ3hlPADIpmaiMHjRTbxWjIKc9K1sOF2FPfcuPe2yM3DMAxDegUkEveS
7fkbv/qRZ30NqfchGanX/pmBlLOcdI76qe/bwhN19wCnw48O1eeHi1HK8rWGhU+E
qcrRZ3ZS2ufP/MVBuhauy0qU9Q4wcEtm7NGGP1231ZtmfjHKyBa4pLirNfG1AlJt
dK7tBrknVx+WVm/UddJp/fEsxbP0+fki6TwzioHUSWcz8rDVYF6PFT/QPM54SX2h
0oqwvu6d/uShpkVRm+fbje8FHmUxKdgqDsCYX2fNjWskh1oXSPsItvjqmTmTlE0i
EBmBwVwI0uB1ZQ3PrJLadhRcO3ZJmLQ5gNj456dstvWy6UQds1xyIQ/DgvmlzxWO
E9t0l18xHGRwbndsDa8f
=j5dP
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Another set of mainly bugfixes and a couple of cleanups. No new
functionality in this round.
Highlights include:
Stable patches:
- Fix a regression in /proc/self/mountstats
- Fix the pNFS flexfiles O_DIRECT support
- Fix high load average due to callback thread sleeping
Bugfixes:
- Various patches to fix the pNFS layoutcommit support
- Do not cache pNFS deviceids unless server notifications are enabled
- Fix a SUNRPC transport reconnection regression
- make debugfs file creation failure non-fatal in SUNRPC
- Another fix for circular directory warnings on NFSv4 "junctioned"
mountpoints
- Fix locking around NFSv4.2 fallocate() support
- Truncating NFSv4 file opens should also sync O_DIRECT writes
- Prevent infinite loop in rpcrdma_ep_create()
Features:
- Various improvements to the RDMA transport code's handling of
memory registration
- Various code cleanups"
* tag 'nfs-for-4.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (55 commits)
fs/nfs: fix new compiler warning about boolean in switch
nfs: Remove unneeded casts in nfs
NFS: Don't attempt to decode missing directory entries
Revert "nfs: replace nfs_add_stats with nfs_inc_stats when add one"
NFS: Rename idmap.c to nfs4idmap.c
NFS: Move nfs_idmap.h into fs/nfs/
NFS: Remove CONFIG_NFS_V4 checks from nfs_idmap.h
NFS: Add a stub for GETDEVICELIST
nfs: remove WARN_ON_ONCE from nfs_direct_good_bytes
nfs: fix DIO good bytes calculation
nfs: Fetch MOUNTED_ON_FILEID when updating an inode
sunrpc: make debugfs file creation failure non-fatal
nfs: fix high load average due to callback thread sleeping
NFS: Reduce time spent holding the i_mutex during fallocate()
NFS: Don't zap caches on fallocate()
xprtrdma: Make rpcrdma_{un}map_one() into inline functions
xprtrdma: Handle non-SEND completions via a callout
xprtrdma: Add "open" memreg op
xprtrdma: Add "destroy MRs" memreg op
xprtrdma: Add "reset MRs" memreg op
...
The brand new GCC 5.1.0 warns by default on using a boolean in the
switch condition. This results in the following warning:
fs/nfs/nfs4proc.c: In function 'nfs4_proc_get_rootfh':
fs/nfs/nfs4proc.c:3100:10: warning: switch condition has boolean value [-Wswitch-bool]
switch (auth_probe) {
^
This code was obviously using switch to make use of the fall-through
semantics (without the usual comment, though).
Rewrite that code using if statements to avoid the warning and make
the code a bit more readable on the way.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
This file is only used internally to the NFS v4 module, so it doesn't
need to be in the global include path. I also renamed it from
nfs_idmap.h to nfs4idmap.h to emphasize that it's an NFSv4-only include
file.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2ef47eb1 (NFS: Fix use of nfs_attr_use_mounted_on_fileid()) was a good
start to fixing a circular directory structure warning for NFS v4
"junctioned" mountpoints. Unfortunately, further testing continued to
generate this error.
My server is configured like this:
anna@nfsd ~ % df
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 9.1G 2.0G 6.5G 24% /
/dev/vdc1 1014M 33M 982M 4% /exports
/dev/vdc2 1014M 33M 982M 4% /exports/vol1
/dev/vdc3 1014M 33M 982M 4% /exports/vol1/vol2
anna@nfsd ~ % cat /etc/exports
/exports/ *(rw,async,no_subtree_check,no_root_squash)
/exports/vol1/ *(rw,async,no_subtree_check,no_root_squash)
/exports/vol1/vol2 *(rw,async,no_subtree_check,no_root_squash)
I've been running chown across the entire mountpoint twice in a row to
hit this problem. The first run succeeds, but the second one fails with
the circular directory warning along with:
anna@client ~ % dmesg
[Apr 3 14:28] NFS: server 192.168.100.204 error: fileid changed
fsid 0:39: expected fileid 0x100080, got 0x80
WHere 0x80 is the mountpoint's fileid and 0x100080 is the mounted-on
fileid.
This patch fixes the issue by requesting an updated mounted-on fileid
from the server during nfs_update_inode(), and then checking that the
fileid stored in the nfs_inode matches either the fileid or mounted-on
fileid returned by the server.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
that's the bulk of filesystem drivers dealing with inodes of their own
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The spec says that once all layouts that reference a given deviceid
have been returned, then we are only allowed to continue to cache
the deviceid if the metadata server supports notifications.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
We are only allowed to cache deviceinfo if the server supports notifications
and actually promises to call us back when changes occur. Right now, we
request those notifications, but then we don't check the server's reply.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
If the call to exchange-id returns with the EXCHGID4_FLAG_CONFIRMED_R flag
set, then that means our lease was established by a previous mount instance.
Ensure that we detect this situation, and that we clear the state held by
that mount.
Reported-by: Jorge Mora <Jorge.Mora@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
After 566fcec60 the client uses the "current stateid" from the
nfs4_state structure to close a file. This could potentially contain a
delegation stateid, which is disallowed by the protocol and causes
servers to return NFS4ERR_BAD_STATEID. This patch restores the
(correct) behavior of sending the open stateid to close a file.
Reported-by: Olga Kornievskaia <kolga@netapp.com>
Fixes: 566fcec60 (NFSv4: Fix an atomicity problem in CLOSE)
Signed-off-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Ensure that we don't regress the changes that were made to the
directory.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Ensure that other operations that race with our write RPC calls
cannot revert the file size updates that were made on the server.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Ensure that other operations which raced with our setattr RPC call
cannot revert the file attribute changes that were made on the server.
To do so, we artificially bump the attribute generation counter on
the inode so that all calls to nfs_fattr_init() that precede ours
will be dropped.
The motivation for the patch came from Chuck Lever's reports of readaheads
racing with truncate operations and causing the file size to be reverted.
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
The share access mode is now specified as an argument in the nfs4_opendata,
and so nfs4_open_recover_helper() needs to call nfs4_map_atomic_open_share()
in order to set it.
Fixes: 6ae373394c ("NFSv4.1: Ask for no delegation on OPEN if using O_DIRECT")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
If we're sending an asynchronous layoutreturn, then we need to ensure
that the inode and the super block remain pinned.
Cc: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
If we're sending an asynchronous layoutcommit, then we need to ensure
that the inode and the super block remain pinned.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
If we have to do a return-on-close in the delegreturn code, then
we must ensure that the inode and super block remain referenced.
Cc: Peng Tao <tao.peng@primarydata.com>
Cc: stable@vger.kernel.org # 3.17.x
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
If we're using NFSv4.1, then we have the ability to let the server know
whether or not we believe that returning a delegation as part of our OPEN
request would be useful.
The feature needs to be used with care, since the client sending the request
doesn't necessarily know how other clients are using that file, and how
they may be affected by the delegation.
For this reason, our initial use of the feature will be to let the server
know when the client believes that handing out a delegation would not be
useful.
The first application for this function is when opening the file using
O_DIRECT.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* flexfiles: (53 commits)
pnfs: lookup new lseg at lseg boundary
nfs41: .init_read and .init_write can be called with valid pg_lseg
pnfs: Update documentation on the Layout Drivers
pnfs/flexfiles: Add the FlexFile Layout Driver
nfs: count DIO good bytes correctly with mirroring
nfs41: wait for LAYOUTRETURN before retrying LAYOUTGET
nfs: add a helper to set NFS_ODIRECT_RESCHED_WRITES to direct writes
nfs41: add NFS_LAYOUT_RETRY_LAYOUTGET to layout header flags
nfs/flexfiles: send layoutreturn before freeing lseg
nfs41: introduce NFS_LAYOUT_RETURN_BEFORE_CLOSE
nfs41: allow async version layoutreturn
nfs41: add range to layoutreturn args
pnfs: allow LD to ask to resend read through pnfs
nfs: add nfs_pgio_current_mirror helper
nfs: only reset desc->pg_mirror_idx when mirroring is supported
nfs41: add a debug warning if we destroy an unempty layout
pnfs: fail comparison when bucket verifier not set
nfs: mirroring support for direct io
nfs: add mirroring support to pgio layer
pnfs: pass ds_commit_idx through the commit path
...
Conflicts:
fs/nfs/pnfs.c
fs/nfs/pnfs.h
The flexfile layout is a new layout that extends the
file layout. It is currently being drafted as a specification at
https://datatracker.ietf.org/doc/draft-ietf-nfsv4-layout-types/
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Tom Haynes <loghyr@primarydata.com>
Signed-off-by: Tao Peng <bergwolf@primarydata.com>
When it is set, generic pnfs would try to send layoutreturn right
before last close/delegation_return regard less NFS_LAYOUT_ROC is
set or not. LD can then make sure layoutreturn is always sent
rather than being omitted.
The difference against NFS_LAYOUT_RETURN is that
NFS_LAYOUT_RETURN_BEFORE_CLOSE does not block usage of the layout so
LD can set it and expect generic layer to try pnfs path at the
same time.
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Tom Haynes <loghyr@primarydata.com>
So that pnfs path is not disabled for ever.
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
And if we are to return the same type of layouts, don't bother
sending more layoutgets.
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
flexclient needs this as there is no nfs_server to DS connection.
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
Optimise the layout return on close code by ensuring that
1) Add a check for whether we hold a layout before taking any spinlocks
2) Only take the spin lock once
3) Use nfs_state->state to speed up open file checks
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Note, however, that we still serialise on the open stateid if the lock
stateid is unconfirmed. Hopefully that will not prove too much of a
burden for first time locks; it should leave the ability to parallelise
OPENs unchanged, since they no longer call the serialisation primitives.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Ensure that we test the lock stateid remained unchanged while we were
updating the VFS tracking of the byte range lock. Have the process
replay the lock to the server if we detect that was not the case.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
This patch ensures that the server cannot reorder our LOCK/LOCKU
requests if they are sent in parallel on the wire.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
The original text in RFC3530 was terribly confusing since it conflated
lockowners and lock stateids. RFC3530bis clarifies that you must use
open_to_lock_owner when there is no lock state for that file+lockowner
combination.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
When we update the lock stateid, we really do need to ensure that this is
done under the state->state_lock, and that we are indeed only updating
confirmed locks with a newer version of the same stateid.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Remove the serialisation of OPEN/OPEN_DOWNGRADE and CLOSE calls for the
case of NFSv4.1 and newer.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
When we relax the sequencing on the NFSv4.1 OPEN/CLOSE code, we will want
to use the value NULL to indicate that no sequencing is needed.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
If an OPEN RPC call races with a CLOSE or OPEN_DOWNGRADE so that it
updates the nfs_state structure before the CLOSE/OPEN_DOWNGRADE has
a chance to do so, then we know that the state->flags need to be
recalculated from scratch.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
If we are to remove the serialisation of OPEN/CLOSE, then we need to
ensure that the stateid sent as part of a CLOSE operation does not
change after we test the state in nfs4_close_prepare.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Remove an incorrect check for NFS_DELEGATION_NEED_RECLAIM in
can_open_delegated(). We are allowed to cache opens even in
a situation where we're doing reboot recovery.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Ensure that we cache the NFSv4/v4.1 client owner_id so that we can
verify it when we're doing trunking detection.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
This patch adds support for using the NFS v4.2 operation DEALLOCATE to
punch holes in a file.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
This patch adds support for using the NFS v4.2 operation ALLOCATE to
preallocate data in a file.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
nfs4_layoutget_release() drops layout hdr refcnt. Grab the refcnt
early so that it is safe to call .release in case nfs4_alloc_pages
fails.
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Fixes: a47970ff78 ("NFSv4.1: Hold reference to layout hdr in layoutget")
Cc: stable@vger.kernel.org # 3.9+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
NFS4ERR_ACCESS has number 13 and thus is matched and returned
immediately at the beginning of nfs4_map_errors() and there's no point
in checking it later.
Coverity-id: 733891
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Any attempt to call nfs_remove_bad_delegation() while a delegation is being
returned is currently a no-op. This means that we can end up looping
forever in nfs_end_delegation_return() if something causes the delegation
to be revoked.
This patch adds a mechanism whereby the state recovery code can communicate
to the delegation return code that the delegation is no longer valid and
that it should not be used when reclaiming state.
It also changes the return value for nfs4_handle_delegation_recall_error()
to ensure that nfs_end_delegation_return() does not reattempt the lock
reclaim before state recovery is done.
http://lkml.kernel.org/r/CAN-5tyHwG=Cn2Q9KsHWadewjpTTy_K26ee+UnSvHvG4192p-Xw@mail.gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
This patch removes the assumption made previously, that we only need to
check the delegation stateid when it matches the stateid on a cached
open.
If we believe that we hold a delegation for this file, then we must assume
that its stateid may have been revoked or expired too. If we don't test it
then our state recovery process may end up caching open/lock state in a
situation where it should not.
We therefore rename the function nfs41_clear_delegation_stateid as
nfs41_check_delegation_stateid, and change it to always run through the
delegation stateid test and recovery process as outlined in RFC5661.
http://lkml.kernel.org/r/CAN-5tyHwG=Cn2Q9KsHWadewjpTTy_K26ee+UnSvHvG4192p-Xw@mail.gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>