TEST_STATEID and FREE_STATEID can use the machine credential.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
SECINFO and SECINFO_NONAME can use the machine credential.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
CLOSE and LOCKU can use the machine credential.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This is a minimal client side implementation of SP4_MACH_CRED. It will
attempt to negotiate SP4_MACH_CRED iff the EXCHANGE_ID is using
krb5i or krb5p auth. SP4_MACH_CRED will be used if the server supports the
minimal operations:
BIND_CONN_TO_SESSION
EXCHANGE_ID
CREATE_SESSION
DESTROY_SESSION
DESTROY_CLIENTID
This patch only includes the EXCHANGE_ID negotiation code because
the client will already use the machine cred for these operations.
If the server doesn't support SP4_MACH_CRED or doesn't support the minimal
operations, the exchange id will be resent with SP4_NONE.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Rename the new 'recover_locks' kernel parameter to 'recover_lost_locks'
and change the default to 'false'. Document why in
Documentation/kernel-parameters.txt
Move the 'recover_lost_locks' kernel parameter to fs/nfs/super.c to
make it easy to backport to kernels prior to 3.6.x, which don't have
a separate NFSv4 module.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When an NFSv4 client loses contact with the server it can lose any
locks that it holds.
Currently when it reconnects to the server it simply tries to reclaim
those locks. This might succeed even though some other client has
held and released a lock in the mean time. So the first client might
think the file is unchanged, but it isn't. This isn't good.
If, when recovery happens, the locks cannot be claimed because some
other client still holds the lock, then we get a message in the kernel
logs, but the client can still write. So two clients can both think
they have a lock and can both write at the same time. This is equally
not good.
There was a patch a while ago
http://comments.gmane.org/gmane.linux.nfs/41917
which tried to address some of this, but it didn't seem to go
anywhere. That patch would also send a signal to the process. That
might be useful but for now this patch just causes writes to fail.
For NFSv4 (unlike v2/v3) there is a strong link between the lock and
the write request so we can fairly easily fail any IO of the lock is
gone. While some applications might not expect this, it is still
safer than allowing the write to succeed.
Because this is a fairly big change in behaviour a module parameter,
"recover_locks", is introduced which defaults to true (the current
behaviour) but can be set to "false" to tell the client not to try to
recover things that were lost.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Ensure OPEN_CONFIRM is not emitted while the transport is plugged.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Ensure RELEASE_LOCKOWNER is not emitted while the transport is
plugged.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When CONFIG_NFS_V4_1 is disabled, the calls to nfs4_setup_sequence()
and nfs4_sequence_done() are compiled out for the DELEGRETURN
operation. To allow NFSv4.0 transport blocking to work for
DELEGRETURN, these call sites have to be present all the time.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Plumb in a mechanism for plugging an NFSv4.0 mount, using the
same infrastructure as NFSv4.1 sessions.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Anchor an nfs4_slot_table in the nfs_client for use with NFSv4.0
transport blocking. It is initialized only for NFSv4.0 nfs_client's.
Introduce appropriate minor version ops to handle nfs_client
initialization and shutdown requirements that differ for each minor
version.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Refactor nfs4_call_sync_sequence() so it is used for NFSv4.0 now.
The RPC callouts will house transport blocking logic similar to
NFSv4.1 sessions.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NFSv4.0 will have need for this functionality when I add the ability
to block NFSv4.0 traffic before migration recovery.
I'm not really clear on why nfs4_set_sequence_privileged() gets a
generic name, but nfs41_init_sequence() gets a minor
version-specific name.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: Both the NFSv4.0 and NFSv4.1 version of
nfs4_setup_sequence() are used only in fs/nfs/nfs4proc.c. No need
to keep global header declarations for either version.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: rename nfs41_call_sync_data for use as a data structure
common to all NFSv4 minor versions.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up, since slot and sequence numbers are all unsigned anyway.
Among other things, squelch compiler warnings:
linux/fs/nfs/nfs4proc.c: In function ‘nfs4_setup_sequence’:
linux/fs/nfs/nfs4proc.c:703:2: warning: signed and unsigned type in
conditional expression [-Wsign-compare]
and
linux/fs/nfs/nfs4session.c: In function ‘nfs4_alloc_slot’:
linux/fs/nfs/nfs4session.c:151:31: warning: signed and unsigned type in
conditional expression [-Wsign-compare]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Commit 5ec16a8500 introduced a regression
that causes SECINFO to fail without actualy sending an RPC if:
1) the nfs_client's rpc_client was using KRB5i/p (now tried by default)
2) the current user doesn't have valid kerberos credentials
This situation is quite common - as of now a sec=sys mount would use
krb5i for the nfs_client's rpc_client and a user would hardly be faulted
for not having run kinit.
The solution is to use the machine cred when trying to use an integrity
protected auth flavor for SECINFO.
Older servers may not support using the machine cred or an integrity
protected auth flavor for SECINFO in every circumstance, so we fall back
to using the user's cred and the filesystem's auth flavor in this case.
We run into another problem when running against linux nfs servers -
they return NFS4ERR_WRONGSEC when using integrity auth flavor (unless the
mount is also that flavor) even though that is not a valid error for
SECINFO*. Even though it's against spec, handle WRONGSEC errors on SECINFO
by falling back to using the user cred and the filesystem's auth flavor.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The clnt->cl_principal is being used exclusively to store the service
target name for RPCSEC_GSS/krb5 callbacks. Replace it with something that
is stored only in the RPCSEC_GSS-specific code.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add tracepoints to nfs41_setup_sequence and nfs41_sequence_done
to track session and slot table state changes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Set up tracepoints to track read, write and commit, as well as
pNFS reads and writes and commits to the data server.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Set up tracepoints to track when delegations are set, reclaimed,
returned by the client, or recalled by the server.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Set up basic tracepoints for debugging NFSv4 setattr, access,
readlink, readdir, get_acl set_acl get_security_label,
and set_security_label.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Set up basic tracepoints for debugging NFSv4 lookup, unlink/remove,
symlink, mkdir, mknod, fs_locations and secinfo.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Set up basic tracepoints for debugging client id creation/destruction
and session creation/destruction.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Technically, we don't really need to convert these time stamps,
since they are actually cookies.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <Chuck.Lever@oracle.com>
As per RFC 5661 Security Considerations
Commit 4edaa308 "NFS: Use "krb5i" to establish NFSv4 state whenever possible"
uses the nfs_client cl_rpcclient for all clientid management operations.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
As per RFC 3530 and RFC 5661 Security Considerations
Commit 4edaa308 "NFS: Use "krb5i" to establish NFSv4 state whenever possible"
uses the nfs_client cl_rpcclient for all clientid management operations.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently, we do not check the return value of client = rpc_clone_client(),
nor do we shut down the resulting cloned rpc_clnt in the case where a
NFS4ERR_WRONGSEC has caused nfs4_proc_lookup_common() to replace the
original value of 'client' (causing a memory leak).
Fix both issues and simplify the code by moving the call to
rpc_clone_client() until after nfs4_proc_lookup_common() has
done its business.
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Never try to use a non-UID 0 user credential for lease management,
as that credential can change out from under us. The server will
block NFSv4 lease recovery with NFS4ERR_CLID_INUSE.
Since the mechanism to acquire a credential for lease management
is now the same for all minor versions, replace the minor version-
specific callout with a single function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Commit 6f2ea7f2a (NFS: Add nfs4_unique_id boot parameter) introduces a
boot parameter that allows client administrators to set a string
identifier for use by the EXCHANGE_ID and SETCLIENTID arguments in order
to make them more globally unique.
Unfortunately, that uniquifier is no longer globally unique in the presence
of net namespaces, since each container expects to be able to set up their
own lease when mounting a new NFSv4/4.1 partition.
The fix is to add back in the container-specific hostname in addition to
the unique id.
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Should not use the clientid maintenance rpc_clnt.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
nfs4_proc_setattr removes ATTR_OPEN from sattr->ia_valid, but later
nfs4_do_setattr checks for it
Signed-off-by: Nadav Shemer <nadav@tonian.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Feature highlights include:
- Add basic client support for NFSv4.2
- Add basic client support for Labeled NFS (selinux for NFSv4.2)
- Fix the use of credentials in NFSv4.1 stateful operations, and
add support for NFSv4.1 state protection.
Bugfix highlights:
- Fix another NFSv4 open state recovery race
- Fix an NFSv4.1 back channel session regression
- Various rpc_pipefs races
- Fix another issue with NFSv3 auth negotiation
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQIcBAABAgAGBQJR2vsSAAoJEGcL54qWCgDyWBIP/AqlpBBAblxbNQ1Bl/0m1Pdb
iKH961qgM4U1BzK0svGtHTZqkovpm4o/VbkbKBT5mQ4g6SbbsJ/AsS1plCyfnIZi
bdnKNJyj6zg0NsAkJ3vKWqd4BTaP+icdSfEIlRKQxAPESewN7b5B3OWgY4KdYmnk
q5BP25anC1ryxVycSY67ux8S2IKXVSRZeCZv+RO21rvZ2G0bV5y7t8Om28ztxEnU
RKrHgQHgaaktR7i8QVO0sbiWq3iqLa3GPkUvFLwWGr8PQJtTkYY0QwYSrsV3N4rY
hYpMRUZFHpZ8UG5YvBT6xyOy/XaGwMGKSfZjB9/YG4QVju+tTy50U1JbTil5PEWY
GHWYF68aurIeUkXrhSv8AVnOnhir0mISx5ou/SV7p0QoAZ92V6kq+LkPrW520qlc
z8ILh3j28pN3ZUCIEArcaZhYCt48uO2hwBi5TqevQyyGRsXFGbN1moD5jvHkllft
Fi0XGuCBdvhrzFRZcsEl+PDq7fT8lXUK2BHe8oR5jz9PhUp+jpEl9m/eg3RsjJjN
DuxsHye2U4chScdnRtLBQvpFtdINvWX/Gy8Bi7kdE5tsQySvOa+rdwuBc7h88PHC
+4xI2iX3z4O1+GpsAe/T9+pjW689jEilS+eVDRVEGl6yHGn9q8PYOayjPjwbJHxS
R2mLTRhKu1DKguTzO13f
=wGjn
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Feature highlights include:
- Add basic client support for NFSv4.2
- Add basic client support for Labeled NFS (selinux for NFSv4.2)
- Fix the use of credentials in NFSv4.1 stateful operations, and add
support for NFSv4.1 state protection.
Bugfix highlights:
- Fix another NFSv4 open state recovery race
- Fix an NFSv4.1 back channel session regression
- Various rpc_pipefs races
- Fix another issue with NFSv3 auth negotiation
Please note that Labeled NFS does require some additional support from
the security subsystem. The relevant changesets have all been
reviewed and acked by James Morris."
* tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits)
NFS: Set NFS_CS_MIGRATION for NFSv4 mounts
NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs
nfs: have NFSv3 try server-specified auth flavors in turn
nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it
nfs: move server_authlist into nfs_try_mount_request
nfs: refactor "need_mount" code out of nfs_try_mount
SUNRPC: PipeFS MOUNT notification optimization for dying clients
SUNRPC: split client creation routine into setup and registration
SUNRPC: fix races on PipeFS UMOUNT notifications
SUNRPC: fix races on PipeFS MOUNT notifications
NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount
NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount
NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize
NFS: Improve legacy idmapping fallback
NFSv4.1 end back channel session draining
NFS: Apply v4.1 capabilities to v4.2
NFSv4.1: Clean up layout segment comparison helper names
NFSv4.1: layout segment comparison helpers should take 'const' parameters
NFSv4: Move the DNS resolver into the NFSv4 module
rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set
...
* labeled-nfs:
NFS: Apply v4.1 capabilities to v4.2
NFS: Add in v4.2 callback operation
NFS: Make callbacks minor version generic
Kconfig: Add Kconfig entry for Labeled NFS V4 client
NFS: Extend NFS xattr handlers to accept the security namespace
NFS: Client implementation of Labeled-NFS
NFS: Add label lifecycle management
NFS:Add labels to client function prototypes
NFSv4: Extend fattr bitmaps to support all 3 words
NFSv4: Introduce new label structure
NFSv4: Add label recommended attribute and NFSv4 flags
NFSv4.2: Added NFS v4.2 support to the NFS client
SELinux: Add new labeling type native labels
LSM: Add flags field to security_sb_set_mnt_opts for in kernel mount data.
Security: Add Hook to test if the particular xattr is part of a MAC model.
Security: Add hook to calculate context based on a negative dentry.
NFS: Add NFSv4.2 protocol constants
Conflicts:
fs/nfs/nfs4proc.c
nfs4_init_session was originally written to be called prior to
nfs4_init_channel_attrs, setting the session target_max response and request
sizes that nfs4_init_channel_attrs would pay attention to.
In the current code flow, nfs4_init_session, just like nfs4_init_ds_session
for the data server case, is called after the session is all negotiated, and
is actually used in a RECLAIM COMPLETE call to the server.
Remove the un-needed fc_target_max response and request fields from
nfs4_session and just set the max_resp_sz and max_rqst_sz in
nfs4_init_channel_attrs.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* freezer:
af_unix: use freezable blocking calls in read
sigtimedwait: use freezable blocking call
nanosleep: use freezable blocking call
futex: use freezable blocking call
select: use freezable blocking call
epoll: use freezable blocking call
binder: use freezable blocking calls
freezer: add new freezable helpers using freezer_do_not_count()
freezer: convert freezable helpers to static inline where possible
freezer: convert freezable helpers to freezer_do_not_count()
freezer: skip waking up tasks with PF_FREEZER_SKIP set
freezer: shorten freezer sleep time using exponential backoff
lockdep: check that no locks held at freeze time
lockdep: remove task argument from debug_check_no_locks_held
freezer: add unsafe versions of freezable helpers for CIFS
freezer: add unsafe versions of freezable helpers for NFS
This fixes POSIX locks and possibly a few other v4.2 features, like
readdir plus.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Make sure that NFSv4 SETCLIENTID does not parse the NETID as a
format string.
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The existing NFSv4 xattr handlers do not accept xattr calls to the security
namespace. This patch extends these handlers to accept xattrs from the security
namespace in addition to the default NFSv4 ACL namespace.
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch implements the client transport and handling support for labeled
NFS. The patch adds two functions to encode and decode the security label
recommended attribute which makes use of the LSM hooks added earlier. It also
adds code to grab the label from the file attribute structures and encode the
label to be sent back to the server.
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch adds the lifecycle management for the security label structure
introduced in an earlier patch. The label is not used yet but allocations and
freeing of the structure is handled.
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
After looking at all of the nfsv4 operations the label structure has been added
to the prototypes of the functions which can transmit label data.
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The fattr handling bitmap code only uses the first two fattr words sofar. This
patch adds the 3rd word to being sent but doesn't populate it yet.
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This enable NFSv4.2 support. To enable this code the
CONFIG_NFS_V4_2 Kconfig define needs to be set and
the -o v4.2 mount option need to be used.
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
State recovery currently relies on being able to find a valid
nfs_open_context in the inode->open_files list.
We therefore need to put the nfs_open_context on the list while
we're still protected by the sp->so_reclaim_seqcount in order
to avoid reboot races.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
All the callers have an open_context at this point, and since we always
need one in order to do state recovery, it makes sense to use it as the
basis for the nfs4_do_open() call.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Use the EXCHGID4_FLAG_BIND_PRINC_STATEID exchange_id flag to enable
stateid protection. This means that if we create a stateid using a
particular principal, then we must use the same principal if we
want to change that state.
IOW: if we OPEN a file using a particular credential, then we have
to use the same credential in subsequent OPEN_DOWNGRADE, CLOSE,
or DELEGRETURN operations that use that stateid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This is not strictly needed, since get_deviceinfo is not allowed to
return NFS4ERR_ACCESS or NFS4ERR_WRONG_CRED, but lets do it anyway
for consistency with other pNFS operations.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We want to use the same credential for reclaim_complete as we used
for the exchange_id call.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We need to use the same credential as was used for the layoutget
and/or layoutcommit operations.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We need to pass the full open mode flags to nfs_may_open() when doing
a delegated open.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
On a CB_RECALL the callback service thread flushes the inode using
filemap_flush prior to scheduling the state manager thread to return the
delegation. When pNFS is used and I/O has not yet gone to the data server
servicing the inode, a LAYOUTGET can preceed the I/O. Unlike the async
filemap_flush call, the LAYOUTGET must proceed to completion.
If the state manager starts to recover data while the inode flush is sending
the LAYOUTGET, a deadlock occurs as the callback service thread holds the
single callback session slot until the flushing is done which blocks the state
manager thread, and the state manager thread has set the session draining bit
which puts the inode flush LAYOUTGET RPC to sleep on the forechannel slot
table waitq.
Separate the draining of the back channel from the draining of the fore channel
by moving the NFS4_SESSION_DRAINING bit from session scope into the fore
and back slot tables. Drain the back channel first allowing the LAYOUTGET
call to proceed (and fail) so the callback service thread frees the callback
slot. Then proceed with draining the forechannel.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NFS calls the freezable helpers with locks held, which is unsafe
and will cause lockdep warnings when 6aa9707 "lockdep: check
that no locks held at freeze time" is reapplied (it was reverted
in dbf520a). NFS shouldn't be doing this, but it has
long-running syscalls that must hold a lock but also shouldn't
block suspend. Until NFS freeze handling is rewritten to use a
signal to exit out of the critical section, add new *_unsafe
versions of the helpers that will not run the lockdep test when
6aa9707 is reapplied, and call them from NFS.
In practice the likley result of holding the lock while freezing
is that a second task blocked on the lock will never freeze,
aborting suspend, but it is possible to manufacture a case using
the cgroup freezer, the lock, and the suspend freezer to create
a deadlock. Silencing the lockdep warning here will allow
problems to be found in other drivers that may have a more
serious deadlock risk, and prevent new problems from being added.
Signed-off-by: Colin Cross <ccross@android.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This ensures that the server doesn't need to keep huge numbers of
lock stateids waiting around for the final CLOSE.
See section 8.2.4 in RFC5661.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The main reason for doing this is will be to allow for an asynchronous
RPC mode that we can use for freeing lock stateids as per section
8.2.4 of RFC5661.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If a NFS client receives a delegation for a file after it has taken
a lock on that file, we can currently end up in a situation where
we mistakenly skip unlocking that file.
The following patch swaps an erroneous check in nfs4_proc_unlck for
whether or not the file has a delegation to one which checks whether
or not we hold a lock stateid for that file.
Reported-by: Chuck Lever <Chuck.Lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>=3.7]
Tested-by: Chuck Lever <Chuck.Lever@oracle.com>
Debugging aid to help identify servers that incorrectly apply open mode
checks to setattr requests that are not changing the file size.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The NFSv4 and NFSv4.1 specs are both clear that the server should only check
stateid open mode if a SETATTR specifies the size attribute. If the
open mode is not one that allows writing, then it returns NFS4ERR_OPENMODE.
In the case where the SETATTR is not changing the size, the client will
still pass it the delegation stateid to ensure that the server does not
recall that delegation. In that case, the server should _ignore_ the
delegation open mode, and simply apply standard permission checks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* rpcsec_gss-from_cel: (21 commits)
NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE
NFSv4: Don't clear the machine cred when client establish returns EACCES
NFSv4: Fix issues in nfs4_discover_server_trunking
NFSv4: Fix the fallback to AUTH_NULL if krb5i is not available
NFS: Use server-recommended security flavor by default (NFSv3)
SUNRPC: Don't recognize RPC_AUTH_MAXFLAVOR
NFS: Use "krb5i" to establish NFSv4 state whenever possible
NFS: Try AUTH_UNIX when PUTROOTFH gets NFS4ERR_WRONGSEC
NFS: Use static list of security flavors during root FH lookup recovery
NFS: Avoid PUTROOTFH when managing leases
NFS: Clean up nfs4_proc_get_rootfh
NFS: Handle missing rpc.gssd when looking up root FH
SUNRPC: Remove EXPORT_SYMBOL_GPL() from GSS mech switch
SUNRPC: Make gss_mech_get() static
SUNRPC: Refactor nfsd4_do_encode_secinfo()
SUNRPC: Consider qop when looking up pseudoflavors
SUNRPC: Load GSS kernel module by OID
SUNRPC: Introduce rpcauth_get_pseudoflavor()
SUNRPC: Define rpcsec_gss_info structure
NFS: Remove unneeded forward declaration
...
If we already checked the user access permissions on the original open,
then don't bother checking again on recovery. Doing so can cause a
deadlock with NFSv4.1, since the may_open() operation is not privileged.
Furthermore, we can't report an access permission failure here anyway.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When we're doing open-by-filehandle in NFSv4.1, we shouldn't need to
do the cache consistency revalidation on the directory. It is
therefore more efficient to just use open_noattr, which returns the
file attributes, but not the directory attributes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We should always clear it before initiating file recovery.
Also ensure that we clear it after a CLOSE and/or after TEST_STATEID fails.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Defensive patch to ensure that we copy the state->open_stateid, which
can never be set to the delegation stateid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fix nfs4_select_rw_stateid() so that it chooses the open stateid
(or an all-zero stateid) if the delegation does not match the selected
read/write mode.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If we're doing NFSv4.1 against a server that has persistent sessions,
then we should not need to call SETATTR in order to reset the file
attributes immediately after doing an exclusive create.
Note that since the create mode depends on the type of session that
has been negotiated with the server, we should not choose the
mode until after we've got a session slot.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently, _nfs4_do_setattr() will use the delegation stateid if no
writeable open file stateid is available.
If the server revokes that delegation stateid, then the call to
nfs4_handle_exception() will fail to handle the error due to the
lack of a struct nfs4_state, and will just convert the error into
an EIO.
This patch just removes the requirement that we must have a
struct nfs4_state in order to invalidate the delegation and
retry.
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Otherwise we deadlock if state recovery is initiated while we
sleep.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Don't hold the NFSv4 sequence id while we check for open permission.
The call to ACCESS may block due to reboot recovery.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When we send a RENEW or SEQUENCE operation in order to probe if the
lease is still valid, we want it to be able to time out since the
lease we are probing is likely to time out too. Currently, because
we use soft mount semantics for these RPC calls, the return value
is EIO, which causes the state manager to exit with an "unhandled
error" message.
This patch changes the call semantics, so that the RPC layer returns
ETIMEDOUT instead of EIO. We then have the state manager default to
a simple retry instead of exiting.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
All error cases are handled by the switch() statement, meaning that the
call to nfs4_handle_exception() is unreachable.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
A server shouldn't normally return NFS4ERR_GRACE if the client holds a
delegation, since no conflicting lock reclaims can be granted, however
the spec does not require the server to grant the open in this
instance
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
A server shouldn't normally return NFS4ERR_GRACE if the client holds a
delegation, since no conflicting lock reclaims can be granted, however
the spec does not require the server to grant the lock in this
instance.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Most NFSv4 servers implement AUTH_UNIX, and administrators will
prefer this over AUTH_NULL. It is harmless for our client to try
this flavor in addition to the flavors mandated by RFC 3530/5661.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the Linux NFS client receives an NFS4ERR_WRONGSEC error while
trying to look up an NFS server's root file handle, it retries the
lookup operation with various security flavors to see what flavor
the NFS server will accept for pseudo-fs access.
The list of flavors the client uses during retry consists only of
flavors that are currently registered in the kernel RPC client.
This list may not include any GSS pseudoflavors if auth_rpcgss.ko
has not yet been loaded.
Let's instead use a static list of security flavors that the NFS
standard requires the server to implement (RFC 3530bis, section
3.2.1). The RPC client should now be able to load support for
these dynamically; if not, they are skipped.
Recovery behavior here is prescribed by RFC 3530bis, section
15.33.5:
> For LOOKUPP, PUTROOTFH and PUTPUBFH, the client will be unable to
> use the SECINFO operation since SECINFO requires a current
> filehandle and none exist for these two [sic] operations. Therefore,
> the client must iterate through the security triples available at
> the client and reattempt the PUTROOTFH or PUTPUBFH operation. In
> the unfortunate event none of the MANDATORY security triples are
> supported by the client and server, the client SHOULD try using
> others that support integrity. Failing that, the client can try
> using AUTH_NONE, but because such forms lack integrity checks,
> this puts the client at risk.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently, the compound operation the Linux NFS client sends to the
server to confirm a client ID looks like this:
{ SETCLIENTID_CONFIRM; PUTROOTFH; GETATTR(lease_time) }
Once the lease is confirmed, it makes sense to know how long before
the client will have to renew it. And, performing these operations
in the same compound saves a round trip.
Unfortunately, this arrangement assumes that the security flavor
used for establishing a client ID can also be used to access the
server's pseudo-fs.
If the server requires a different security flavor to access its
pseudo-fs than it allowed for the client's SETCLIENTID operation,
the PUTROOTFH in this compound fails with NFS4ERR_WRONGSEC. Even
though the SETCLIENTID_CONFIRM succeeded, our client's trunking
detection logic interprets the failure of the compound as a failure
by the server to confirm the client ID.
As part of server trunking detection, the client then begins another
SETCLIENTID pass with the same nfs4_client_id. This fails with
NFS4ERR_CLID_INUSE because the first SETCLIENTID/SETCLIENTID_CONFIRM
already succeeded in confirming that client ID -- it was the
PUTROOTFH operation that caused the SETCLIENTID_CONFIRM compound to
fail.
To address this issue, separate the "establish client ID" step from
the "accessing the server's pseudo-fs root" step. The first access
of the server's pseudo-fs may require retrying the PUTROOTFH
operation with different security flavors. This access is done in
nfs4_proc_get_rootfh().
That leaves the matter of how to retrieve the server's lease time.
nfs4_proc_fsinfo() already retrieves the lease time value, though
none of its callers do anything with the retrieved value (nor do
they mark the lease as "renewed").
Note that NFSv4.1 state recovery invokes nfs4_proc_get_lease_time()
using the lease management security flavor. This may cause some
heartburn if that security flavor isn't the same as the security
flavor the server requires for accessing the pseudo-fs.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The long lines with no vertical white space make this function
difficult for humans to read. Add a proper documenting comment
while we're here.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When rpc.gssd is not running, any NFS operation that needs to use a
GSS security flavor of course does not work.
If looking up a server's root file handle results in an
NFS4ERR_WRONGSEC, nfs4_find_root_sec() is called to try a bunch of
security flavors until one works or all reasonable flavors have
been tried. When rpc.gssd isn't running, this loop seems to fail
immediately after rpcauth_create() craps out on the first GSS
flavor.
When the rpcauth_create() call in nfs4_lookup_root_sec() fails
because rpc.gssd is not available, nfs4_lookup_root_sec()
unconditionally returns -EIO. This prevents nfs4_find_root_sec()
from retrying any other flavors; it drops out of its loop and fails
immediately.
Having nfs4_lookup_root_sec() return -EACCES instead allows
nfs4_find_root_sec() to try all flavors in its list.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the open_context for the file is not yet fully initialised,
then open recovery cannot succeed, and since nfs4_state_find_open_context
returns an ENOENT, we end up treating the file as being irrecoverable.
What we really want to do, is just defer the recovery until later.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
With unlink is an asynchronous operation in the sillyrename case, it
expects nfs4_async_handle_error() to map the error correctly.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Now that we do CLAIM_FH opens, we may run into situations where we
get a delegation but don't have perfect knowledge of the file path.
When returning the delegation, we might therefore not be able to
us CLAIM_DELEGATE_CUR opens to convert the delegation into OPEN
stateids and locks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Sometimes, we actually _want_ to do open-by-filehandle, for instance
when recovering opens after a network partition, or when called
from nfs4_file_open.
Enable that functionality using a new capability NFS_CAP_ATOMIC_OPEN_V1,
and which is only enabled for NFSv4.1 servers that support it.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Follow the practice described in section 8.2.2 of RFC5661: When sending a
read/write or setattr stateid, set the seqid field to zero in order to
signal that the NFS server should apply the most recent locking state.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up the setting of the nfs_server->caps, by shoving it all
into nfs4_server_common_setup().
Then add an 'initial capabilities' field into struct nfs4_minor_version_ops.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>