2005-04-17 05:20:36 +07:00
/*
 * File operations used by nfsd. Some of these have been ripped from
 * other parts of the kernel because they weren't exported, others
 * are partial duplicates with added or changed functionality.
 *
 * Note that several functions dget() the dentry upon which they want
 * to act, most notably those that create directory entries. Response
 * dentries are dput()'d if necessary in the release callback.
 * So if you notice code paths that apparently fail to dput() the
 * dentry, don't worry--they have been taken care of.
 *
 * Copyright (C) 1995-1999 Olaf Kirch <okir@monad.swb.de>
 * Zero-copy NFS support (C) 2002 Hirokazu Takahashi <taka@valinux.co.jp>
 */

#include <linux/fs.h>
#include <linux/file.h>
2007-06-04 14:59:47 +07:00
#include <linux/splice.h>
2014-11-08 02:44:26 +07:00
#include <linux/falloc.h>
2005-04-17 05:20:36 +07:00
#include <linux/fcntl.h>
#include <linux/namei.h>
#include <linux/delay.h>
[PATCH] inotify
inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:
* dnotify requires the opening of one fd per each directory
that you intend to watch. This quickly results in too many
open files and pins removable media, preventing unmount.
* dnotify is directory-based. You only learn about changes to
directories. Sure, a change to a file in a directory affects
the directory, but you are then forced to keep a cache of
stat structures.
* dnotify's interface to user-space is awful. Signals?
inotify provides a more usable, simple, powerful solution to file change
notification:
* inotify's interface is a system call that returns a fd, not SIGIO.
You get a single fd, which is select()-able.
* inotify has an event that says "the filesystem that the item
you were watching is on was unmounted."
* inotify can watch directories or files.
Inotify is currently used by Beagle (a desktop search infrastructure),
Gamin (a FAM replacement), and other projects.
See Documentation/filesystems/inotify.txt.
Signed-off-by: Robert Love <rml@novell.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 04:06:03 +07:00
#include <linux/fsnotify.h>
2005-04-17 05:20:36 +07:00
#include <linux/posix_acl_xattr.h>
#include <linux/xattr.h>
2009-12-04 01:30:56 +07:00
#include <linux/jhash.h>
#include <linux/ima.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 15:04:11 +07:00
#include <linux/slab.h>
2009-12-04 01:30:56 +07:00
#include <asm/uaccess.h>
2010-02-18 03:05:11 +07:00
#include <linux/exportfs.h>
#include <linux/writeback.h>
2013-05-03 00:19:10 +07:00
#include <linux/security.h>
2009-12-04 01:30:56 +07:00

#ifdef CONFIG_NFSD_V3
#include "xdr3.h"
#endif /* CONFIG_NFSD_V3 */

2006-01-10 11:51:55 +07:00
#ifdef CONFIG_NFSD_V4
2015-12-03 18:59:52 +07:00
#include "../internal.h"
2011-01-05 05:37:15 +07:00
#include "acl.h"
#include "idmap.h"
2005-04-17 05:20:36 +07:00
#endif /* CONFIG_NFSD_V4 */

2009-12-04 01:30:56 +07:00
#include "nfsd.h"
#include "vfs.h"
2015-11-17 18:52:23 +07:00
#include "trace.h"
2005-04-17 05:20:36 +07:00

#define NFSDDBG_FACILITY		NFSDDBG_FILEOP


/*
 * This is a cache of readahead params that help us choose the proper
 * readahead strategy. Initially, we set all readahead parameters to 0
 * and let the VFS handle things.
 * If you increase the number of cached files very much, you'll need to
 * add a hash table here.
 */
struct raparms {
	struct raparms		*p_next;
	unsigned int		p_count;
	ino_t			p_ino;
	dev_t			p_dev;
	int			p_set;
	struct file_ra_state	p_ra;
2006-10-04 16:15:49 +07:00
	unsigned int		p_hindex;
2005-04-17 05:20:36 +07:00
};

2006-10-04 16:15:49 +07:00
struct raparm_hbucket {
	struct raparms		*pb_head;
	spinlock_t		pb_lock;
} ____cacheline_aligned_in_smp;

#define RAPARM_HASH_BITS	4
#define RAPARM_HASH_SIZE	(1<<RAPARM_HASH_BITS)
#define RAPARM_HASH_MASK	(RAPARM_HASH_SIZE-1)
static struct raparm_hbucket	raparm_hash[RAPARM_HASH_SIZE];
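As a rough userspace illustration of how a 16-bucket table such as `raparm_hash` can be indexed, the sketch below reduces a (device, inode) pair to `RAPARM_HASH_BITS` bits. The mixer `mix32` and the helper `raparm_bucket` are assumptions invented for this example; the file above includes `<linux/jhash.h>`, and this sketch does not attempt to reproduce that hash.

```c
#include <assert.h>
#include <stdint.h>

#define RAPARM_HASH_BITS	4
#define RAPARM_HASH_SIZE	(1u << RAPARM_HASH_BITS)
#define RAPARM_HASH_MASK	(RAPARM_HASH_SIZE - 1)

/* Hypothetical integer mixer, chosen only for illustration. */
static uint32_t mix32(uint32_t x)
{
	x ^= x >> 16;
	x *= 0x45d9f3bu;
	x ^= x >> 16;
	return x;
}

/* Map a (device, inode) pair to a bucket index in [0, RAPARM_HASH_SIZE). */
static unsigned int raparm_bucket(uint32_t dev, uint64_t ino)
{
	uint32_t h = mix32(dev) ^ mix32((uint32_t)ino) ^
		     mix32((uint32_t)(ino >> 32));
	unsigned int idx = h & RAPARM_HASH_MASK;

	/* Masking with SIZE-1 only works because SIZE is a power of two. */
	assert(idx < RAPARM_HASH_SIZE);
	return idx;
}
```

Because each bucket in the original carries its own `pb_lock`, lookups that land in different buckets would not contend on a single lock; the sketch only shows the index computation, not the locking.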
2005-04-17 05:20:36 +07:00

/*
 * Called from nfsd_lookup and encode_dirent. Check if we have crossed
 * a mount point.
2006-12-13 15:35:25 +07:00
 * Returns -EAGAIN or -ETIMEDOUT leaving *dpp and *expp unchanged,
2005-04-17 05:20:36 +07:00
 * or nfs_ok having possibly changed *dpp and *expp
 */
int
nfsd_cross_mnt(struct svc_rqst *rqstp, struct dentry **dpp,
			struct svc_export **expp)
{
	struct svc_export *exp = *expp, *exp2 = NULL;
	struct dentry *dentry = *dpp;
2009-04-18 13:42:05 +07:00
	struct path path = {.mnt = mntget(exp->ex_path.mnt),
			    .dentry = dget(dentry)};
2006-10-20 13:28:58 +07:00
	int err = 0;
2005-04-17 05:20:36 +07:00

2011-03-18 20:04:20 +07:00
	err = follow_down(&path);
Add a dentry op to allow processes to be held during pathwalk transit
Add a dentry op (d_manage) to permit a filesystem to hold a process and make it
sleep when it tries to transit away from one of that filesystem's directories
during a pathwalk. The operation is keyed off a new dentry flag
(DCACHE_MANAGE_TRANSIT).
The filesystem is allowed to be selective about which processes it holds and
which it permits to continue on or prohibits from transiting from each flagged
directory. This will allow autofs to hold up client processes whilst letting
its userspace daemon through to maintain the directory or the stuff behind it
or mounted upon it.
The ->d_manage() dentry operation:
int (*d_manage)(struct path *path, bool mounting_here);
takes a pointer to the directory about to be transited away from and a flag
indicating whether the transit is undertaken by do_add_mount() or
do_move_mount() skipping through a pile of filesystems mounted on a mountpoint.
It should return 0 if successful and to let the process continue on its way;
-EISDIR to prohibit the caller from skipping to overmounted filesystems or
automounting, and to use this directory; or some other error code to return to
the user.
->d_manage() is called with namespace_sem writelocked if mounting_here is true
and no other locks held, so it may sleep. However, if mounting_here is true,
it may not initiate or wait for a mount or unmount upon the parameter
directory, even if the act is actually performed by userspace.
Within fs/namei.c, follow_managed() is extended to check with d_manage() first
on each managed directory, before transiting away from it or attempting to
automount upon it.
follow_down() is renamed follow_down_one() and should only be used where the
filesystem deliberately intends to avoid management steps (e.g. autofs).
A new follow_down() is added that incorporates the loop done by all other
callers of follow_down() (do_add/move_mount(), autofs and NFSD; whilst AFS, NFS
and CIFS do use it, their use is removed by converting them to use
d_automount()). The new follow_down() calls d_manage() as appropriate. It
also takes an extra parameter to indicate if it is being called from mount code
(with namespace_sem writelocked) which it passes to d_manage(). follow_down()
ignores automount points so that it can be used to mount on them.
__follow_mount_rcu() is made to abort rcu-walk mode if it hits a directory with
DCACHE_MANAGE_TRANSIT set on the basis that we're probably going to have to
sleep. It would be possible to enter d_manage() in rcu-walk mode too, and have
that determine whether to abort or not itself. That would allow the autofs
daemon to continue on in rcu-walk mode.
Note that DCACHE_MANAGE_TRANSIT on a directory should be cleared when it isn't
required as every transit from that directory will cause d_manage() to be
invoked. It can always be set again when necessary.
==========================
WHAT THIS MEANS FOR AUTOFS
==========================
Autofs currently uses the lookup() inode op and the d_revalidate() dentry op to
trigger the automounting of indirect mounts, and both of these can be called
with i_mutex held.
autofs knows that the i_mutex will be held by the caller in lookup(), and so
can drop it before invoking the daemon - but this isn't so for d_revalidate(),
since the lock is only held on _some_ of the code paths that call it. This
means that autofs can't risk dropping i_mutex from its d_revalidate() function
before it calls the daemon.
The bug could manifest itself as, for example, a process that's trying to
validate an automount dentry that gets made to wait because that dentry is
expired and needs cleaning up:
mkdir S ffffffff8014e05a 0 32580 24956
Call Trace:
[<ffffffff885371fd>] :autofs4:autofs4_wait+0x674/0x897
[<ffffffff80127f7d>] avc_has_perm+0x46/0x58
[<ffffffff8009fdcf>] autoremove_wake_function+0x0/0x2e
[<ffffffff88537be6>] :autofs4:autofs4_expire_wait+0x41/0x6b
[<ffffffff88535cfc>] :autofs4:autofs4_revalidate+0x91/0x149
[<ffffffff80036d96>] __lookup_hash+0xa0/0x12f
[<ffffffff80057a2f>] lookup_create+0x46/0x80
[<ffffffff800e6e31>] sys_mkdirat+0x56/0xe4
versus the automount daemon which wants to remove that dentry, but can't
because the normal process is holding the i_mutex lock:
automount D ffffffff8014e05a 0 32581 1 32561
Call Trace:
[<ffffffff80063c3f>] __mutex_lock_slowpath+0x60/0x9b
[<ffffffff8000ccf1>] do_path_lookup+0x2ca/0x2f1
[<ffffffff80063c89>] .text.lock.mutex+0xf/0x14
[<ffffffff800e6d55>] do_rmdir+0x77/0xde
[<ffffffff8005d229>] tracesys+0x71/0xe0
[<ffffffff8005d28d>] tracesys+0xd5/0xe0
which means that the system is deadlocked.
This patch allows autofs to hold up normal processes whilst the daemon goes
ahead and does things to the dentry tree behind the automounter point without
risking a deadlock as almost no locks are held in d_manage() and none in
d_automount().
Signed-off-by: David Howells <dhowells@redhat.com>
Was-Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-01-15 01:45:26 +07:00
	if (err < 0)
		goto out;
2005-04-17 05:20:36 +07:00

2009-04-18 13:42:05 +07:00
	exp2 = rqst_exp_get_by_name(rqstp, &path);
2005-04-17 05:20:36 +07:00
	if (IS_ERR(exp2)) {
2009-10-26 08:18:19 +07:00
		err = PTR_ERR(exp2);
		/*
		 * We normally allow NFS clients to continue
		 * "underneath" a mountpoint that is not exported.
		 * The exception is V4ROOT, where no traversal is ever
		 * allowed without an explicit export of the new
		 * directory.
		 */
		if (err == -ENOENT && !(exp->ex_flags & NFSEXP_V4ROOT))
			err = 0;
2009-04-18 13:42:05 +07:00
		path_put(&path);
2005-04-17 05:20:36 +07:00
		goto out;
	}
2009-09-10 02:02:40 +07:00
	if (nfsd_v4client(rqstp) ||
		(exp->ex_flags & NFSEXP_CROSSMOUNT) || EX_NOHIDE(exp2)) {
2005-04-17 05:20:36 +07:00
		/* successfully crossed mount point */
2009-04-18 13:32:31 +07:00
		/*
2009-04-18 13:42:05 +07:00
		 * This is subtle: path.dentry is *not* on path.mnt
		 * at this point. The only reason we are safe is that
		 * original mnt is pinned down by exp, so we should
		 * put path *before* putting exp
2009-04-18 13:32:31 +07:00
		 */
2009-04-18 13:42:05 +07:00
		*dpp = path.dentry;
		path.dentry = dentry;
2009-04-18 13:32:31 +07:00
		*expp = exp2;
2009-04-18 13:42:05 +07:00
		exp2 = exp;
2005-04-17 05:20:36 +07:00
	}
2009-04-18 13:42:05 +07:00
	path_put(&path);
	exp_put(exp2);
2005-04-17 05:20:36 +07:00
out:
	return err;
}
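The branch structure of nfsd_cross_mnt() can be summarised, under some simplifying assumptions, as a pure decision function. The sketch below is not kernel code: the `SK_*` flag values and the name `cross_mnt_decision` are made up for illustration, and reference counting (`path_put`, `exp_put`) is deliberately left out. It only models which outcome the function reaches for a given export lookup result.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag bits; the real NFSEXP_* values live in kernel headers. */
#define SK_NOHIDE	0x1u
#define SK_CROSSMOUNT	0x2u
#define SK_V4ROOT	0x4u

/*
 * lookup_err is 0 when an export was found for the mounted filesystem,
 * or a negative errno otherwise.
 * Returns 1 to continue in the mounted filesystem, 0 to stay on the
 * covered dentry, or a negative errno to fail the lookup.
 */
static int cross_mnt_decision(int lookup_err, unsigned int parent_flags,
			      unsigned int child_flags, bool v4client)
{
	assert(lookup_err <= 0);

	if (lookup_err) {
		/* An unexported mount is tolerated unless the parent is V4ROOT. */
		if (lookup_err == -ENOENT && !(parent_flags & SK_V4ROOT))
			return 0;
		return lookup_err;
	}
	if (v4client || (parent_flags & SK_CROSSMOUNT) ||
	    (child_flags & SK_NOHIDE))
		return 1;
	return 0;
}
```

For instance, `cross_mnt_decision(-ENOENT, SK_V4ROOT, 0, true)` yields `-ENOENT`, which corresponds to the V4ROOT exception described in the comment inside the function.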

2009-09-27 07:32:24 +07:00
static void follow_to_parent(struct path *path)
{
	struct dentry *dp;

	while (path->dentry == path->mnt->mnt_root && follow_up(path))
		;
	dp = dget_parent(path->dentry);
	dput(path->dentry);
	path->dentry = dp;
}

static int nfsd_lookup_parent(struct svc_rqst *rqstp, struct dentry *dparent, struct svc_export **exp, struct dentry **dentryp)
{
	struct svc_export *exp2;
	struct path path = {.mnt = mntget((*exp)->ex_path.mnt),
			    .dentry = dget(dparent)};

	follow_to_parent(&path);

	exp2 = rqst_exp_parent(rqstp, &path);
	if (PTR_ERR(exp2) == -ENOENT) {
		*dentryp = dget(dparent);
	} else if (IS_ERR(exp2)) {
		path_put(&path);
		return PTR_ERR(exp2);
	} else {
		*dentryp = dget(path.dentry);
		exp_put(*exp);
		*exp = exp2;
	}
	path_put(&path);
	return 0;
}

2009-10-26 08:33:15 +07:00
/*
 * For nfsd purposes, we treat V4ROOT exports as though there was an
 * export at *every* directory.
 */
2009-10-26 08:43:01 +07:00
int nfsd_mountpoint(struct dentry *dentry, struct svc_export *exp)
2009-10-26 08:33:15 +07:00
{
	if (d_mountpoint(dentry))
		return 1;
2011-09-13 06:37:26 +07:00
	if (nfsd4_is_junction(dentry))
		return 1;
2009-10-26 08:33:15 +07:00
	if (!(exp->ex_flags & NFSEXP_V4ROOT))
		return 0;
2015-03-18 05:25:59 +07:00
	return d_inode(dentry) != NULL;
2009-10-26 08:33:15 +07:00
}

2006-10-20 13:28:58 +07:00
__be32
2007-07-17 18:04:47 +07:00
nfsd_lookup_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp,
2007-11-02 03:57:09 +07:00
		   const char *name, unsigned int len,
2007-07-17 18:04:47 +07:00
		   struct svc_export **exp_ret, struct dentry **dentry_ret)
2005-04-17 05:20:36 +07:00
{
	struct svc_export	*exp;
	struct dentry		*dparent;
	struct dentry		*dentry;
2006-10-20 13:28:58 +07:00
	int			host_err;
2005-04-17 05:20:36 +07:00

	dprintk("nfsd: nfsd_lookup(fh %s, %.*s)\n", SVCFH_fmt(fhp), len, name);

	dparent = fhp->fh_dentry;
2014-06-10 21:06:44 +07:00
	exp = exp_get(fhp->fh_export);
2005-04-17 05:20:36 +07:00

	/* Lookup the name, but don't follow links */
	if (isdotent(name, len)) {
		if (len == 1)
			dentry = dget(dparent);
2008-02-15 10:38:39 +07:00
		else if (dparent != exp->ex_path.dentry)
2005-04-17 05:20:36 +07:00
			dentry = dget_parent(dparent);
2009-09-27 03:53:01 +07:00
		else if (!EX_NOHIDE(exp) && !nfsd_v4client(rqstp))
2005-04-17 05:20:36 +07:00
			dentry = dget(dparent); /* .. == . just like at / */
		else {
			/* checking mountpoint crossing is very different when stepping up */
2009-09-27 07:32:24 +07:00
			host_err = nfsd_lookup_parent(rqstp, dparent, &exp, &dentry);
			if (host_err)
2005-04-17 05:20:36 +07:00
				goto out_nfserr;
		}
	} else {
2014-01-25 06:04:40 +07:00
		/*
		 * In the nfsd4_open() case, this may be held across
		 * subsequent open and delegation acquisition which may
		 * need to take the child's i_mutex:
		 */
		fh_lock_nested(fhp, I_MUTEX_PARENT);
2005-04-17 05:20:36 +07:00
		dentry = lookup_one_len(name, dparent, len);
2006-10-20 13:28:58 +07:00
		host_err = PTR_ERR(dentry);
2005-04-17 05:20:36 +07:00
		if (IS_ERR(dentry))
			goto out_nfserr;
2009-10-26 08:33:15 +07:00
		if (nfsd_mountpoint(dentry, exp)) {
2016-01-08 04:08:20 +07:00
			/*
			 * We don't need the i_mutex after all. It's
			 * still possible we could open this (regular
			 * files can be mountpoints too), but the
			 * i_mutex is just there to prevent renames of
			 * something that we might be about to delegate,
			 * and a mountpoint won't be renamed:
			 */
			fh_unlock(fhp);
2006-10-20 13:28:58 +07:00
			if ((host_err = nfsd_cross_mnt(rqstp, &dentry, &exp))) {
2005-04-17 05:20:36 +07:00
				dput(dentry);
				goto out_nfserr;
			}
		}
	}
2007-07-17 18:04:47 +07:00
	*dentry_ret = dentry;
	*exp_ret = exp;
	return 0;

out_nfserr:
	exp_put(exp);
	return nfserrno(host_err);
}

/*
 * Look up one component of a pathname.
 * N.B. After this call _both_ fhp and resfh need an fh_put
 *
 * If the lookup would cross a mountpoint, and the mounted filesystem
 * is exported to the client with NFSEXP_NOHIDE, then the lookup is
 * accepted as it stands and the mounted directory is
 * returned. Otherwise the covered directory is returned.
 * NOTE: this mountpoint crossing is not supported properly by all
 *   clients and is explicitly disallowed for NFSv3
 *      NeilBrown <neilb@cse.unsw.edu.au>
 */
__be32
nfsd_lookup(struct svc_rqst *rqstp, struct svc_fh *fhp, const char *name,
2007-11-02 03:57:09 +07:00
				unsigned int len, struct svc_fh *resfh)
2007-07-17 18:04:47 +07:00
{
	struct svc_export	*exp;
	struct dentry		*dentry;
	__be32 err;

2011-04-09 22:28:53 +07:00
	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
	if (err)
		return err;
2007-07-17 18:04:47 +07:00
	err = nfsd_lookup_dentry(rqstp, fhp, name, len, &exp, &dentry);
	if (err)
		return err;
2007-07-17 18:04:48 +07:00
	err = check_nfsd_access(exp, rqstp);
	if (err)
		goto out;
2005-04-17 05:20:36 +07:00
	/*
	 * Note: we compose the file handle now, but as the
	 * dentry may be negative, it may need to be updated.
	 */
	err = fh_compose(resfh, exp, dentry, fhp);
2015-03-18 05:25:59 +07:00
	if (!err && d_really_is_negative(dentry))
2005-04-17 05:20:36 +07:00
		err = nfserr_noent;
2007-07-17 18:04:48 +07:00
out:
2005-04-17 05:20:36 +07:00
	dput(dentry);
	exp_put(exp);
	return err;
}

2010-02-18 03:05:11 +07:00
/*
 * Commit metadata changes to stable storage.
 */
static int
commit_metadata(struct svc_fh *fhp)
{
2015-03-18 05:25:59 +07:00
	struct inode *inode = d_inode(fhp->fh_dentry);
2010-02-18 03:05:11 +07:00
	const struct export_operations *export_ops = inode->i_sb->s_export_op;

	if (!EX_ISSYNC(fhp->fh_export))
		return 0;

2010-10-06 15:48:20 +07:00
	if (export_ops->commit_metadata)
		return export_ops->commit_metadata(inode);
	return sync_inode_metadata(inode, 1);
2010-02-18 03:05:11 +07:00
}
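commit_metadata() follows a common "optional hook with generic fallback" shape: do nothing for non-sync exports, prefer the filesystem's own hook if it provides one, and otherwise fall back to a generic sync. A minimal userspace sketch of that shape follows; every `sk_` name is hypothetical and stands in for kernel types and helpers that are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; it only records which path ran. */
struct sk_inode {
	int fs_commits;
	int generic_syncs;
};

/* Hypothetical operations table with one optional hook. */
struct sk_export_ops {
	int (*commit_metadata)(struct sk_inode *inode);
};

static int sk_fs_commit(struct sk_inode *inode)
{
	inode->fs_commits++;
	return 0;
}

static int sk_generic_sync(struct sk_inode *inode)
{
	inode->generic_syncs++;
	return 0;
}

static int sk_commit_metadata(const struct sk_export_ops *ops,
			      int export_is_sync, struct sk_inode *inode)
{
	assert(ops != NULL && inode != NULL);

	if (!export_is_sync)
		return 0;			/* async export: nothing to do */
	if (ops->commit_metadata)
		return ops->commit_metadata(inode);	/* filesystem hook */
	return sk_generic_sync(inode);		/* generic fallback */
}

/* Returns 0 if nothing ran, 1 if the hook ran, 2 if the fallback ran. */
static int sk_demo(int export_is_sync, int has_hook)
{
	struct sk_inode inode = { 0, 0 };
	struct sk_export_ops ops = { has_hook ? sk_fs_commit : NULL };

	sk_commit_metadata(&ops, export_is_sync, &inode);
	if (inode.fs_commits)
		return 1;
	return inode.generic_syncs ? 2 : 0;
}
```

Calling `sk_demo(1, 0)` exercises the fallback path, and `sk_demo(0, 1)` shows that a non-sync export skips both the hook and the fallback.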
2007-07-17 18:04:47 +07:00

2005-04-17 05:20:36 +07:00
/*
2013-11-18 20:07:30 +07:00
 * Go over the attributes and take care of the small differences between
 * NFS semantics and what Linux expects.
2005-04-17 05:20:36 +07:00
 */
2013-11-18 20:07:30 +07:00
static void
nfsd_sanitize_attrs(struct inode *inode, struct iattr *iap)
2005-04-17 05:20:36 +07:00
{
knfsd: clear both setuid and setgid whenever a chown is done
Currently, knfsd only clears the setuid bit if the owner of a file is
changed on a SETATTR call, and only clears the setgid bit if the group
is changed. POSIX says this in the spec for chown():
"If the specified file is a regular file, one or more of the
S_IXUSR, S_IXGRP, or S_IXOTH bits of the file mode are set, and the
process does not have appropriate privileges, the set-user-ID
(S_ISUID) and set-group-ID (S_ISGID) bits of the file mode shall
be cleared upon successful return from chown()."
If I'm reading this correctly, then knfsd is doing this wrong. It should
be clearing both the setuid and setgid bit on any SETATTR that changes
the uid or gid. This wasn't really as noticeable before, but now that the
ATTR_KILL_S*ID bits are a no-op for the NFS client, it's more evident.
This patch corrects the nfsd_setattr logic so that this occurs. It also
does a bit of cleanup to the function.
There is also one small behavioral change. If a SETATTR call comes in
that changes the uid/gid and the mode, then we now only clear the setgid
bit if the group execute bit isn't set. The setgid bit without a group
execute bit signifies mandatory locking and we likely don't want to
clear the bit in that case. Since there is no call in POSIX that should
generate a SETATTR call like this, then this should rarely happen, but
it's worth noting.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-17 03:28:47 +07:00
	/* sanitize the mode change */
2005-04-17 05:20:36 +07:00
	if (iap->ia_valid & ATTR_MODE) {
		iap->ia_mode &= S_IALLUGO;
2008-04-17 03:28:46 +07:00
		iap->ia_mode |= (inode->i_mode & ~S_IALLUGO);
2008-04-17 03:28:47 +07:00
	}

	/* Revoke setuid/setgid on chown */
Inconsistent setattr behaviour
There is an inconsistency seen in the behaviour of nfs compared to other local
filesystems on linux when changing owner or group of a directory. If the
directory has SUID/SGID flags set, on changing owner or group on the directory,
the flags are stripped off on nfs. These flags are maintained on other
filesystems such as ext3.
To reproduce on a nfs share or local filesystem, run the following commands
mkdir test; chmod +s+g test; chown user1 test; ls -ld test
On the nfs share, the flags are stripped and the output seen is
drwxr-xr-x 2 user1 root 4096 Feb 23 2009 test
On other local filesystems(ex: ext3), the flags are not stripped and the output
seen is
drwsr-sr-x 2 user1 root 4096 Feb 23 13:57 test
chown_common() called from sys_chown() will only strip the flags if the inode is
not a directory.
static int chown_common(struct dentry * dentry, uid_t user, gid_t group)
{
..
if (!S_ISDIR(inode->i_mode))
newattrs.ia_valid |=
ATTR_KILL_SUID | ATTR_KILL_SGID | ATTR_KILL_PRIV;
..
}
See: http://www.opengroup.org/onlinepubs/7990989775/xsh/chown.html
"If the path argument refers to a regular file, the set-user-ID (S_ISUID) and
set-group-ID (S_ISGID) bits of the file mode are cleared upon successful return
from chown(), unless the call is made by a process with appropriate privileges,
in which case it is implementation-dependent whether these bits are altered. If
chown() is successfully invoked on a file that is not a regular file, these
bits may be cleared. These bits are defined in <sys/stat.h>."
The behaviour as it stands does not appear to violate POSIX. However the
actions performed are inconsistent when comparing ext3 and nfs.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-02-23 23:22:03 +07:00
	if (!S_ISDIR(inode->i_mode) &&
2013-12-11 17:16:36 +07:00
	    ((iap->ia_valid & ATTR_UID) || (iap->ia_valid & ATTR_GID))) {
2008-04-17 03:28:47 +07:00
		iap->ia_valid |= ATTR_KILL_PRIV;
		if (iap->ia_valid & ATTR_MODE) {
			/* we're setting mode too, just clear the s*id bits */
2007-10-18 17:05:19 +07:00
			iap->ia_mode &= ~S_ISUID;
2008-04-17 03:28:47 +07:00
|
|
|
if (iap->ia_mode & S_IXGRP)
|
|
|
|
iap->ia_mode &= ~S_ISGID;
|
|
|
|
} else {
|
|
|
|
/* set ATTR_KILL_* bits and let VFS handle it */
|
|
|
|
iap->ia_valid |= (ATTR_KILL_SUID | ATTR_KILL_SGID);
|
2007-10-18 17:05:19 +07:00
|
|
|
}
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
2013-11-18 20:07:30 +07:00
|
|
|
}
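The setuid/setgid handling above can be sketched in user space. This is a minimal sketch under stated assumptions, not the kernel code: the `SK_*` constants, `struct sk_iattr`, and `sk_strip_sid_on_chown()` are hypothetical stand-ins for the `ATTR_*` flags, `struct iattr`, and the branch shown above, and the flag values are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the kernel's ATTR_* flags (values are made up). */
enum {
    SK_ATTR_MODE      = 1 << 0,
    SK_ATTR_UID       = 1 << 1,
    SK_ATTR_GID       = 1 << 2,
    SK_ATTR_KILL_SUID = 1 << 3,
    SK_ATTR_KILL_SGID = 1 << 4,
};

/* Conventional octal values of the mode bits used below. */
#define SK_S_ISUID 04000u
#define SK_S_ISGID 02000u
#define SK_S_IXGRP 00010u

struct sk_iattr {
    unsigned int valid; /* which fields the request sets */
    unsigned int mode;  /* requested mode, meaningful if SK_ATTR_MODE is set */
};

/* Apply the "revoke setuid/setgid on chown" rule to a request. */
static void sk_strip_sid_on_chown(struct sk_iattr *ia)
{
    assert(ia != NULL);

    /* Only requests that change the owner or the group are affected. */
    if (!(ia->valid & (SK_ATTR_UID | SK_ATTR_GID)))
        return;

    if (ia->valid & SK_ATTR_MODE) {
        /* The mode is being set too, so clear the bits in it directly. */
        ia->mode &= ~SK_S_ISUID;
        /* setgid without group-execute marks mandatory locking: keep it. */
        if (ia->mode & SK_S_IXGRP)
            ia->mode &= ~SK_S_ISGID;
    } else {
        /* No mode in the request: ask the lower layer to strip the bits. */
        ia->valid |= SK_ATTR_KILL_SUID | SK_ATTR_KILL_SGID;
    }
}
```

A request that sets both the owner and mode 06750 comes out with mode 0750, while one that sets the group and mode 06640 keeps the setgid bit because group-execute is clear.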
|
|
|
|
|
|
|
|
static __be32
|
|
|
|
nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp,
|
|
|
|
struct iattr *iap)
|
|
|
|
{
|
2015-03-18 05:25:59 +07:00
|
|
|
struct inode *inode = d_inode(fhp->fh_dentry);
|
2013-11-18 20:07:30 +07:00
|
|
|
int host_err;
|
|
|
|
|
|
|
|
if (iap->ia_size < inode->i_size) {
|
|
|
|
__be32 err;
|
|
|
|
|
|
|
|
err = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry,
|
|
|
|
NFSD_MAY_TRUNC | NFSD_MAY_OWNER_OVERRIDE);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
host_err = get_write_access(inode);
|
|
|
|
if (host_err)
|
|
|
|
goto out_nfserrno;
|
|
|
|
|
|
|
|
host_err = locks_verify_truncate(inode, NULL, iap->ia_size);
|
|
|
|
if (host_err)
|
|
|
|
goto out_put_write_access;
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
out_put_write_access:
|
|
|
|
put_write_access(inode);
|
|
|
|
out_nfserrno:
|
|
|
|
return nfserrno(host_err);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Set various file attributes. After this call fhp needs an fh_put.
|
|
|
|
*/
|
|
|
|
__be32
|
|
|
|
nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
|
|
|
|
int check_guard, time_t guardtime)
|
|
|
|
{
|
|
|
|
struct dentry *dentry;
|
|
|
|
struct inode *inode;
|
|
|
|
int accmode = NFSD_MAY_SATTR;
|
|
|
|
umode_t ftype = 0;
|
|
|
|
__be32 err;
|
|
|
|
int host_err;
|
2014-02-25 02:59:47 +07:00
|
|
|
bool get_write_count;
|
2013-11-18 20:07:30 +07:00
|
|
|
int size_change = 0;
|
|
|
|
|
|
|
|
if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE))
|
|
|
|
accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE;
|
|
|
|
if (iap->ia_valid & ATTR_SIZE)
|
|
|
|
ftype = S_IFREG;
|
|
|
|
|
2014-02-25 02:59:47 +07:00
|
|
|
/* Callers that do fh_verify should do the fh_want_write: */
|
|
|
|
get_write_count = !fhp->fh_dentry;
|
|
|
|
|
2013-11-18 20:07:30 +07:00
|
|
|
/* Get inode */
|
|
|
|
err = fh_verify(rqstp, fhp, ftype, accmode);
|
|
|
|
if (err)
|
|
|
|
goto out;
|
2014-02-25 02:59:47 +07:00
|
|
|
if (get_write_count) {
|
|
|
|
host_err = fh_want_write(fhp);
|
|
|
|
if (host_err)
|
|
|
|
return nfserrno(host_err);
|
|
|
|
}
|
2013-11-18 20:07:30 +07:00
|
|
|
|
|
|
|
dentry = fhp->fh_dentry;
|
2015-03-18 05:25:59 +07:00
|
|
|
inode = d_inode(dentry);
|
2013-11-18 20:07:30 +07:00
|
|
|
|
|
|
|
/* Ignore any mode updates on symlinks */
|
|
|
|
if (S_ISLNK(inode->i_mode))
|
|
|
|
iap->ia_valid &= ~ATTR_MODE;
|
|
|
|
|
|
|
|
if (!iap->ia_valid)
|
|
|
|
goto out;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2013-11-18 20:07:30 +07:00
|
|
|
nfsd_sanitize_attrs(inode, iap);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The size case is special: it changes the file in addition to the
|
|
|
|
* attributes.
|
|
|
|
*/
|
|
|
|
if (iap->ia_valid & ATTR_SIZE) {
|
|
|
|
err = nfsd_get_write_access(rqstp, fhp, iap);
|
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
size_change = 1;
|
2014-09-08 02:15:52 +07:00
|
|
|
|
|
|
|
/*
|
|
|
|
* RFC5661, Section 18.30.4:
|
|
|
|
* Changing the size of a file with SETATTR indirectly
|
|
|
|
* changes the time_modify and change attributes.
|
|
|
|
*
|
|
|
|
* (and similar for the older RFCs)
|
|
|
|
*/
|
|
|
|
if (iap->ia_size != i_size_read(inode))
|
|
|
|
iap->ia_valid |= ATTR_MTIME;
|
2013-11-18 20:07:30 +07:00
|
|
|
}
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
iap->ia_valid |= ATTR_CTIME;
|
|
|
|
|
2013-11-18 20:07:47 +07:00
|
|
|
if (check_guard && guardtime != inode->i_ctime.tv_sec) {
|
|
|
|
err = nfserr_notsync;
|
|
|
|
goto out_put_write_access;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
2013-11-18 20:07:47 +07:00
|
|
|
|
|
|
|
fh_lock(fhp);
|
|
|
|
host_err = notify_change(dentry, iap, NULL);
|
|
|
|
fh_unlock(fhp);
|
2014-02-18 22:27:53 +07:00
|
|
|
err = nfserrno(host_err);
|
2013-11-18 20:07:47 +07:00
|
|
|
|
|
|
|
out_put_write_access:
|
2005-04-17 05:20:36 +07:00
|
|
|
if (size_change)
|
|
|
|
put_write_access(inode);
|
|
|
|
if (!err)
|
2014-07-03 18:54:19 +07:00
|
|
|
err = nfserrno(commit_metadata(fhp));
|
2005-04-17 05:20:36 +07:00
|
|
|
out:
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2006-01-10 11:51:55 +07:00
|
|
|
#if defined(CONFIG_NFSD_V4)
|
2012-01-05 04:26:43 +07:00
|
|
|
/*
|
|
|
|
* NFS junction information is stored in an extended attribute.
|
|
|
|
*/
|
|
|
|
#define NFSD_JUNCTION_XATTR_NAME XATTR_TRUSTED_PREFIX "junction.nfs"
|
|
|
|
|
|
|
|
/**
|
|
|
|
* nfsd4_is_junction - Test if an object could be an NFS junction
|
|
|
|
*
|
|
|
|
* @dentry: object to test
|
|
|
|
*
|
|
|
|
* Returns 1 if "dentry" appears to contain NFS junction information.
|
|
|
|
* Otherwise 0 is returned.
|
|
|
|
*/
|
2011-09-13 06:37:26 +07:00
|
|
|
int nfsd4_is_junction(struct dentry *dentry)
|
|
|
|
{
|
2015-03-18 05:25:59 +07:00
|
|
|
struct inode *inode = d_inode(dentry);
|
2011-09-13 06:37:26 +07:00
|
|
|
|
|
|
|
if (inode == NULL)
|
|
|
|
return 0;
|
|
|
|
if (inode->i_mode & S_IXUGO)
|
|
|
|
return 0;
|
|
|
|
if (!(inode->i_mode & S_ISVTX))
|
|
|
|
return 0;
|
2012-01-05 04:26:43 +07:00
|
|
|
if (vfs_getxattr(dentry, NFSD_JUNCTION_XATTR_NAME, NULL, 0) <= 0)
|
2011-09-13 06:37:26 +07:00
|
|
|
return 0;
|
|
|
|
return 1;
|
|
|
|
}
|
2013-05-03 00:19:10 +07:00
|
|
|
#ifdef CONFIG_NFSD_V4_SECURITY_LABEL
|
|
|
|
__be32 nfsd4_set_nfs4_label(struct svc_rqst *rqstp, struct svc_fh *fhp,
|
|
|
|
struct xdr_netobj *label)
|
|
|
|
{
|
|
|
|
__be32 error;
|
|
|
|
int host_error;
|
|
|
|
struct dentry *dentry;
|
|
|
|
|
|
|
|
error = fh_verify(rqstp, fhp, 0 /* S_IFREG */, NFSD_MAY_SATTR);
|
|
|
|
if (error)
|
|
|
|
return error;
|
|
|
|
|
|
|
|
dentry = fhp->fh_dentry;
|
|
|
|
|
2016-01-23 03:40:57 +07:00
|
|
|
inode_lock(d_inode(dentry));
|
2013-05-03 00:19:10 +07:00
|
|
|
host_error = security_inode_setsecctx(dentry, label->data, label->len);
|
2016-01-23 03:40:57 +07:00
|
|
|
inode_unlock(d_inode(dentry));
|
2013-05-03 00:19:10 +07:00
|
|
|
return nfserrno(host_error);
|
|
|
|
}
|
|
|
|
#else
|
|
|
|
__be32 nfsd4_set_nfs4_label(struct svc_rqst *rqstp, struct svc_fh *fhp,
|
|
|
|
struct xdr_netobj *label)
|
|
|
|
{
|
|
|
|
return nfserr_notsupp;
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2015-12-03 18:59:52 +07:00
|
|
|
__be32 nfsd4_clone_file_range(struct file *src, u64 src_pos, struct file *dst,
|
|
|
|
u64 dst_pos, u64 count)
|
|
|
|
{
|
|
|
|
return nfserrno(vfs_clone_file_range(src, src_pos, dst, dst_pos,
|
|
|
|
count));
|
|
|
|
}
|
|
|
|
|
2016-09-08 02:57:30 +07:00
|
|
|
ssize_t nfsd_copy_file_range(struct file *src, u64 src_pos, struct file *dst,
|
|
|
|
u64 dst_pos, u64 count)
|
|
|
|
{
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Limit copy to 4MB to prevent indefinitely blocking an nfsd
|
|
|
|
* thread and client rpc slot. The choice of 4MB is somewhat
|
|
|
|
* arbitrary. We might instead base this on r/wsize, or make it
|
|
|
|
* tunable, or use a time instead of a byte limit, or implement
|
|
|
|
* asynchronous copy. In theory a client could also recognize a
|
|
|
|
* limit like this and pipeline multiple COPY requests.
|
|
|
|
*/
|
|
|
|
count = min_t(u64, count, 1 << 22);
|
|
|
|
return vfs_copy_file_range(src, src_pos, dst, dst_pos, count, 0);
|
|
|
|
}
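The comment above notes that a client could recognize the cap and pipeline several COPY requests. The arithmetic can be sketched as follows; this is a rough user-space illustration, and `SK_COPY_LIMIT`, `sk_clamp_copy()` and `sk_requests_needed()` are hypothetical names, not part of nfsd or of any NFS client.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as the "1 << 22" cap above: 4 MiB per request. */
#define SK_COPY_LIMIT (UINT64_C(1) << 22)

static_assert(SK_COPY_LIMIT == 4194304u, "cap is 4 MiB");

/* Server side: never service more than the cap in a single call. */
static uint64_t sk_clamp_copy(uint64_t count)
{
    return count < SK_COPY_LIMIT ? count : SK_COPY_LIMIT;
}

/*
 * Client side: count how many requests are needed to move `total` bytes
 * when every reply is capped by sk_clamp_copy().
 */
static uint64_t sk_requests_needed(uint64_t total)
{
    uint64_t requests = 0;

    while (total > 0) {
        total -= sk_clamp_copy(total);
        requests++;
    }
    return requests;
}
```

Under this sketch a 10 MiB copy takes three requests (4 MiB, 4 MiB, 2 MiB), which is the short-copy behavior a client has to be prepared for.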
|
|
|
|
|
2014-11-08 02:44:26 +07:00
|
|
|
__be32 nfsd4_vfs_fallocate(struct svc_rqst *rqstp, struct svc_fh *fhp,
|
|
|
|
struct file *file, loff_t offset, loff_t len,
|
|
|
|
int flags)
|
|
|
|
{
|
|
|
|
int error;
|
|
|
|
|
|
|
|
if (!S_ISREG(file_inode(file)->i_mode))
|
|
|
|
return nfserr_inval;
|
|
|
|
|
|
|
|
error = vfs_fallocate(file, flags, offset, len);
|
|
|
|
if (!error)
|
|
|
|
error = commit_metadata(fhp);
|
|
|
|
|
|
|
|
return nfserrno(error);
|
|
|
|
}
|
2010-07-06 23:39:12 +07:00
|
|
|
#endif /* defined(CONFIG_NFSD_V4) */
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
#ifdef CONFIG_NFSD_V3
|
|
|
|
/*
|
|
|
|
* Check server access rights to a file system object
|
|
|
|
*/
|
|
|
|
struct accessmap {
|
|
|
|
u32 access;
|
|
|
|
int how;
|
|
|
|
};
|
|
|
|
static struct accessmap nfs3_regaccess[] = {
|
2008-06-16 18:20:29 +07:00
|
|
|
{ NFS3_ACCESS_READ, NFSD_MAY_READ },
|
|
|
|
{ NFS3_ACCESS_EXECUTE, NFSD_MAY_EXEC },
|
|
|
|
{ NFS3_ACCESS_MODIFY, NFSD_MAY_WRITE|NFSD_MAY_TRUNC },
|
|
|
|
{ NFS3_ACCESS_EXTEND, NFSD_MAY_WRITE },
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
{ 0, 0 }
|
|
|
|
};
|
|
|
|
|
|
|
|
static struct accessmap nfs3_diraccess[] = {
|
2008-06-16 18:20:29 +07:00
|
|
|
{ NFS3_ACCESS_READ, NFSD_MAY_READ },
|
|
|
|
{ NFS3_ACCESS_LOOKUP, NFSD_MAY_EXEC },
|
|
|
|
{ NFS3_ACCESS_MODIFY, NFSD_MAY_EXEC|NFSD_MAY_WRITE|NFSD_MAY_TRUNC},
|
|
|
|
{ NFS3_ACCESS_EXTEND, NFSD_MAY_EXEC|NFSD_MAY_WRITE },
|
|
|
|
{ NFS3_ACCESS_DELETE, NFSD_MAY_REMOVE },
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
{ 0, 0 }
|
|
|
|
};
|
|
|
|
|
|
|
|
static struct accessmap nfs3_anyaccess[] = {
|
|
|
|
/* Some clients (Solaris 2.6 at least) make an access call
|
|
|
|
* to the server to check for access for things like /dev/null
|
|
|
|
* (which really, the server doesn't care about). So
|
|
|
|
* we provide simple access checking for them, looking
|
|
|
|
* mainly at mode bits, and we make sure to ignore read-only
|
|
|
|
* filesystem checks
|
|
|
|
*/
|
2008-06-16 18:20:29 +07:00
|
|
|
{ NFS3_ACCESS_READ, NFSD_MAY_READ },
|
|
|
|
{ NFS3_ACCESS_EXECUTE, NFSD_MAY_EXEC },
|
|
|
|
{ NFS3_ACCESS_MODIFY, NFSD_MAY_WRITE|NFSD_MAY_LOCAL_ACCESS },
|
|
|
|
{ NFS3_ACCESS_EXTEND, NFSD_MAY_WRITE|NFSD_MAY_LOCAL_ACCESS },
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
{ 0, 0 }
|
|
|
|
};
|
|
|
|
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2005-04-17 05:20:36 +07:00
|
|
|
nfsd_access(struct svc_rqst *rqstp, struct svc_fh *fhp, u32 *access, u32 *supported)
|
|
|
|
{
|
|
|
|
struct accessmap *map;
|
|
|
|
struct svc_export *export;
|
|
|
|
struct dentry *dentry;
|
|
|
|
u32 query, result = 0, sresult = 0;
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32 error;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2008-06-16 18:20:29 +07:00
|
|
|
error = fh_verify(rqstp, fhp, 0, NFSD_MAY_NOP);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (error)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
export = fhp->fh_export;
|
|
|
|
dentry = fhp->fh_dentry;
|
|
|
|
|
VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
Convert the following where appropriate:
(1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).
(2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).
(3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
complicated than it appears as some calls should be converted to
d_can_lookup() instead. The difference is whether the directory in
question is a real dir with a ->lookup op or whether it's a fake dir with
a ->d_automount op.
In some circumstances, we can subsume checks for dentry->d_inode not being
NULL into this, provided the code isn't in a filesystem that expects
d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
use d_inode() rather than d_backing_inode() to get the inode pointer).
Note that the dentry type field may be set to something other than
DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
manages the fall-through from a negative dentry to a lower layer. In such a
case, the dentry type of the negative union dentry is set to the same as the
type of the lower dentry.
However, if you know d_inode is not NULL at the call site, then you can use
the d_is_xxx() functions even in a filesystem.
There is one further complication: a 0,0 chardev dentry may be labelled
DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
intended for special directory entry types that don't have attached inodes.
The following perl+coccinelle script was used:
use strict;
my @callers;
open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
die "Can't grep for S_ISDIR and co. callers";
@callers = <$fd>;
close($fd);
unless (@callers) {
print "No matches\n";
exit(0);
}
my @cocci = (
'@@',
'expression E;',
'@@',
'',
'- S_ISLNK(E->d_inode->i_mode)',
'+ d_is_symlink(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISDIR(E->d_inode->i_mode)',
'+ d_is_dir(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISREG(E->d_inode->i_mode)',
'+ d_is_reg(E)' );
my $coccifile = "tmp.sp.cocci";
open($fd, ">$coccifile") || die $coccifile;
print($fd "$_\n") || die $coccifile foreach (@cocci);
close($fd);
foreach my $file (@callers) {
chomp $file;
print "Processing ", $file, "\n";
system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
die "spatch failed";
}
[AV: overlayfs parts skipped]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-01-29 19:02:35 +07:00
|
|
|
if (d_is_reg(dentry))
|
2005-04-17 05:20:36 +07:00
|
|
|
map = nfs3_regaccess;
|
2015-01-29 19:02:35 +07:00
|
|
|
else if (d_is_dir(dentry))
|
2005-04-17 05:20:36 +07:00
|
|
|
map = nfs3_diraccess;
|
|
|
|
else
|
|
|
|
map = nfs3_anyaccess;
|
|
|
|
|
|
|
|
|
|
|
|
query = *access;
|
|
|
|
for (; map->access; map++) {
|
|
|
|
if (map->access & query) {
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32 err2;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
sresult |= map->access;
|
|
|
|
|
2007-07-17 18:04:48 +07:00
|
|
|
err2 = nfsd_permission(rqstp, export, dentry, map->how);
|
2005-04-17 05:20:36 +07:00
|
|
|
switch (err2) {
|
|
|
|
case nfs_ok:
|
|
|
|
result |= map->access;
|
|
|
|
break;
|
|
|
|
|
|
|
|
/* the following error codes just mean the access was not allowed,
|
|
|
|
* rather than an error occurred */
|
|
|
|
case nfserr_rofs:
|
|
|
|
case nfserr_acces:
|
|
|
|
case nfserr_perm:
|
|
|
|
/* simply don't "or" in the access bit. */
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
error = err2;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
*access = result;
|
|
|
|
if (supported)
|
|
|
|
*supported = sresult;
|
|
|
|
|
|
|
|
out:
|
|
|
|
return error;
|
|
|
|
}
|
|
|
|
#endif /* CONFIG_NFSD_V3 */
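The table-driven ACCESS check above can be sketched outside the kernel. This is a minimal sketch under stated assumptions: `struct sk_accessmap`, the `SK_*` constants, the sample table and the two permission callbacks are all hypothetical, and the callback stands in for `nfsd_permission()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Access bits a client may ask about (values chosen for illustration). */
enum { SK_ACC_READ = 0x01, SK_ACC_MODIFY = 0x04, SK_ACC_EXTEND = 0x08,
       SK_ACC_DELETE = 0x10, SK_ACC_EXECUTE = 0x20 };

/* Internal permission bits and result codes, also illustrative. */
enum { SK_MAY_READ = 1, SK_MAY_WRITE = 2, SK_MAY_EXEC = 4 };
enum { SK_OK = 0, SK_DENIED = 1, SK_IOERR = 2 };

struct sk_accessmap {
    uint32_t access; /* bit from the client's query */
    int how;         /* internal permission needed for that bit */
};

/* Sample table for regular files, terminated by a zero entry. */
static const struct sk_accessmap sk_regmap[] = {
    { SK_ACC_READ,    SK_MAY_READ  },
    { SK_ACC_EXECUTE, SK_MAY_EXEC  },
    { SK_ACC_MODIFY,  SK_MAY_WRITE },
    { SK_ACC_EXTEND,  SK_MAY_WRITE },
    { 0, 0 }
};

typedef int (*sk_perm_fn)(int how);

/* Sample policy: everything except writing is allowed. */
static int sk_perm_readonly(int how)
{
    return (how & SK_MAY_WRITE) ? SK_DENIED : SK_OK;
}

/* Sample policy that always fails with a hard error. */
static int sk_perm_broken(int how)
{
    (void)how;
    return SK_IOERR;
}

/*
 * Walk the table. A denial just leaves the bit out of *granted; any other
 * failure aborts the walk and leaves the outputs untouched.
 */
static int sk_access(const struct sk_accessmap *map, sk_perm_fn perm,
                     uint32_t query, uint32_t *granted, uint32_t *supported)
{
    uint32_t result = 0, sresult = 0;

    assert(map != NULL && perm != NULL && granted != NULL);
    for (; map->access; map++) {
        int rc;

        if (!(map->access & query))
            continue;
        sresult |= map->access;
        rc = perm(map->how);
        if (rc == SK_OK)
            result |= map->access;
        else if (rc != SK_DENIED)
            return rc;
    }
    *granted = result;
    if (supported)
        *supported = sresult;
    return SK_OK;
}
```

Queried bits that the table does not list (here `SK_ACC_DELETE` for a regular file) appear in neither output, which is how a caller can tell "not supported" apart from "denied".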
|
|
|
|
|
2011-06-07 22:50:23 +07:00
|
|
|
static int nfsd_open_break_lease(struct inode *inode, int access)
|
|
|
|
{
|
|
|
|
unsigned int mode;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2011-06-07 22:50:23 +07:00
|
|
|
if (access & NFSD_MAY_NOT_BREAK_LEASE)
|
|
|
|
return 0;
|
|
|
|
mode = (access & NFSD_MAY_WRITE) ? O_WRONLY : O_RDONLY;
|
|
|
|
return break_lease(inode, mode | O_NONBLOCK);
|
|
|
|
}
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Open an existing file or directory.
|
2012-03-19 09:44:49 +07:00
|
|
|
* The may_flags argument indicates the type of open (read/write/lock)
|
|
|
|
* and additional flags.
|
2005-04-17 05:20:36 +07:00
|
|
|
* N.B. After this call fhp needs an fh_put
|
|
|
|
*/
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2011-07-26 14:30:54 +07:00
|
|
|
nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
|
2012-03-19 09:44:49 +07:00
|
|
|
int may_flags, struct file **filp)
|
2005-04-17 05:20:36 +07:00
|
|
|
{
|
2012-06-27 00:58:53 +07:00
|
|
|
struct path path;
|
2005-04-17 05:20:36 +07:00
|
|
|
struct inode *inode;
|
2014-09-03 07:14:06 +07:00
|
|
|
struct file *file;
|
2006-10-20 13:28:58 +07:00
|
|
|
int flags = O_RDONLY|O_LARGEFILE;
|
|
|
|
__be32 err;
|
2010-03-19 19:06:28 +07:00
|
|
|
int host_err = 0;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2009-09-02 15:13:40 +07:00
|
|
|
validate_process_creds();
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
/*
|
|
|
|
* If we get here, then the client has already done an "open",
|
|
|
|
* and (hopefully) checked permission - so allow OWNER_OVERRIDE
|
|
|
|
* in case a chmod has now revoked permission.
|
nfsd: allow owner_override only for regular files
We normally allow the owner of a file to override permissions checks on
IO operations, since:
- the client will take responsibility for doing an access check
on open;
- the permission checks offer no protection against malicious
clients--if they can authenticate as the file's owner then
they can always just change its permissions;
- checking permission on each IO operation breaks the usual
posix rule that permission is checked only on open.
However, we've never allowed the owner to override permissions on
readdir operations, even though the above logic would also apply to
directories. I've never heard of this causing a problem, probably
because a) simultaneously opening and creating a directory (with
restricted mode) isn't possible, and b) opening a directory, then
chmod'ing it, is rare.
Our disallowal of owner-override on directories appears to be an
accident, though--the readdir itself succeeds, and then we fail just
because lookup_one_len() calls in our filldir methods fail.
I'm not sure what the easiest fix for that would be. For now, just make
this behavior obvious by denying the override right at the start.
This also fixes some odd v4 behavior: with the rdattr_error attribute
requested, it would perform the readdir but return an ACCES error with
each entry.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-06-06 23:12:57 +07:00
|
|
|
*
|
|
|
|
* Arguably we should also allow the owner override for
|
|
|
|
* directories, but we never have and it doesn't seem to have
|
|
|
|
* caused anyone a problem. If we were to change this, note
|
|
|
|
* also that our filldir callbacks would need a variant of
|
|
|
|
* lookup_one_len that doesn't check permissions.
|
2005-04-17 05:20:36 +07:00
|
|
|
*/
|
2012-06-06 23:12:57 +07:00
|
|
|
if (type == S_IFREG)
|
|
|
|
may_flags |= NFSD_MAY_OWNER_OVERRIDE;
|
|
|
|
err = fh_verify(rqstp, fhp, type, may_flags);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
2012-06-27 00:58:53 +07:00
|
|
|
path.mnt = fhp->fh_export->ex_path.mnt;
|
|
|
|
path.dentry = fhp->fh_dentry;
|
2015-03-18 05:25:59 +07:00
|
|
|
inode = d_inode(path.dentry);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
/* Disallow write access to files with the append-only bit set
|
|
|
|
* or any access when mandatory locking is enabled
|
|
|
|
*/
|
|
|
|
err = nfserr_perm;
|
2012-03-19 09:44:49 +07:00
|
|
|
if (IS_APPEND(inode) && (may_flags & NFSD_MAY_WRITE))
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out;
|
2007-10-03 01:18:12 +07:00
|
|
|
/*
|
|
|
|
* We must ignore files (but only files) which might have mandatory
|
|
|
|
* locks on them because there is no way to know if the accessor has
|
|
|
|
* the lock.
|
|
|
|
*/
|
|
|
|
if (S_ISREG((inode)->i_mode) && mandatory_lock(inode))
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out;
|
|
|
|
|
|
|
|
if (!inode->i_fop)
|
|
|
|
goto out;
|
|
|
|
|
2012-03-19 09:44:49 +07:00
|
|
|
host_err = nfsd_open_break_lease(inode, may_flags);
|
2006-10-20 13:28:58 +07:00
|
|
|
if (host_err) /* NOMEM or WOULDBLOCK */
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out_nfserr;
|
|
|
|
|
2012-03-19 09:44:49 +07:00
|
|
|
if (may_flags & NFSD_MAY_WRITE) {
|
|
|
|
if (may_flags & NFSD_MAY_READ)
|
2006-06-30 15:56:17 +07:00
|
|
|
flags = O_RDWR|O_LARGEFILE;
|
|
|
|
else
|
|
|
|
flags = O_WRONLY|O_LARGEFILE;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
2012-03-19 09:44:49 +07:00
|
|
|
|
2014-09-03 07:14:06 +07:00
|
|
|
file = dentry_open(&path, flags, current_cred());
|
|
|
|
if (IS_ERR(file)) {
|
|
|
|
host_err = PTR_ERR(file);
|
|
|
|
goto out_nfserr;
|
|
|
|
}
|
|
|
|
|
2014-10-12 21:13:55 +07:00
|
|
|
host_err = ima_file_check(file, may_flags, 0);
|
2014-09-03 07:14:06 +07:00
|
|
|
if (host_err) {
|
2015-04-28 20:41:16 +07:00
|
|
|
fput(file);
|
2014-09-03 07:14:06 +07:00
|
|
|
goto out_nfserr;
|
2012-03-19 09:44:50 +07:00
|
|
|
}
|
|
|
|
|
2014-09-03 07:14:06 +07:00
|
|
|
if (may_flags & NFSD_MAY_64BIT_COOKIE)
|
|
|
|
file->f_mode |= FMODE_64BITHASH;
|
|
|
|
else
|
|
|
|
file->f_mode |= FMODE_32BITHASH;
|
|
|
|
|
|
|
|
*filp = file;
|
2005-04-17 05:20:36 +07:00
|
|
|
out_nfserr:
|
2006-10-20 13:28:58 +07:00
|
|
|
err = nfserrno(host_err);
|
2005-04-17 05:20:36 +07:00
|
|
|
out:
|
2009-09-02 15:13:40 +07:00
|
|
|
validate_process_creds();
|
2005-04-17 05:20:36 +07:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2015-06-18 21:44:58 +07:00
|
|
|
struct raparms *
|
|
|
|
nfsd_init_raparms(struct file *file)
|
2005-04-17 05:20:36 +07:00
|
|
|
{
|
2015-06-18 21:44:58 +07:00
|
|
|
struct inode *inode = file_inode(file);
|
|
|
|
dev_t dev = inode->i_sb->s_dev;
|
|
|
|
ino_t ino = inode->i_ino;
|
2005-04-17 05:20:36 +07:00
|
|
|
struct raparms *ra, **rap, **frap = NULL;
|
|
|
|
int depth = 0;
|
2006-10-04 16:15:49 +07:00
|
|
|
unsigned int hash;
|
|
|
|
struct raparm_hbucket *rab;
|
|
|
|
|
|
|
|
hash = jhash_2words(dev, ino, 0xfeedbeef) & RAPARM_HASH_MASK;
|
|
|
|
rab = &raparm_hash[hash];
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2006-10-04 16:15:49 +07:00
|
|
|
spin_lock(&rab->pb_lock);
|
|
|
|
for (rap = &rab->pb_head; (ra = *rap); rap = &ra->p_next) {
|
2005-04-17 05:20:36 +07:00
|
|
|
if (ra->p_ino == ino && ra->p_dev == dev)
|
|
|
|
goto found;
|
|
|
|
depth++;
|
|
|
|
if (ra->p_count == 0)
|
|
|
|
frap = rap;
|
|
|
|
}
|
2011-02-01 21:16:29 +07:00
|
|
|
depth = nfsdstats.ra_size;
|
2005-04-17 05:20:36 +07:00
|
|
|
if (!frap) {
|
2006-10-04 16:15:49 +07:00
|
|
|
spin_unlock(&rab->pb_lock);
|
2005-04-17 05:20:36 +07:00
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
rap = frap;
|
|
|
|
ra = *frap;
|
|
|
|
ra->p_dev = dev;
|
|
|
|
ra->p_ino = ino;
|
|
|
|
ra->p_set = 0;
|
2006-10-04 16:15:49 +07:00
|
|
|
ra->p_hindex = hash;
|
2005-04-17 05:20:36 +07:00
|
|
|
found:
|
2006-10-04 16:15:49 +07:00
|
|
|
if (rap != &rab->pb_head) {
|
2005-04-17 05:20:36 +07:00
|
|
|
*rap = ra->p_next;
|
2006-10-04 16:15:49 +07:00
|
|
|
ra->p_next = rab->pb_head;
|
|
|
|
rab->pb_head = ra;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
ra->p_count++;
|
|
|
|
nfsdstats.ra_depth[depth*10/nfsdstats.ra_size]++;
|
2006-10-04 16:15:49 +07:00
|
|
|
spin_unlock(&rab->pb_lock);
|
2015-06-18 21:44:58 +07:00
|
|
|
|
|
|
|
if (ra->p_set)
|
|
|
|
file->f_ra = ra->p_ra;
|
2005-04-17 05:20:36 +07:00
|
|
|
return ra;
|
|
|
|
}
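The bucket walk above uses a pointer-to-pointer cursor so that a hit can be unlinked and pushed to the head of its chain without tracking a separate "previous" node. A minimal sketch of just that move-to-front step, assuming a plain singly linked list; `struct sk_node` and `sk_find_move_to_front()` are hypothetical names, and the locking, hashing and free-slot reuse of the real function are left out.

```c
#include <assert.h>
#include <stddef.h>

struct sk_node {
    int key;
    struct sk_node *next;
};

/*
 * Look up `key` in the list at *head. On a hit that is not already first,
 * unlink the node and reinsert it at the head. Returns the node, or NULL
 * if the key is absent (the list is then left unchanged).
 */
static struct sk_node *sk_find_move_to_front(struct sk_node **head, int key)
{
    struct sk_node **pp, *n;

    assert(head != NULL);
    /* pp always points at the link that leads to the current node. */
    for (pp = head; (n = *pp) != NULL; pp = &n->next) {
        if (n->key != key)
            continue;
        if (pp != head) {
            *pp = n->next;   /* unlink from its current position */
            n->next = *head; /* old head becomes the second node */
            *head = n;       /* hit becomes the new head */
        }
        return n;
    }
    return NULL;
}
```

Recently used entries therefore cluster near the head, which keeps repeated lookups for the same file short.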
|
|
|
|
|
2015-06-18 21:44:58 +07:00
|
|
|
void nfsd_put_raparams(struct file *file, struct raparms *ra)
|
|
|
|
{
|
|
|
|
struct raparm_hbucket *rab = &raparm_hash[ra->p_hindex];
|
|
|
|
|
|
|
|
spin_lock(&rab->pb_lock);
|
|
|
|
ra->p_ra = file->f_ra;
|
|
|
|
ra->p_set = 1;
|
|
|
|
ra->p_count--;
|
|
|
|
spin_unlock(&rab->pb_lock);
|
|
|
|
}
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
/*
|
2007-06-13 02:22:14 +07:00
|
|
|
* Grab and keep cached pages associated with a file in the svc_rqst
|
|
|
|
* so that they can be passed to the network sendmsg/sendpage routines
|
|
|
|
* directly. They will be released after the sending has completed.
|
2005-04-17 05:20:36 +07:00
|
|
|
*/
|
|
|
|
static int
|
2007-06-13 02:22:14 +07:00
|
|
|
nfsd_splice_actor(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
|
|
|
|
struct splice_desc *sd)
|
2005-04-17 05:20:36 +07:00
|
|
|
{
|
2007-06-13 02:22:14 +07:00
|
|
|
struct svc_rqst *rqstp = sd->u.data;
|
2012-12-11 06:01:37 +07:00
|
|
|
struct page **pp = rqstp->rq_next_page;
|
2007-06-13 02:22:14 +07:00
|
|
|
struct page *page = buf->page;
|
|
|
|
size_t size;
|
|
|
|
|
|
|
|
size = sd->len;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
if (rqstp->rq_res.page_len == 0) {
|
|
|
|
get_page(page);
|
2012-12-11 06:01:37 +07:00
|
|
|
put_page(*rqstp->rq_next_page);
|
|
|
|
*(rqstp->rq_next_page++) = page;
|
2007-06-13 02:22:14 +07:00
|
|
|
rqstp->rq_res.page_base = buf->offset;
|
2005-04-17 05:20:36 +07:00
|
|
|
rqstp->rq_res.page_len = size;
|
2006-10-04 16:15:46 +07:00
|
|
|
} else if (page != pp[-1]) {
|
2005-04-17 05:20:36 +07:00
|
|
|
get_page(page);
|
2012-12-11 06:01:37 +07:00
|
|
|
if (*rqstp->rq_next_page)
|
|
|
|
put_page(*rqstp->rq_next_page);
|
|
|
|
*(rqstp->rq_next_page++) = page;
|
2005-04-17 05:20:36 +07:00
|
|
|
rqstp->rq_res.page_len += size;
|
2006-10-04 16:15:46 +07:00
|
|
|
} else
|
2005-04-17 05:20:36 +07:00
|
|
|
rqstp->rq_res.page_len += size;
|
|
|
|
|
|
|
|
return size;
|
|
|
|
}
|
|
|
|
|
2007-06-13 02:22:14 +07:00
|
|
|
static int nfsd_direct_splice_actor(struct pipe_inode_info *pipe,
|
|
|
|
struct splice_desc *sd)
|
|
|
|
{
|
|
|
|
return __splice_from_pipe(pipe, sd, nfsd_splice_actor);
|
|
|
|
}
|
|
|
|
|
2014-06-17 18:44:13 +07:00
|
|
|
static __be32
|
|
|
|
nfsd_finish_read(struct file *file, unsigned long *count, int host_err)
|
2005-04-17 05:20:36 +07:00
|
|
|
{
|
2006-10-20 13:28:58 +07:00
|
|
|
if (host_err >= 0) {
|
|
|
|
nfsdstats.io_read += host_err;
|
|
|
|
*count = host_err;
|
2009-12-18 09:24:21 +07:00
|
|
|
fsnotify_access(file);
|
2014-03-19 04:01:51 +07:00
|
|
|
return 0;
|
2005-04-17 05:20:36 +07:00
|
|
|
} else
|
2014-03-19 04:01:51 +07:00
|
|
|
return nfserrno(host_err);
|
|
|
|
}
|
|
|
|
|
2014-06-17 18:44:13 +07:00
|
|
|
__be32 nfsd_splice_read(struct svc_rqst *rqstp,
|
2014-03-19 04:01:51 +07:00
|
|
|
struct file *file, loff_t offset, unsigned long *count)
|
|
|
|
{
|
|
|
|
struct splice_desc sd = {
|
|
|
|
.len = 0,
|
|
|
|
.total_len = *count,
|
|
|
|
.pos = offset,
|
|
|
|
.u.data = rqstp,
|
|
|
|
};
|
|
|
|
int host_err;
|
|
|
|
|
|
|
|
rqstp->rq_next_page = rqstp->rq_respages + 1;
|
|
|
|
host_err = splice_direct_to_actor(file, &sd, nfsd_direct_splice_actor);
|
|
|
|
return nfsd_finish_read(file, count, host_err);
|
|
|
|
}
|
|
|
|
|
2014-06-17 18:44:13 +07:00
|
|
|
__be32 nfsd_readv(struct file *file, loff_t offset, struct kvec *vec, int vlen,
|
2014-03-19 04:01:51 +07:00
|
|
|
unsigned long *count)
|
|
|
|
{
|
|
|
|
mm_segment_t oldfs;
|
|
|
|
int host_err;
|
|
|
|
|
|
|
|
oldfs = get_fs();
|
|
|
|
set_fs(KERNEL_DS);
|
2016-03-03 22:03:58 +07:00
|
|
|
host_err = vfs_readv(file, (struct iovec __user *)vec, vlen, &offset, 0);
|
2014-03-19 04:01:51 +07:00
|
|
|
set_fs(oldfs);
|
|
|
|
return nfsd_finish_read(file, count, host_err);
|
|
|
|
}
|
|
|
|
|
|
|
|
static __be32
|
|
|
|
nfsd_vfs_read(struct svc_rqst *rqstp, struct file *file,
|
|
|
|
loff_t offset, struct kvec *vec, int vlen, unsigned long *count)
|
|
|
|
{
|
2014-11-19 19:51:18 +07:00
|
|
|
if (file->f_op->splice_read && test_bit(RQ_SPLICE_OK, &rqstp->rq_flags))
|
2014-03-19 04:01:51 +07:00
|
|
|
return nfsd_splice_read(rqstp, file, offset, count);
|
|
|
|
else
|
|
|
|
return nfsd_readv(file, offset, vec, vlen, count);
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
|
2009-06-16 06:03:53 +07:00
|
|
|
/*
|
|
|
|
* Gathered writes: If another process is currently writing to the file,
|
|
|
|
* there's a high chance this is another nfsd (triggered by a bulk write
|
|
|
|
* from a client's biod). Rather than syncing the file with each write
|
|
|
|
* request, we sleep for 10 msec.
|
|
|
|
*
|
|
|
|
* I don't know if this roughly approximates C. Juszczak's idea of
|
|
|
|
* gathered writes, but it's a nice and simple solution (IMHO), and it
|
|
|
|
* seems to work:-)
|
|
|
|
*
|
|
|
|
* Note: we do this only in the NFSv2 case, since v3 and higher have a
|
|
|
|
* better tool (separate unstable writes and commits) for solving this
|
|
|
|
* problem.
|
|
|
|
*/
|
|
|
|
static int wait_for_concurrent_writes(struct file *file)
|
|
|
|
{
|
2013-01-24 05:07:38 +07:00
|
|
|
struct inode *inode = file_inode(file);
|
2009-06-16 06:03:53 +07:00
|
|
|
static ino_t last_ino;
|
|
|
|
static dev_t last_dev;
|
|
|
|
int err = 0;
|
|
|
|
|
|
|
|
if (atomic_read(&inode->i_writecount) > 1
|
|
|
|
|| (last_ino == inode->i_ino && last_dev == inode->i_sb->s_dev)) {
|
|
|
|
dprintk("nfsd: write defer %d\n", task_pid_nr(current));
|
|
|
|
msleep(10);
|
|
|
|
dprintk("nfsd: write resume %d\n", task_pid_nr(current));
|
|
|
|
}
|
|
|
|
|
|
|
|
if (inode->i_state & I_DIRTY) {
|
|
|
|
dprintk("nfsd: write sync %d\n", task_pid_nr(current));
|
2010-03-22 23:32:25 +07:00
|
|
|
err = vfs_fsync(file, 0);
|
2009-06-16 06:03:53 +07:00
|
|
|
}
|
|
|
|
last_ino = inode->i_ino;
|
|
|
|
last_dev = inode->i_sb->s_dev;
|
|
|
|
return err;
|
|
|
|
}
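The "defer or not" decision in the gathered-writes heuristic above can be isolated as a small pure function. This is only a sketch: `struct sk_last_write` and `sk_should_defer()` are hypothetical names, the writer count stands in for `i_writecount`, and the actual sleep and fsync are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Remembers which file the previous write touched. */
struct sk_last_write {
    unsigned long ino;
    unsigned long dev;
};

/*
 * Return true if the caller should pause briefly before syncing: either
 * several writers have the file open, or the previous write went to the
 * same file. The record of the last write is updated in either case.
 */
static bool sk_should_defer(struct sk_last_write *last, int writers,
                            unsigned long ino, unsigned long dev)
{
    bool defer;

    assert(last != NULL);
    defer = writers > 1 || (last->ino == ino && last->dev == dev);
    last->ino = ino;
    last->dev = dev;
    return defer;
}
```

Two back-to-back writes to the same inode make the second one defer, which gives further writes in the same burst a chance to arrive before the file is synced.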
|
|
|
|
|
2015-06-18 21:45:00 +07:00
|
|
|
__be32
|
2005-04-17 05:20:36 +07:00
|
|
|
nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
|
|
|
|
loff_t offset, struct kvec *vec, int vlen,
|
2009-03-06 08:16:14 +07:00
|
|
|
unsigned long *cnt, int *stablep)
|
2005-04-17 05:20:36 +07:00
|
|
|
{
|
|
|
|
struct svc_export *exp;
|
|
|
|
struct inode *inode;
|
|
|
|
mm_segment_t oldfs;
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32 err = 0;
|
|
|
|
int host_err;
|
2005-04-17 05:20:36 +07:00
|
|
|
int stable = *stablep;
|
2009-06-05 23:35:15 +07:00
|
|
|
int use_wgather;
|
2013-03-23 01:18:24 +07:00
|
|
|
loff_t pos = offset;
|
2014-05-12 08:22:47 +07:00
|
|
|
unsigned int pflags = current->flags;
|
2016-04-07 22:52:04 +07:00
|
|
|
int flags = 0;
|
2014-05-12 08:22:47 +07:00
|
|
|
|
2014-11-19 19:51:15 +07:00
|
|
|
if (test_bit(RQ_LOCAL, &rqstp->rq_flags))
|
2014-05-12 08:22:47 +07:00
|
|
|
/*
|
|
|
|
* We want less throttling in balance_dirty_pages()
|
|
|
|
* and shrink_inactive_list() so that nfs to
|
|
|
|
* localhost doesn't cause nfsd to lock up due to all
|
|
|
|
* the client's dirty pages or its congested queue.
|
|
|
|
*/
|
|
|
|
current->flags |= PF_LESS_THROTTLE;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2014-10-31 13:42:53 +07:00
|
|
|
inode = file_inode(file);
|
2005-04-17 05:20:36 +07:00
|
|
|
exp = fhp->fh_export;
|
|
|
|
|
2009-06-05 23:35:15 +07:00
|
|
|
use_wgather = (rqstp->rq_vers == 2) && EX_WGATHER(exp);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
if (!EX_ISSYNC(exp))
|
|
|
|
stable = 0;
|
|
|
|
|
2016-04-07 22:52:04 +07:00
|
|
|
if (stable && !use_wgather)
|
|
|
|
flags |= RWF_SYNC;
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
/* Write the data. */
|
|
|
|
oldfs = get_fs(); set_fs(KERNEL_DS);
|
2016-04-07 22:52:04 +07:00
|
|
|
host_err = vfs_writev(file, (struct iovec __user *)vec, vlen, &pos, flags);
|
2005-04-17 05:20:36 +07:00
|
|
|
set_fs(oldfs);
|
2009-06-16 09:07:13 +07:00
|
|
|
if (host_err < 0)
|
|
|
|
goto out_nfserr;
|
|
|
|
*cnt = host_err;
|
|
|
|
nfsdstats.io_write += host_err;
|
2009-12-18 09:24:21 +07:00
|
|
|
fsnotify_modify(file);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2016-04-07 22:52:04 +07:00
|
|
|
if (stable && use_wgather)
|
|
|
|
host_err = wait_for_concurrent_writes(file);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2009-06-16 09:07:13 +07:00
|
|
|
out_nfserr:
|
2006-10-20 13:28:58 +07:00
|
|
|
dprintk("nfsd: write complete host_err=%d\n", host_err);
|
2009-05-19 11:03:15 +07:00
|
|
|
if (host_err >= 0)
|
2005-04-17 05:20:36 +07:00
|
|
|
err = 0;
|
2009-05-19 11:03:15 +07:00
|
|
|
else
|
2006-10-20 13:28:58 +07:00
|
|
|
err = nfserrno(host_err);
|
2014-11-19 19:51:15 +07:00
|
|
|
if (test_bit(RQ_LOCAL, &rqstp->rq_flags))
|
2014-05-12 08:22:47 +07:00
|
|
|
tsk_restore_flags(current, pflags, PF_LESS_THROTTLE);
|
2005-04-17 05:20:36 +07:00
|
|
|
return err;
|
|
|
|
}
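The way nfsd_vfs_write() combines the stable flag, the export's sync setting and write gathering can be written as a small pure function. This is a sketch under assumptions: `struct write_plan` and `plan_write()` are hypothetical names, and `use_wgather` is taken as an input rather than derived from the request version and export flags.

```c
#include <stdbool.h>

/* Hypothetical result type: what the write path will do. */
struct write_plan {
	bool sync_write;	/* pass a sync flag to the write itself */
	bool gather;		/* write async, then wait and fsync */
};

/*
 * Mirror the decision order in nfsd_vfs_write(): an export without
 * "sync" downgrades every write to unstable; a stable write is then
 * either issued synchronously or handed to the gathering path.
 */
static struct write_plan plan_write(bool stable, bool export_sync,
				    bool use_wgather)
{
	struct write_plan plan = { false, false };

	if (!export_sync)
		stable = false;
	if (stable && !use_wgather)
		plan.sync_write = true;
	if (stable && use_wgather)
		plan.gather = true;
	return plan;
}
```

The two outcomes are mutually exclusive: a write is never both issued with a sync flag and sent through the gathering path.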
|
|
|
|
|
2014-03-19 04:01:51 +07:00
|
|
|
/*
|
|
|
|
* Read data from a file. count must contain the requested read count
|
|
|
|
* on entry. On return, *count contains the number of bytes actually read.
|
|
|
|
* N.B. After this call fhp needs an fh_put
|
|
|
|
*/
|
|
|
|
__be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
|
|
|
|
loff_t offset, struct kvec *vec, int vlen, unsigned long *count)
|
|
|
|
{
|
|
|
|
struct file *file;
|
|
|
|
struct raparms *ra;
|
|
|
|
__be32 err;
|
|
|
|
|
2015-11-17 18:52:23 +07:00
|
|
|
trace_read_start(rqstp, fhp, offset, vlen);
|
2015-06-18 21:44:58 +07:00
|
|
|
err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_READ, &file);
|
2014-03-19 04:01:51 +07:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
2015-06-18 21:44:58 +07:00
|
|
|
ra = nfsd_init_raparms(file);
|
2015-11-17 18:52:23 +07:00
|
|
|
|
|
|
|
trace_read_opened(rqstp, fhp, offset, vlen);
|
2014-03-19 04:01:51 +07:00
|
|
|
err = nfsd_vfs_read(rqstp, file, offset, vec, vlen, count);
|
2015-11-17 18:52:23 +07:00
|
|
|
trace_read_io_done(rqstp, fhp, offset, vlen);
|
|
|
|
|
2015-06-18 21:44:58 +07:00
|
|
|
if (ra)
|
|
|
|
nfsd_put_raparams(file, ra);
|
|
|
|
fput(file);
|
2014-03-19 04:01:51 +07:00
|
|
|
|
2015-11-17 18:52:23 +07:00
|
|
|
trace_read_done(rqstp, fhp, offset, vlen);
|
|
|
|
|
2010-07-28 03:48:54 +07:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
/*
|
|
|
|
* Write data to a file.
|
|
|
|
* The stable flag requests synchronous writes.
|
|
|
|
* N.B. After this call fhp needs an fh_put
|
|
|
|
*/
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2005-04-17 05:20:36 +07:00
|
|
|
nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
|
2009-03-06 08:16:14 +07:00
|
|
|
loff_t offset, struct kvec *vec, int vlen, unsigned long *cnt,
|
2005-04-17 05:20:36 +07:00
|
|
|
int *stablep)
|
|
|
|
{
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32 err = 0;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2015-11-17 18:52:23 +07:00
|
|
|
trace_write_start(rqstp, fhp, offset, vlen);
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
if (file) {
|
2007-07-17 18:04:48 +07:00
|
|
|
err = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry,
|
2008-06-16 18:20:29 +07:00
|
|
|
NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (err)
|
|
|
|
goto out;
|
2015-11-17 18:52:23 +07:00
|
|
|
trace_write_opened(rqstp, fhp, offset, vlen);
|
2005-04-17 05:20:36 +07:00
|
|
|
err = nfsd_vfs_write(rqstp, fhp, file, offset, vec, vlen, cnt,
|
|
|
|
stablep);
|
2015-11-17 18:52:23 +07:00
|
|
|
trace_write_io_done(rqstp, fhp, offset, vlen);
|
2005-04-17 05:20:36 +07:00
|
|
|
} else {
|
2008-06-16 18:20:29 +07:00
|
|
|
err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_WRITE, &file);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
2015-11-17 18:52:23 +07:00
|
|
|
trace_write_opened(rqstp, fhp, offset, vlen);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (cnt)
|
|
|
|
err = nfsd_vfs_write(rqstp, fhp, file, offset, vec, vlen,
|
|
|
|
cnt, stablep);
|
2015-11-17 18:52:23 +07:00
|
|
|
trace_write_io_done(rqstp, fhp, offset, vlen);
|
2015-04-28 20:41:16 +07:00
|
|
|
fput(file);
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
out:
|
2015-11-17 18:52:23 +07:00
|
|
|
trace_write_done(rqstp, fhp, offset, vlen);
|
2005-04-17 05:20:36 +07:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
#ifdef CONFIG_NFSD_V3
|
|
|
|
/*
|
|
|
|
* Commit all pending writes to stable storage.
|
2010-01-30 04:44:25 +07:00
|
|
|
*
|
|
|
|
* Note: we only guarantee that data that lies within the range specified
|
|
|
|
* by the 'offset' and 'count' parameters will be synced.
|
2005-04-17 05:20:36 +07:00
|
|
|
*
|
|
|
|
* Unfortunately we cannot lock the file to make sure we return full WCC
|
|
|
|
* data to the client, as locking happens lower down in the filesystem.
|
|
|
|
*/
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2005-04-17 05:20:36 +07:00
|
|
|
nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp,
|
|
|
|
loff_t offset, unsigned long count)
|
|
|
|
{
|
|
|
|
struct file *file;
|
2010-01-30 04:44:25 +07:00
|
|
|
loff_t end = LLONG_MAX;
|
|
|
|
__be32 err = nfserr_inval;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2010-01-30 04:44:25 +07:00
|
|
|
if (offset < 0)
|
|
|
|
goto out;
|
|
|
|
if (count != 0) {
|
|
|
|
end = offset + (loff_t)count - 1;
|
|
|
|
if (end < offset)
|
|
|
|
goto out;
|
|
|
|
}
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2010-03-19 19:06:28 +07:00
|
|
|
err = nfsd_open(rqstp, fhp, S_IFREG,
|
|
|
|
NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &file);
|
2008-06-16 18:20:29 +07:00
|
|
|
if (err)
|
2010-01-30 04:44:25 +07:00
|
|
|
goto out;
|
2005-04-17 05:20:36 +07:00
|
|
|
if (EX_ISSYNC(fhp->fh_export)) {
|
2010-03-22 23:32:25 +07:00
|
|
|
int err2 = vfs_fsync_range(file, offset, end, 0);
|
2010-01-30 04:44:25 +07:00
|
|
|
|
|
|
|
if (err2 != -EINVAL)
|
|
|
|
err = nfserrno(err2);
|
|
|
|
else
|
2005-04-17 05:20:36 +07:00
|
|
|
err = nfserr_notsupp;
|
|
|
|
}
|
|
|
|
|
2015-04-28 20:41:16 +07:00
|
|
|
fput(file);
|
2010-01-30 04:44:25 +07:00
|
|
|
out:
|
2005-04-17 05:20:36 +07:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
#endif /* CONFIG_NFSD_V3 */
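The range validation at the top of nfsd_commit() can be illustrated in isolation. This is a hedged userspace sketch: `commit_range_end()` is a hypothetical helper, and unlike the kernel code, which adds first and then checks `end < offset`, it tests before adding because signed overflow is undefined in ISO C.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the inclusive end of a commit range. A count of zero means
 * "to end of file" and is mapped to INT64_MAX. Returns false for a
 * negative offset or a range that would overflow.
 */
static bool commit_range_end(int64_t offset, uint64_t count, int64_t *end)
{
	if (offset < 0)
		return false;
	if (count == 0) {
		*end = INT64_MAX;
		return true;
	}
	/* Reject ranges whose last byte would lie past INT64_MAX. */
	if (count - 1 > (uint64_t)(INT64_MAX - offset))
		return false;
	*end = offset + (int64_t)(count - 1);
	return true;
}
```

A caller would treat a false return the way nfsd_commit() treats its early exits, by reporting an invalid-argument error without opening the file.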
|
|
|
|
|
2008-02-14 04:30:26 +07:00
|
|
|
static __be32
|
2007-12-01 04:55:23 +07:00
|
|
|
nfsd_create_setattr(struct svc_rqst *rqstp, struct svc_fh *resfhp,
|
|
|
|
struct iattr *iap)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Mode has already been set earlier in create:
|
|
|
|
*/
|
|
|
|
iap->ia_valid &= ~ATTR_MODE;
|
|
|
|
/*
|
|
|
|
* Setting uid/gid works only for root. Irix appears to
|
|
|
|
* send along the gid on create when it tries to implement
|
|
|
|
* setgid directories via NFS:
|
|
|
|
*/
|
2013-02-02 21:53:11 +07:00
|
|
|
if (!uid_eq(current_fsuid(), GLOBAL_ROOT_UID))
|
2007-12-01 04:55:23 +07:00
|
|
|
iap->ia_valid &= ~(ATTR_UID|ATTR_GID);
|
|
|
|
if (iap->ia_valid)
|
|
|
|
return nfsd_setattr(rqstp, resfhp, iap, 0, (time_t)0);
|
2014-07-02 05:27:53 +07:00
|
|
|
/* Callers expect file metadata to be committed here */
|
2014-07-03 18:54:19 +07:00
|
|
|
return nfserrno(commit_metadata(resfhp));
|
2007-12-01 04:55:23 +07:00
|
|
|
}
|
|
|
|
|
2009-02-10 10:27:51 +07:00
|
|
|
/* An HPUX client sometimes creates a file in mode 000 and sets its size to 0.
|
|
|
|
* Setting the size to 0 may fail on some filesystems because the permission
|
|
|
|
* check requires WRITE permission while the mode is 000.
|
|
|
|
* We ignore the resize (to 0) on a newly created file, since its size is
|
|
|
|
* already 0 after creation.
|
|
|
|
*
|
|
|
|
* Call this only after vfs_create() has been called.
|
|
|
|
*/
|
|
|
|
static void
|
|
|
|
nfsd_check_ignore_resizing(struct iattr *iap)
|
|
|
|
{
|
|
|
|
if ((iap->ia_valid & ATTR_SIZE) && (iap->ia_size == 0))
|
|
|
|
iap->ia_valid &= ~ATTR_SIZE;
|
|
|
|
}
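Taken together, nfsd_create_setattr() and nfsd_check_ignore_resizing() prune the attribute mask before anything is applied to a freshly created file. The sketch below shows that pruning on its own. The `A_*` bits and `post_create_valid()` are hypothetical; the kernel's real ATTR_* values are deliberately not assumed.

```c
#include <stdint.h>

/* Hypothetical flag bits; the kernel's ATTR_* values are not assumed here. */
#define A_MODE (1u << 0)
#define A_UID  (1u << 1)
#define A_GID  (1u << 2)
#define A_SIZE (1u << 3)

/*
 * Return the attributes still worth setting after a create:
 *  - the mode was already applied by the create itself,
 *  - only root may set the owner or group,
 *  - truncating a brand-new file to 0 is a no-op and is dropped.
 */
static uint32_t post_create_valid(uint32_t valid, int is_root, uint64_t size)
{
	valid &= ~A_MODE;
	if (!is_root)
		valid &= ~(A_UID | A_GID);
	if ((valid & A_SIZE) && size == 0)
		valid &= ~A_SIZE;
	return valid;
}
```

When the result is zero there is nothing left to set, which corresponds to the path where the original only commits metadata.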
|
|
|
|
|
2016-07-21 03:16:06 +07:00
|
|
|
/* The parent directory should already be locked: */
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2016-07-21 03:16:06 +07:00
|
|
|
nfsd_create_locked(struct svc_rqst *rqstp, struct svc_fh *fhp,
|
2005-04-17 05:20:36 +07:00
|
|
|
char *fname, int flen, struct iattr *iap,
|
|
|
|
int type, dev_t rdev, struct svc_fh *resfhp)
|
|
|
|
{
|
2016-08-04 02:05:00 +07:00
|
|
|
struct dentry *dentry, *dchild;
|
2005-04-17 05:20:36 +07:00
|
|
|
struct inode *dirp;
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32 err;
|
2007-12-01 04:55:23 +07:00
|
|
|
__be32 err2;
|
2006-10-20 13:28:58 +07:00
|
|
|
int host_err;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
dentry = fhp->fh_dentry;
|
2015-03-18 05:25:59 +07:00
|
|
|
dirp = d_inode(dentry);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2016-07-21 03:16:06 +07:00
|
|
|
dchild = dget(resfhp->fh_dentry);
|
|
|
|
if (!fhp->fh_locked) {
|
|
|
|
WARN_ONCE(1, "nfsd_create: parent %pd2 not locked!\n",
|
2013-09-16 21:57:01 +07:00
|
|
|
dentry);
|
2016-07-21 03:16:06 +07:00
|
|
|
err = nfserr_io;
|
|
|
|
goto out;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
|
2016-07-15 10:20:22 +07:00
|
|
|
err = nfsd_permission(rqstp, fhp->fh_export, dentry, NFSD_MAY_CREATE);
|
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
if (!(iap->ia_valid & ATTR_MODE))
|
|
|
|
iap->ia_mode = 0;
|
|
|
|
iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type;
|
|
|
|
|
2006-11-09 08:44:59 +07:00
|
|
|
err = 0;
|
2012-06-12 21:20:33 +07:00
|
|
|
host_err = 0;
|
2005-04-17 05:20:36 +07:00
|
|
|
switch (type) {
|
|
|
|
case S_IFREG:
|
2012-06-11 05:09:36 +07:00
|
|
|
host_err = vfs_create(dirp, dchild, iap->ia_mode, true);
|
2009-02-10 10:27:51 +07:00
|
|
|
if (!host_err)
|
|
|
|
nfsd_check_ignore_resizing(iap);
|
2005-04-17 05:20:36 +07:00
|
|
|
break;
|
|
|
|
case S_IFDIR:
|
2006-10-20 13:28:58 +07:00
|
|
|
host_err = vfs_mkdir(dirp, dchild, iap->ia_mode);
|
2005-04-17 05:20:36 +07:00
|
|
|
break;
|
|
|
|
case S_IFCHR:
|
|
|
|
case S_IFBLK:
|
|
|
|
case S_IFIFO:
|
|
|
|
case S_IFSOCK:
|
2006-10-20 13:28:58 +07:00
|
|
|
host_err = vfs_mknod(dirp, dchild, iap->ia_mode, rdev);
|
2005-04-17 05:20:36 +07:00
|
|
|
break;
|
2016-07-22 23:03:46 +07:00
|
|
|
default:
|
|
|
|
printk(KERN_WARNING "nfsd: bad file type %o in nfsd_create\n",
|
|
|
|
type);
|
|
|
|
host_err = -EINVAL;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
2012-06-12 21:20:33 +07:00
|
|
|
if (host_err < 0)
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out_nfserr;
|
|
|
|
|
2010-02-18 03:05:11 +07:00
|
|
|
err = nfsd_create_setattr(rqstp, resfhp, iap);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2010-02-18 03:05:11 +07:00
|
|
|
/*
|
2014-07-02 05:27:53 +07:00
|
|
|
* nfsd_create_setattr already committed the child. Transactional
|
|
|
|
* filesystems had a chance to commit changes for both parent and
|
2016-07-21 03:16:06 +07:00
|
|
|
* child simultaneously making the following commit_metadata a
|
2014-07-02 05:27:53 +07:00
|
|
|
* noop.
|
2010-02-18 03:05:11 +07:00
|
|
|
*/
|
|
|
|
err2 = nfserrno(commit_metadata(fhp));
|
2007-12-01 04:55:23 +07:00
|
|
|
if (err2)
|
|
|
|
err = err2;
|
2005-04-17 05:20:36 +07:00
|
|
|
/*
|
|
|
|
* Update the file handle to get the new inode info.
|
|
|
|
*/
|
|
|
|
if (!err)
|
|
|
|
err = fh_update(resfhp);
|
|
|
|
out:
|
2016-08-04 02:05:00 +07:00
|
|
|
dput(dchild);
|
2005-04-17 05:20:36 +07:00
|
|
|
return err;
|
|
|
|
|
|
|
|
out_nfserr:
|
2006-10-20 13:28:58 +07:00
|
|
|
err = nfserrno(host_err);
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out;
|
|
|
|
}
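nfsd_create_locked() builds the final mode by discarding any file-type bits the client supplied and substituting the requested type. A minimal sketch of that step follows, assuming the Linux octal values for the type bits and for S_IALLUGO (07777). `create_mode()` and the `MODE_*`/`TYPE_*` names are hypothetical.

```c
/* Permission, setuid, setgid and sticky bits; assumed equal to S_IALLUGO. */
#define MODE_ALLUGO 07777u
/* File-type bits as used on Linux (S_IFREG, S_IFDIR). */
#define TYPE_REG 0100000u
#define TYPE_DIR 0040000u

/*
 * If the client sent no mode, start from 0. Keep only the permission
 * bits of the request, then OR in the type chosen by the caller.
 */
static unsigned int create_mode(int mode_valid, unsigned int requested,
				unsigned int type)
{
	if (!mode_valid)
		requested = 0;
	return (requested & MODE_ALLUGO) | type;
}
```

Masking first means a client cannot smuggle a different file type into the mode word; the type always comes from the `type` argument.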
|
|
|
|
|
2016-07-21 03:16:06 +07:00
|
|
|
/*
|
|
|
|
* Create a filesystem object (regular, directory, special).
|
|
|
|
* Note that the parent directory is left locked.
|
|
|
|
*
|
|
|
|
* N.B. Every call to nfsd_create needs an fh_put for _both_ fhp and resfhp
|
|
|
|
*/
|
|
|
|
__be32
|
|
|
|
nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
|
|
|
|
char *fname, int flen, struct iattr *iap,
|
|
|
|
int type, dev_t rdev, struct svc_fh *resfhp)
|
|
|
|
{
|
|
|
|
struct dentry *dentry, *dchild = NULL;
|
|
|
|
struct inode *dirp;
|
|
|
|
__be32 err;
|
|
|
|
int host_err;
|
|
|
|
|
|
|
|
if (isdotent(fname, flen))
|
|
|
|
return nfserr_exist;
|
|
|
|
|
2016-07-22 03:00:12 +07:00
|
|
|
err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_NOP);
|
2016-07-21 03:16:06 +07:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
dentry = fhp->fh_dentry;
|
|
|
|
dirp = d_inode(dentry);
|
|
|
|
|
|
|
|
host_err = fh_want_write(fhp);
|
|
|
|
if (host_err)
|
|
|
|
return nfserrno(host_err);
|
|
|
|
|
|
|
|
fh_lock_nested(fhp, I_MUTEX_PARENT);
|
|
|
|
dchild = lookup_one_len(fname, dentry, flen);
|
|
|
|
host_err = PTR_ERR(dchild);
|
|
|
|
if (IS_ERR(dchild))
|
|
|
|
return nfserrno(host_err);
|
|
|
|
err = fh_compose(resfhp, fhp->fh_export, dchild, fhp);
|
2016-08-11 01:46:27 +07:00
|
|
|
/*
|
|
|
|
* We unconditionally drop our ref to dchild as fh_compose will have
|
|
|
|
* already grabbed its own ref for it.
|
|
|
|
*/
|
|
|
|
dput(dchild);
|
|
|
|
if (err)
|
2016-07-21 03:16:06 +07:00
|
|
|
return err;
|
|
|
|
return nfsd_create_locked(rqstp, fhp, fname, flen, iap, type,
|
|
|
|
rdev, resfhp);
|
|
|
|
}
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
#ifdef CONFIG_NFSD_V3
|
2011-04-20 16:06:25 +07:00
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
/*
|
2011-04-20 16:06:25 +07:00
|
|
|
* NFSv3 and NFSv4 version of nfsd_create
|
2005-04-17 05:20:36 +07:00
|
|
|
*/
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2011-04-20 16:06:25 +07:00
|
|
|
do_nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
|
2005-04-17 05:20:36 +07:00
|
|
|
char *fname, int flen, struct iattr *iap,
|
|
|
|
struct svc_fh *resfhp, int createmode, u32 *verifier,
|
2011-10-13 22:37:11 +07:00
|
|
|
bool *truncp, bool *created)
|
2005-04-17 05:20:36 +07:00
|
|
|
{
|
|
|
|
struct dentry *dentry, *dchild = NULL;
|
|
|
|
struct inode *dirp;
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32 err;
|
|
|
|
int host_err;
|
2005-04-17 05:20:36 +07:00
|
|
|
__u32 v_mtime = 0, v_atime = 0;
|
|
|
|
|
|
|
|
err = nfserr_perm;
|
|
|
|
if (!flen)
|
|
|
|
goto out;
|
|
|
|
err = nfserr_exist;
|
|
|
|
if (isdotent(fname, flen))
|
|
|
|
goto out;
|
|
|
|
if (!(iap->ia_valid & ATTR_MODE))
|
|
|
|
iap->ia_mode = 0;
|
2011-04-20 19:09:35 +07:00
|
|
|
err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
dentry = fhp->fh_dentry;
|
2015-03-18 05:25:59 +07:00
|
|
|
dirp = d_inode(dentry);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2012-06-12 21:20:33 +07:00
|
|
|
host_err = fh_want_write(fhp);
|
|
|
|
if (host_err)
|
|
|
|
goto out_nfserr;
|
|
|
|
|
2006-10-02 16:18:03 +07:00
|
|
|
fh_lock_nested(fhp, I_MUTEX_PARENT);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Compose the response file handle.
|
|
|
|
*/
|
|
|
|
dchild = lookup_one_len(fname, dentry, flen);
|
2006-10-20 13:28:58 +07:00
|
|
|
host_err = PTR_ERR(dchild);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (IS_ERR(dchild))
|
|
|
|
goto out_nfserr;
|
|
|
|
|
2011-04-20 19:09:35 +07:00
|
|
|
/* If file doesn't exist, check for permissions to create one */
|
2015-03-18 05:25:59 +07:00
|
|
|
if (d_really_is_negative(dchild)) {
|
2011-04-20 19:09:35 +07:00
|
|
|
err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
|
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
err = fh_compose(resfhp, fhp->fh_export, dchild, fhp);
|
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
2011-04-20 16:06:25 +07:00
|
|
|
if (nfsd_create_is_exclusive(createmode)) {
|
2007-01-26 15:57:00 +07:00
|
|
|
/* solaris7 gets confused (bugid 4218508) if these have
|
2007-07-31 14:37:51 +07:00
|
|
|
* the high bit set, so just clear the high bits. If this is
|
|
|
|
* ever changed to use different attrs for storing the
|
|
|
|
* verifier, then do_open_lookup() will also need to be fixed
|
|
|
|
* accordingly.
|
2005-04-17 05:20:36 +07:00
|
|
|
*/
|
|
|
|
v_mtime = verifier[0]&0x7fffffff;
|
|
|
|
v_atime = verifier[1]&0x7fffffff;
|
|
|
|
}
|
|
|
|
|
2015-03-18 05:25:59 +07:00
|
|
|
if (d_really_is_positive(dchild)) {
|
2005-04-17 05:20:36 +07:00
|
|
|
err = 0;
|
|
|
|
|
|
|
|
switch (createmode) {
|
|
|
|
case NFS3_CREATE_UNCHECKED:
|
VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
Convert the following where appropriate:
(1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).
(2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).
(3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
complicated than it appears as some calls should be converted to
d_can_lookup() instead. The difference is whether the directory in
question is a real dir with a ->lookup op or whether it's a fake dir with
a ->d_automount op.
In some circumstances, we can subsume checks for dentry->d_inode not being
NULL into this, provided the code isn't in a filesystem that expects
d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
use d_inode() rather than d_backing_inode() to get the inode pointer).
Note that the dentry type field may be set to something other than
DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
manages the fall-through from a negative dentry to a lower layer. In such a
case, the dentry type of the negative union dentry is set to the same as the
type of the lower dentry.
However, if you know d_inode is not NULL at the call site, then you can use
the d_is_xxx() functions even in a filesystem.
There is one further complication: a 0,0 chardev dentry may be labelled
DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
intended for special directory entry types that don't have attached inodes.
The following perl+coccinelle script was used:
use strict;
my @callers;
open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
die "Can't grep for S_ISDIR and co. callers";
@callers = <$fd>;
close($fd);
unless (@callers) {
print "No matches\n";
exit(0);
}
my @cocci = (
'@@',
'expression E;',
'@@',
'',
'- S_ISLNK(E->d_inode->i_mode)',
'+ d_is_symlink(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISDIR(E->d_inode->i_mode)',
'+ d_is_dir(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISREG(E->d_inode->i_mode)',
'+ d_is_reg(E)' );
my $coccifile = "tmp.sp.cocci";
open($fd, ">$coccifile") || die $coccifile;
print($fd "$_\n") || die $coccifile foreach (@cocci);
close($fd);
foreach my $file (@callers) {
chomp $file;
print "Processing ", $file, "\n";
system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
die "spatch failed";
}
[AV: overlayfs parts skipped]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-01-29 19:02:35 +07:00
|
|
|
if (!d_is_reg(dchild))
|
2012-04-10 05:06:49 +07:00
|
|
|
goto out;
|
2005-04-17 05:20:36 +07:00
|
|
|
else if (truncp) {
|
|
|
|
/* In NFSv4, we need to treat this case a little
|
|
|
|
* differently. We don't want to truncate the
|
|
|
|
* file now; this would be wrong if the OPEN
|
|
|
|
* fails for some other reason. Furthermore,
|
|
|
|
* if the size is nonzero, we should ignore it
|
|
|
|
* according to spec!
|
|
|
|
*/
|
|
|
|
*truncp = (iap->ia_valid & ATTR_SIZE) && !iap->ia_size;
|
|
|
|
}
|
|
|
|
else {
|
|
|
|
iap->ia_valid &= ATTR_SIZE;
|
|
|
|
goto set_attr;
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
case NFS3_CREATE_EXCLUSIVE:
|
2015-03-18 05:25:59 +07:00
|
|
|
if ( d_inode(dchild)->i_mtime.tv_sec == v_mtime
|
|
|
|
&& d_inode(dchild)->i_atime.tv_sec == v_atime
|
|
|
|
&& d_inode(dchild)->i_size == 0 ) {
|
2012-12-08 03:40:55 +07:00
|
|
|
if (created)
|
|
|
|
*created = 1;
|
2005-04-17 05:20:36 +07:00
|
|
|
break;
|
2012-12-08 03:40:55 +07:00
|
|
|
}
/* fallthru */
|
2011-04-20 16:06:25 +07:00
|
|
|
case NFS4_CREATE_EXCLUSIVE4_1:
|
2015-03-18 05:25:59 +07:00
|
|
|
if ( d_inode(dchild)->i_mtime.tv_sec == v_mtime
|
|
|
|
&& d_inode(dchild)->i_atime.tv_sec == v_atime
|
|
|
|
&& d_inode(dchild)->i_size == 0 ) {
|
2012-12-08 03:40:55 +07:00
|
|
|
if (created)
|
|
|
|
*created = 1;
|
2011-04-20 16:06:25 +07:00
|
|
|
goto set_attr;
|
2012-12-08 03:40:55 +07:00
|
|
|
}
|
2005-04-17 05:20:36 +07:00
|
|
|
/* fallthru */
|
|
|
|
case NFS3_CREATE_GUARDED:
|
|
|
|
err = nfserr_exist;
|
|
|
|
}
|
2011-11-24 00:03:18 +07:00
|
|
|
fh_drop_write(fhp);
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2012-06-11 05:09:36 +07:00
|
|
|
host_err = vfs_create(dirp, dchild, iap->ia_mode, true);
|
2008-02-16 05:37:57 +07:00
|
|
|
if (host_err < 0) {
|
2011-11-24 00:03:18 +07:00
|
|
|
fh_drop_write(fhp);
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out_nfserr;
|
2008-02-16 05:37:57 +07:00
|
|
|
}
|
2006-11-09 08:44:40 +07:00
|
|
|
if (created)
|
|
|
|
*created = 1;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2009-02-10 10:27:51 +07:00
|
|
|
nfsd_check_ignore_resizing(iap);
|
|
|
|
|
2011-04-20 16:06:25 +07:00
|
|
|
if (nfsd_create_is_exclusive(createmode)) {
|
2007-01-26 15:57:00 +07:00
|
|
|
/* Cram the verifier into atime/mtime */
|
2005-04-17 05:20:36 +07:00
|
|
|
iap->ia_valid = ATTR_MTIME|ATTR_ATIME
|
2007-01-26 15:57:00 +07:00
|
|
|
| ATTR_MTIME_SET|ATTR_ATIME_SET;
|
2005-04-17 05:20:36 +07:00
|
|
|
/* XXX someone who knows this better please fix it for nsec */
|
|
|
|
iap->ia_mtime.tv_sec = v_mtime;
|
|
|
|
iap->ia_atime.tv_sec = v_atime;
|
|
|
|
iap->ia_mtime.tv_nsec = 0;
|
|
|
|
iap->ia_atime.tv_nsec = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
set_attr:
|
2010-02-18 03:05:11 +07:00
|
|
|
err = nfsd_create_setattr(rqstp, resfhp, iap);
|
|
|
|
|
|
|
|
/*
|
2014-07-02 05:27:53 +07:00
|
|
|
* nfsd_create_setattr already committed the child
|
|
|
|
* (and possibly also the parent).
|
2010-02-18 03:05:11 +07:00
|
|
|
*/
|
|
|
|
if (!err)
|
|
|
|
err = nfserrno(commit_metadata(fhp));
|
2006-01-19 08:43:13 +07:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Update the filehandle to get the new inode info.
|
|
|
|
*/
|
|
|
|
if (!err)
|
|
|
|
err = fh_update(resfhp);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
out:
|
|
|
|
fh_unlock(fhp);
|
|
|
|
if (dchild && !IS_ERR(dchild))
|
|
|
|
dput(dchild);
|
2012-06-12 21:20:33 +07:00
|
|
|
fh_drop_write(fhp);
|
2005-04-17 05:20:36 +07:00
|
|
|
return err;
|
|
|
|
|
|
|
|
out_nfserr:
|
2006-10-20 13:28:58 +07:00
|
|
|
err = nfserrno(host_err);
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
#endif /* CONFIG_NFSD_V3 */
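do_nfsd_create() recognises a retransmitted exclusive create by stashing the client's verifier in the new file's mtime and atime and comparing them later. The following is a userspace sketch of that comparison only. `struct file_times`, `verifier_word()` and `verifier_matches()` are hypothetical names, and the inode fields are modeled as plain integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the three inode fields the check reads. */
struct file_times {
	int64_t mtime_sec;
	int64_t atime_sec;
	int64_t size;
};

/* Clear the high bit, as done before storing or comparing a verifier word. */
static uint32_t verifier_word(uint32_t raw)
{
	return raw & 0x7fffffffu;
}

/*
 * A retried exclusive create is recognised when the existing file is
 * empty and its mtime/atime seconds equal the masked verifier words.
 */
static bool verifier_matches(const struct file_times *ft,
			     const uint32_t verifier[2])
{
	return ft->mtime_sec == (int64_t)verifier_word(verifier[0]) &&
	       ft->atime_sec == (int64_t)verifier_word(verifier[1]) &&
	       ft->size == 0;
}
```

Because the same mask is applied on both the store and the compare side, dropping the high bit costs one bit of verifier entropy per word but never produces a false mismatch.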
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Read a symlink. On entry, *lenp must contain the maximum path length that
|
|
|
|
* fits into the buffer. On return, it contains the true length.
|
|
|
|
* N.B. After this call fhp needs an fh_put
|
|
|
|
*/
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2005-04-17 05:20:36 +07:00
|
|
|
nfsd_readlink(struct svc_rqst *rqstp, struct svc_fh *fhp, char *buf, int *lenp)
|
|
|
|
{
|
|
|
|
struct inode *inode;
|
|
|
|
mm_segment_t oldfs;
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32 err;
|
|
|
|
int host_err;
|
2012-03-15 19:21:57 +07:00
|
|
|
struct path path;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2008-06-16 18:20:29 +07:00
|
|
|
err = fh_verify(rqstp, fhp, S_IFLNK, NFSD_MAY_NOP);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
2012-03-15 19:21:57 +07:00
|
|
|
path.mnt = fhp->fh_export->ex_path.mnt;
|
|
|
|
path.dentry = fhp->fh_dentry;
|
2015-03-18 05:25:59 +07:00
|
|
|
inode = d_inode(path.dentry);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
err = nfserr_inval;
|
2008-12-04 22:06:33 +07:00
|
|
|
if (!inode->i_op->readlink)
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out;
|
|
|
|
|
2012-03-15 19:21:57 +07:00
|
|
|
touch_atime(&path);
|
2005-04-17 05:20:36 +07:00
|
|
|
/* N.B. Why does this call need a get_fs()??
|
|
|
|
* Remove the set_fs and watch the fireworks:-) --okir
|
|
|
|
*/
|
|
|
|
|
|
|
|
oldfs = get_fs(); set_fs(KERNEL_DS);
|
2012-07-28 01:16:55 +07:00
|
|
|
host_err = inode->i_op->readlink(path.dentry, (char __user *)buf, *lenp);
|
2005-04-17 05:20:36 +07:00
|
|
|
set_fs(oldfs);
|
|
|
|
|
2006-10-20 13:28:58 +07:00
|
|
|
if (host_err < 0)
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out_nfserr;
|
2006-10-20 13:28:58 +07:00
|
|
|
*lenp = host_err;
|
2005-04-17 05:20:36 +07:00
|
|
|
err = 0;
|
|
|
|
out:
|
|
|
|
return err;
|
|
|
|
|
|
|
|
out_nfserr:
|
2006-10-20 13:28:58 +07:00
|
|
|
err = nfserrno(host_err);
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Create a symlink and look up its inode
|
|
|
|
* N.B. After this call _both_ fhp and resfhp need an fh_put
|
|
|
|
*/
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2005-04-17 05:20:36 +07:00
|
|
|
nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp,
|
|
|
|
char *fname, int flen,
|
2014-06-20 22:52:21 +07:00
|
|
|
char *path,
|
2014-07-01 16:48:02 +07:00
|
|
|
struct svc_fh *resfhp)
|
2005-04-17 05:20:36 +07:00
|
|
|
{
|
|
|
|
struct dentry *dentry, *dnew;
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32 err, cerr;
|
|
|
|
int host_err;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
err = nfserr_noent;
|
2014-06-20 22:52:21 +07:00
|
|
|
if (!flen || path[0] == '\0')
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out;
|
|
|
|
err = nfserr_exist;
|
|
|
|
if (isdotent(fname, flen))
|
|
|
|
goto out;
|
|
|
|
|
2008-06-16 18:20:29 +07:00
|
|
|
err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (err)
|
|
|
|
goto out;
|
2012-06-12 21:20:33 +07:00
|
|
|
|
|
|
|
host_err = fh_want_write(fhp);
|
|
|
|
if (host_err)
|
|
|
|
goto out_nfserr;
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
fh_lock(fhp);
|
|
|
|
dentry = fhp->fh_dentry;
|
|
|
|
dnew = lookup_one_len(fname, dentry, flen);
|
2006-10-20 13:28:58 +07:00
|
|
|
host_err = PTR_ERR(dnew);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (IS_ERR(dnew))
|
|
|
|
goto out_nfserr;
|
|
|
|
|
2015-03-18 05:25:59 +07:00
|
|
|
host_err = vfs_symlink(d_inode(dentry), dnew, path);
|
2006-10-20 13:28:58 +07:00
|
|
|
err = nfserrno(host_err);
|
2010-02-18 03:05:11 +07:00
|
|
|
if (!err)
|
|
|
|
err = nfserrno(commit_metadata(fhp));
|
2005-04-17 05:20:36 +07:00
|
|
|
fh_unlock(fhp);
|
|
|
|
|
2011-11-24 00:03:18 +07:00
|
|
|
fh_drop_write(fhp);
|
2008-02-16 05:37:45 +07:00
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
cerr = fh_compose(resfhp, fhp->fh_export, dnew, fhp);
|
|
|
|
dput(dnew);
|
|
|
|
if (err == 0) err = cerr;
|
|
|
|
out:
|
|
|
|
return err;
|
|
|
|
|
|
|
|
out_nfserr:
|
2006-10-20 13:28:58 +07:00
|
|
|
err = nfserrno(host_err);
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Create a hardlink
|
|
|
|
* N.B. After this call _both_ ffhp and tfhp need an fh_put
|
|
|
|
*/
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2005-04-17 05:20:36 +07:00
|
|
|
nfsd_link(struct svc_rqst *rqstp, struct svc_fh *ffhp,
|
|
|
|
char *name, int len, struct svc_fh *tfhp)
|
|
|
|
{
|
|
|
|
struct dentry *ddir, *dnew, *dold;
|
2010-07-20 03:38:24 +07:00
|
|
|
struct inode *dirp;
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32 err;
|
|
|
|
int host_err;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2008-06-16 18:20:29 +07:00
|
|
|
err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_CREATE);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (err)
|
|
|
|
goto out;
|
2011-08-16 03:59:55 +07:00
|
|
|
err = fh_verify(rqstp, tfhp, 0, NFSD_MAY_NOP);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (err)
|
|
|
|
goto out;
|
2011-08-16 03:59:55 +07:00
|
|
|
err = nfserr_isdir;
|
2015-01-29 19:02:35 +07:00
|
|
|
if (d_is_dir(tfhp->fh_dentry))
|
2011-08-16 03:59:55 +07:00
|
|
|
goto out;
|
2005-04-17 05:20:36 +07:00
|
|
|
err = nfserr_perm;
|
|
|
|
if (!len)
|
|
|
|
goto out;
|
|
|
|
err = nfserr_exist;
|
|
|
|
if (isdotent(name, len))
|
|
|
|
goto out;
|
|
|
|
|
2012-06-12 21:20:33 +07:00
|
|
|
host_err = fh_want_write(tfhp);
|
|
|
|
if (host_err) {
|
|
|
|
err = nfserrno(host_err);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2006-10-02 16:18:03 +07:00
|
|
|
fh_lock_nested(ffhp, I_MUTEX_PARENT);
|
2005-04-17 05:20:36 +07:00
|
|
|
ddir = ffhp->fh_dentry;
|
2015-03-18 05:25:59 +07:00
|
|
|
dirp = d_inode(ddir);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
dnew = lookup_one_len(name, ddir, len);
|
2006-10-20 13:28:58 +07:00
|
|
|
host_err = PTR_ERR(dnew);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (IS_ERR(dnew))
|
|
|
|
goto out_nfserr;
|
|
|
|
|
|
|
|
dold = tfhp->fh_dentry;
|
|
|
|
|
2011-01-12 01:55:46 +07:00
|
|
|
err = nfserr_noent;
|
2015-03-18 05:25:59 +07:00
|
|
|
if (d_really_is_negative(dold))
|
2012-06-12 21:20:33 +07:00
|
|
|
goto out_dput;
|
2011-09-21 04:14:31 +07:00
|
|
|
host_err = vfs_link(dold, dirp, dnew, NULL);
|
2006-10-20 13:28:58 +07:00
|
|
|
if (!host_err) {
|
2010-02-18 03:05:11 +07:00
|
|
|
err = nfserrno(commit_metadata(ffhp));
|
|
|
|
if (!err)
|
|
|
|
err = nfserrno(commit_metadata(tfhp));
|
2005-04-17 05:20:36 +07:00
|
|
|
} else {
|
2006-10-20 13:28:58 +07:00
|
|
|
if (host_err == -EXDEV && rqstp->rq_vers == 2)
|
2005-04-17 05:20:36 +07:00
|
|
|
err = nfserr_acces;
|
|
|
|
else
|
2006-10-20 13:28:58 +07:00
|
|
|
err = nfserrno(host_err);
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
2008-02-16 05:37:45 +07:00
|
|
|
out_dput:
|
2005-04-17 05:20:36 +07:00
|
|
|
dput(dnew);
|
2006-06-30 15:56:15 +07:00
|
|
|
out_unlock:
|
|
|
|
fh_unlock(ffhp);
|
2012-06-12 21:20:33 +07:00
|
|
|
fh_drop_write(tfhp);
|
2005-04-17 05:20:36 +07:00
|
|
|
out:
|
|
|
|
return err;
|
|
|
|
|
|
|
|
out_nfserr:
|
2006-10-20 13:28:58 +07:00
|
|
|
err = nfserrno(host_err);
|
2006-06-30 15:56:15 +07:00
|
|
|
goto out_unlock;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Rename a file
|
|
|
|
* N.B. After this call _both_ ffhp and tfhp need an fh_put
|
|
|
|
*/
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2005-04-17 05:20:36 +07:00
|
|
|
nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
|
|
|
|
struct svc_fh *tfhp, char *tname, int tlen)
|
|
|
|
{
|
|
|
|
struct dentry *fdentry, *tdentry, *odentry, *ndentry, *trap;
|
|
|
|
struct inode *fdir, *tdir;
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32 err;
|
|
|
|
int host_err;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2008-06-16 18:20:29 +07:00
|
|
|
err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_REMOVE);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (err)
|
|
|
|
goto out;
|
2008-06-16 18:20:29 +07:00
|
|
|
err = fh_verify(rqstp, tfhp, S_IFDIR, NFSD_MAY_CREATE);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
fdentry = ffhp->fh_dentry;
|
2015-03-18 05:25:59 +07:00
|
|
|
fdir = d_inode(fdentry);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
tdentry = tfhp->fh_dentry;
|
2015-03-18 05:25:59 +07:00
|
|
|
tdir = d_inode(tdentry);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
err = nfserr_perm;
|
|
|
|
if (!flen || isdotent(fname, flen) || !tlen || isdotent(tname, tlen))
|
|
|
|
goto out;
|
|
|
|
|
2012-06-12 21:20:33 +07:00
|
|
|
host_err = fh_want_write(ffhp);
|
|
|
|
if (host_err) {
|
|
|
|
err = nfserrno(host_err);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
/* cannot use fh_lock as we need deadlock protective ordering
|
|
|
|
* so do it by hand */
|
|
|
|
trap = lock_rename(tdentry, fdentry);
|
2015-09-17 19:28:39 +07:00
|
|
|
ffhp->fh_locked = tfhp->fh_locked = true;
|
2005-04-17 05:20:36 +07:00
|
|
|
fill_pre_wcc(ffhp);
|
|
|
|
fill_pre_wcc(tfhp);
|
|
|
|
|
|
|
|
odentry = lookup_one_len(fname, fdentry, flen);
|
2006-10-20 13:28:58 +07:00
|
|
|
host_err = PTR_ERR(odentry);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (IS_ERR(odentry))
|
|
|
|
goto out_nfserr;
|
|
|
|
|
2006-10-20 13:28:58 +07:00
|
|
|
host_err = -ENOENT;
|
2015-03-18 05:25:59 +07:00
|
|
|
if (d_really_is_negative(odentry))
|
2005-04-17 05:20:36 +07:00
|
|
|
goto out_dput_old;
|
2006-10-20 13:28:58 +07:00
|
|
|
host_err = -EINVAL;
|
2005-04-17 05:20:36 +07:00
|
|
|
if (odentry == trap)
|
|
|
|
goto out_dput_old;
|
|
|
|
|
|
|
|
ndentry = lookup_one_len(tname, tdentry, tlen);
|
2006-10-20 13:28:58 +07:00
|
|
|
host_err = PTR_ERR(ndentry);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (IS_ERR(ndentry))
|
|
|
|
goto out_dput_old;
|
2006-10-20 13:28:58 +07:00
|
|
|
host_err = -ENOTEMPTY;
|
2005-04-17 05:20:36 +07:00
|
|
|
if (ndentry == trap)
|
|
|
|
goto out_dput_new;
|
|
|
|
|
2008-02-16 05:37:49 +07:00
|
|
|
host_err = -EXDEV;
|
|
|
|
if (ffhp->fh_export->ex_path.mnt != tfhp->fh_export->ex_path.mnt)
|
|
|
|
goto out_dput_new;
|
2013-04-16 03:03:46 +07:00
|
|
|
if (ffhp->fh_export->ex_path.dentry != tfhp->fh_export->ex_path.dentry)
|
|
|
|
goto out_dput_new;
|
2008-02-16 05:37:49 +07:00
|
|
|
|
2014-04-01 22:08:42 +07:00
|
|
|
host_err = vfs_rename(fdir, odentry, tdir, ndentry, NULL, 0);
|
2010-02-18 03:05:11 +07:00
|
|
|
if (!host_err) {
|
|
|
|
host_err = commit_metadata(tfhp);
|
2006-10-20 13:28:58 +07:00
|
|
|
if (!host_err)
|
2010-02-18 03:05:11 +07:00
|
|
|
host_err = commit_metadata(ffhp);
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
out_dput_new:
|
|
|
|
dput(ndentry);
|
|
|
|
out_dput_old:
|
|
|
|
dput(odentry);
|
|
|
|
out_nfserr:
|
2006-10-20 13:28:58 +07:00
|
|
|
err = nfserrno(host_err);
|
2014-03-29 03:43:17 +07:00
|
|
|
/*
|
|
|
|
* We cannot rely on fh_unlock on the two filehandles,
|
2005-04-17 05:20:36 +07:00
|
|
|
* as that would do the wrong thing if the two directories
|
2014-03-29 03:43:17 +07:00
|
|
|
* were the same, so again we do it by hand.
|
2005-04-17 05:20:36 +07:00
|
|
|
*/
|
|
|
|
fill_post_wcc(ffhp);
|
|
|
|
fill_post_wcc(tfhp);
|
|
|
|
unlock_rename(tdentry, fdentry);
|
2015-09-17 19:28:39 +07:00
|
|
|
ffhp->fh_locked = tfhp->fh_locked = false;
|
2012-06-12 21:20:33 +07:00
|
|
|
fh_drop_write(ffhp);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
out:
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Unlink a file or directory
|
|
|
|
* N.B. After this call fhp needs an fh_put
|
|
|
|
*/
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2005-04-17 05:20:36 +07:00
|
|
|
nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
|
|
|
|
char *fname, int flen)
|
|
|
|
{
|
|
|
|
struct dentry *dentry, *rdentry;
|
|
|
|
struct inode *dirp;
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32 err;
|
|
|
|
int host_err;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
err = nfserr_acces;
|
|
|
|
if (!flen || isdotent(fname, flen))
|
|
|
|
goto out;
|
2008-06-16 18:20:29 +07:00
|
|
|
err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_REMOVE);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
2012-06-12 21:20:33 +07:00
|
|
|
host_err = fh_want_write(fhp);
|
|
|
|
if (host_err)
|
|
|
|
goto out_nfserr;
|
|
|
|
|
2006-10-02 16:18:03 +07:00
|
|
|
fh_lock_nested(fhp, I_MUTEX_PARENT);
|
2005-04-17 05:20:36 +07:00
|
|
|
dentry = fhp->fh_dentry;
|
2015-03-18 05:25:59 +07:00
|
|
|
dirp = d_inode(dentry);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
rdentry = lookup_one_len(fname, dentry, flen);
|
2006-10-20 13:28:58 +07:00
|
|
|
host_err = PTR_ERR(rdentry);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (IS_ERR(rdentry))
|
|
|
|
goto out_nfserr;
|
|
|
|
|
2015-03-18 05:25:59 +07:00
|
|
|
if (d_really_is_negative(rdentry)) {
|
2005-04-17 05:20:36 +07:00
|
|
|
dput(rdentry);
|
|
|
|
err = nfserr_noent;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!type)
|
2015-03-18 05:25:59 +07:00
|
|
|
type = d_inode(rdentry)->i_mode & S_IFMT;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2011-01-12 02:07:12 +07:00
|
|
|
if (type != S_IFDIR)
|
2011-09-20 20:14:34 +07:00
|
|
|
host_err = vfs_unlink(dirp, rdentry, NULL);
|
2011-01-12 02:07:12 +07:00
|
|
|
else
|
2006-10-20 13:28:58 +07:00
|
|
|
host_err = vfs_rmdir(dirp, rdentry);
|
2010-02-18 03:05:11 +07:00
|
|
|
if (!host_err)
|
|
|
|
host_err = commit_metadata(fhp);
|
2011-01-15 08:00:02 +07:00
|
|
|
dput(rdentry);
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
out_nfserr:
|
2006-10-20 13:28:58 +07:00
|
|
|
err = nfserrno(host_err);
|
2006-01-19 08:43:13 +07:00
|
|
|
out:
|
|
|
|
return err;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
|
2008-08-01 02:29:12 +07:00
|
|
|
/*
|
|
|
|
* We do this buffering because we must not call back into the file
|
|
|
|
* system's ->lookup() method from the filldir callback. That may well
|
|
|
|
* deadlock a number of file systems.
|
|
|
|
*
|
|
|
|
* This is based heavily on the implementation of same in XFS.
|
|
|
|
*/
|
|
|
|
struct buffered_dirent {
|
|
|
|
u64 ino;
|
|
|
|
loff_t offset;
|
|
|
|
int namlen;
|
|
|
|
unsigned int d_type;
|
|
|
|
char name[];
|
|
|
|
};
|
|
|
|
|
|
|
|
struct readdir_data {
|
2013-05-16 00:52:59 +07:00
|
|
|
struct dir_context ctx;
|
2008-08-01 02:29:12 +07:00
|
|
|
char *dirent;
|
|
|
|
size_t used;
|
2008-08-24 18:29:52 +07:00
|
|
|
int full;
|
2008-08-01 02:29:12 +07:00
|
|
|
};
|
|
|
|
|
2014-10-30 23:37:34 +07:00
|
|
|
static int nfsd_buffered_filldir(struct dir_context *ctx, const char *name,
|
|
|
|
int namlen, loff_t offset, u64 ino,
|
|
|
|
unsigned int d_type)
|
2008-08-01 02:29:12 +07:00
|
|
|
{
|
2014-10-30 23:37:34 +07:00
|
|
|
struct readdir_data *buf =
|
|
|
|
container_of(ctx, struct readdir_data, ctx);
|
2008-08-01 02:29:12 +07:00
|
|
|
struct buffered_dirent *de = (void *)(buf->dirent + buf->used);
|
|
|
|
unsigned int reclen;
|
|
|
|
|
|
|
|
reclen = ALIGN(sizeof(struct buffered_dirent) + namlen, sizeof(u64));
|
2008-08-24 18:29:52 +07:00
|
|
|
if (buf->used + reclen > PAGE_SIZE) {
|
|
|
|
buf->full = 1;
|
2008-08-01 02:29:12 +07:00
|
|
|
return -EINVAL;
|
2008-08-24 18:29:52 +07:00
|
|
|
}
|
2008-08-01 02:29:12 +07:00
|
|
|
|
|
|
|
de->namlen = namlen;
|
|
|
|
de->offset = offset;
|
|
|
|
de->ino = ino;
|
|
|
|
de->d_type = d_type;
|
|
|
|
memcpy(de->name, name, namlen);
|
|
|
|
buf->used += reclen;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
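The record layout used by nfsd_buffered_filldir() can be tried out in user space. The following is only a minimal sketch under stated assumptions: BUF_SIZE stands in for PAGE_SIZE, and struct rec, struct packbuf, rec_len(), packbuf_init(), pack_entry() and count_entries() are hypothetical names invented for illustration, not kernel symbols. It shows variable-length records being appended at 8-byte-aligned offsets and walked again by recomputing each record length from namlen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 4096	/* stand-in for PAGE_SIZE (assumption) */

/* Hypothetical user-space mirror of struct buffered_dirent. */
struct rec {
	uint64_t ino;
	int64_t offset;
	int namlen;
	unsigned int d_type;
	char name[];	/* namlen bytes, not NUL-terminated */
};

struct packbuf {
	char *data;	/* BUF_SIZE bytes from malloc() */
	size_t used;	/* bytes consumed so far */
	int full;	/* set when an entry did not fit */
};

/* Record size rounded up to a multiple of sizeof(uint64_t). */
static size_t rec_len(int namlen)
{
	size_t a = sizeof(uint64_t);

	return (sizeof(struct rec) + (size_t)namlen + a - 1) & ~(a - 1);
}

static int packbuf_init(struct packbuf *b)
{
	b->data = malloc(BUF_SIZE);
	b->used = 0;
	b->full = 0;
	return b->data ? 0 : -1;
}

/* Append one entry; returns -1 and marks the buffer full if it does not fit. */
static int pack_entry(struct packbuf *b, const char *name, int namlen,
		      int64_t offset, uint64_t ino, unsigned int d_type)
{
	size_t reclen;
	struct rec *r;

	assert(namlen >= 0);
	reclen = rec_len(namlen);
	if (b->used + reclen > BUF_SIZE) {
		b->full = 1;
		return -1;
	}
	r = (struct rec *)(b->data + b->used);
	r->ino = ino;
	r->offset = offset;
	r->namlen = namlen;
	r->d_type = d_type;
	memcpy(r->name, name, (size_t)namlen);
	b->used += reclen;
	return 0;
}

/* Walk the packed records the same way the reader side does. */
static int count_entries(const struct packbuf *b)
{
	size_t pos = 0;
	int n = 0;

	while (pos < b->used) {
		const struct rec *r = (const struct rec *)(b->data + pos);

		pos += rec_len(r->namlen);
		n++;
	}
	return n;
}
```

Because every record length is a multiple of 8, each record header starts at an offset suitable for its 64-bit members, which is presumably why the kernel code aligns to sizeof(u64).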
|
|
|
|
|
2014-10-30 23:37:34 +07:00
|
|
|
static __be32 nfsd_buffered_readdir(struct file *file, nfsd_filldir_t func,
|
2009-04-21 05:18:37 +07:00
|
|
|
struct readdir_cd *cdp, loff_t *offsetp)
|
2008-07-31 23:16:51 +07:00
|
|
|
{
|
2008-08-01 02:29:12 +07:00
|
|
|
struct buffered_dirent *de;
|
2008-07-31 23:16:51 +07:00
|
|
|
int host_err;
|
2008-08-01 02:29:12 +07:00
|
|
|
int size;
|
|
|
|
loff_t offset;
|
2013-05-23 09:22:04 +07:00
|
|
|
struct readdir_data buf = {
|
|
|
|
.ctx.actor = nfsd_buffered_filldir,
|
|
|
|
.dirent = (void *)__get_free_page(GFP_KERNEL)
|
|
|
|
};
|
2008-07-31 23:16:51 +07:00
|
|
|
|
2008-08-01 02:29:12 +07:00
|
|
|
if (!buf.dirent)
|
2009-04-21 05:18:37 +07:00
|
|
|
return nfserrno(-ENOMEM);
|
2008-08-01 02:29:12 +07:00
|
|
|
|
|
|
|
offset = *offsetp;
|
2008-07-31 23:16:51 +07:00
|
|
|
|
2008-08-01 02:29:12 +07:00
|
|
|
while (1) {
|
|
|
|
unsigned int reclen;
|
|
|
|
|
Fix nfsd truncation of readdir results
Commit 8d7c4203 "nfsd: fix failure to set eof in readdir in some
situations" introduced a bug: on a directory in an exported ext3
filesystem with dir_index unset, a READDIR will only return about 250
entries, even if the directory was larger.
Bisected it back to this commit; reverting it fixes the problem.
It turns out that in this case ext3 reads a block at a time, then
returns from readdir, which means we can end up with buf.full==0 but
with more entries in the directory still to be read. Before 8d7c4203
(but after c002a6c797 "Optimise NFS readdir hack slightly"), this would
cause us to return the READDIR result immediately, but with the eof bit
unset. That could cause a performance regression (because the client
would need more roundtrips to the server to read the whole directory),
but no loss in correctness, since the cleared eof bit caused the client
to send another readdir. After 8d7c4203, the setting of the eof bit
made this a correctness problem.
So, move nfserr_eof into the loop and remove the buf.full check so that
we loop until buf.used==0. The following seems to do the right thing
and reduces the network traffic since we don't return a READDIR result
until the buffer is full.
Tested on an empty directory & large directory; eof is properly sent and
there are no more short buffers.
Signed-off-by: Doug Nazar <nazard@dragoninc.ca>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-11-05 18:16:28 +07:00
|
|
|
cdp->err = nfserr_eof; /* will be cleared on successful read */
|
2008-08-01 02:29:12 +07:00
|
|
|
buf.used = 0;
|
2008-08-24 18:29:52 +07:00
|
|
|
buf.full = 0;
|
2008-08-01 02:29:12 +07:00
|
|
|
|
2013-05-16 00:52:59 +07:00
|
|
|
host_err = iterate_dir(file, &buf.ctx);
|
2008-08-24 18:29:52 +07:00
|
|
|
if (buf.full)
|
|
|
|
host_err = 0;
|
|
|
|
|
|
|
|
if (host_err < 0)
|
2008-08-01 02:29:12 +07:00
|
|
|
break;
|
|
|
|
|
|
|
|
size = buf.used;
|
|
|
|
|
|
|
|
if (!size)
|
|
|
|
break;
|
|
|
|
|
|
|
|
de = (struct buffered_dirent *)buf.dirent;
|
|
|
|
while (size > 0) {
|
|
|
|
offset = de->offset;
|
|
|
|
|
|
|
|
if (func(cdp, de->name, de->namlen, de->offset,
|
|
|
|
de->ino, de->d_type))
|
2009-04-21 05:18:37 +07:00
|
|
|
break;
|
2008-08-01 02:29:12 +07:00
|
|
|
|
|
|
|
if (cdp->err != nfs_ok)
|
2009-04-21 05:18:37 +07:00
|
|
|
break;
|
2008-08-01 02:29:12 +07:00
|
|
|
|
|
|
|
reclen = ALIGN(sizeof(*de) + de->namlen,
|
|
|
|
sizeof(u64));
|
|
|
|
size -= reclen;
|
|
|
|
de = (struct buffered_dirent *)((char *)de + reclen);
|
|
|
|
}
|
2009-04-21 05:18:37 +07:00
|
|
|
if (size > 0) /* We bailed out early */
|
|
|
|
break;
|
|
|
|
|
2008-08-17 23:21:18 +07:00
|
|
|
offset = vfs_llseek(file, 0, SEEK_CUR);
|
2008-08-01 02:29:12 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
free_page((unsigned long)(buf.dirent));
|
2008-07-31 23:16:51 +07:00
|
|
|
|
|
|
|
if (host_err)
|
|
|
|
return nfserrno(host_err);
|
2008-08-01 02:29:12 +07:00
|
|
|
|
|
|
|
*offsetp = offset;
|
|
|
|
return cdp->err;
|
2008-07-31 23:16:51 +07:00
|
|
|
}
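The regression described in the "Fix nfsd truncation of readdir results" commit message can be modelled with a toy directory source. This is a simplified user-space sketch, not the kernel logic: struct dir_source, read_pass() and drain() are hypothetical names, and the "stop when a pass comes back short" branch only approximates the buggy behaviour the message describes. It contrasts stopping on a non-full pass with looping until a pass yields nothing.

```c
#include <assert.h>

/* Toy directory that hands out at most per_call entries per pass,
 * similar to a file system that reads one block at a time. */
struct dir_source {
	int total;	/* entries in the directory */
	int pos;	/* entries handed out so far */
	int per_call;	/* maximum entries per pass */
};

/* One pass: returns how many entries were produced (0 at the real end). */
static int read_pass(struct dir_source *d, int room)
{
	int n = d->total - d->pos;

	assert(d->per_call > 0 && room > 0);
	if (n > d->per_call)
		n = d->per_call;
	if (n > room)
		n = room;
	d->pos += n;
	return n;
}

/*
 * Deliver entries until end of directory is assumed.
 * stop_if_not_full != 0 models the broken rule: a pass that did not
 * fill the buffer is taken to mean EOF.
 * stop_if_not_full == 0 models the fix: only an empty pass means EOF.
 */
static int drain(struct dir_source *d, int room, int stop_if_not_full,
		 int *eof)
{
	int delivered = 0;

	*eof = 0;
	for (;;) {
		int got = read_pass(d, room);

		if (got == 0) {
			*eof = 1;	/* nothing buffered: genuine EOF */
			break;
		}
		delivered += got;
		if (stop_if_not_full && got < room) {
			*eof = 1;	/* premature EOF: entries may remain */
			break;
		}
	}
	return delivered;
}
```

With 1000 entries, 250 per pass and room for 400, the first rule reports EOF after 250 entries while the second delivers all 1000, which matches the "about 250 entries" symptom in spirit only.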
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
/*
|
|
|
|
* Read entries from a directory.
|
|
|
|
* The NFSv3/4 verifier we ignore for now.
|
|
|
|
*/
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2005-04-17 05:20:36 +07:00
|
|
|
nfsd_readdir(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t *offsetp,
|
2014-10-30 23:37:34 +07:00
|
|
|
struct readdir_cd *cdp, nfsd_filldir_t func)
|
2005-04-17 05:20:36 +07:00
|
|
|
{
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32 err;
|
2005-04-17 05:20:36 +07:00
|
|
|
struct file *file;
|
|
|
|
loff_t offset = *offsetp;
|
2012-03-19 09:44:50 +07:00
|
|
|
int may_flags = NFSD_MAY_READ;
|
|
|
|
|
|
|
|
/* NFSv2 only supports 32 bit cookies */
|
|
|
|
if (rqstp->rq_vers > 2)
|
|
|
|
may_flags |= NFSD_MAY_64BIT_COOKIE;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2012-03-19 09:44:50 +07:00
|
|
|
err = nfsd_open(rqstp, fhp, S_IFDIR, may_flags, &file);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
2012-04-26 02:30:00 +07:00
|
|
|
offset = vfs_llseek(file, offset, SEEK_SET);
|
2005-04-17 05:20:36 +07:00
|
|
|
if (offset < 0) {
|
|
|
|
err = nfserrno((int)offset);
|
|
|
|
goto out_close;
|
|
|
|
}
|
|
|
|
|
2008-08-01 02:29:12 +07:00
|
|
|
err = nfsd_buffered_readdir(file, func, cdp, offsetp);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
if (err == nfserr_eof || err == nfserr_toosmall)
|
|
|
|
err = nfs_ok; /* can still be found in ->err */
|
|
|
|
out_close:
|
2015-04-28 20:41:16 +07:00
|
|
|
fput(file);
|
2005-04-17 05:20:36 +07:00
|
|
|
out:
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Get file system stats
|
|
|
|
* N.B. After this call fhp needs an fh_put
|
|
|
|
*/
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2008-08-08 00:00:20 +07:00
|
|
|
nfsd_statfs(struct svc_rqst *rqstp, struct svc_fh *fhp, struct kstatfs *stat, int access)
|
2005-04-17 05:20:36 +07:00
|
|
|
{
|
2010-07-07 23:53:11 +07:00
|
|
|
__be32 err;
|
|
|
|
|
|
|
|
err = fh_verify(rqstp, fhp, 0, NFSD_MAY_NOP | access);
|
2010-08-13 20:53:49 +07:00
|
|
|
if (!err) {
|
|
|
|
struct path path = {
|
|
|
|
.mnt = fhp->fh_export->ex_path.mnt,
|
|
|
|
.dentry = fhp->fh_dentry,
|
|
|
|
};
|
|
|
|
if (vfs_statfs(&path, stat))
|
|
|
|
err = nfserr_io;
|
|
|
|
}
|
2005-04-17 05:20:36 +07:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2007-07-19 15:49:20 +07:00
|
|
|
static int exp_rdonly(struct svc_rqst *rqstp, struct svc_export *exp)
|
2007-07-19 15:49:20 +07:00
|
|
|
{
|
2007-07-19 15:49:20 +07:00
|
|
|
return nfsexp_flags(rqstp, exp) & NFSEXP_READONLY;
|
2007-07-19 15:49:20 +07:00
|
|
|
}
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
/*
|
|
|
|
* Check for a user's access permissions to this inode.
|
|
|
|
*/
|
2006-10-20 13:28:58 +07:00
|
|
|
__be32
|
2007-07-17 18:04:48 +07:00
|
|
|
nfsd_permission(struct svc_rqst *rqstp, struct svc_export *exp,
|
|
|
|
struct dentry *dentry, int acc)
|
2005-04-17 05:20:36 +07:00
|
|
|
{
|
2015-03-18 05:25:59 +07:00
|
|
|
struct inode *inode = d_inode(dentry);
|
2005-04-17 05:20:36 +07:00
|
|
|
int err;
|
|
|
|
|
2011-04-10 21:35:12 +07:00
|
|
|
if ((acc & NFSD_MAY_MASK) == NFSD_MAY_NOP)
|
2005-04-17 05:20:36 +07:00
|
|
|
return 0;
|
|
|
|
#if 0
|
|
|
|
dprintk("nfsd: permission 0x%x%s%s%s%s%s%s%s mode 0%o%s%s%s\n",
|
|
|
|
acc,
|
2008-06-16 18:20:29 +07:00
|
|
|
(acc & NFSD_MAY_READ)? " read" : "",
|
|
|
|
(acc & NFSD_MAY_WRITE)? " write" : "",
|
|
|
|
(acc & NFSD_MAY_EXEC)? " exec" : "",
|
|
|
|
(acc & NFSD_MAY_SATTR)? " sattr" : "",
|
|
|
|
(acc & NFSD_MAY_TRUNC)? " trunc" : "",
|
|
|
|
(acc & NFSD_MAY_LOCK)? " lock" : "",
|
|
|
|
(acc & NFSD_MAY_OWNER_OVERRIDE)? " owneroverride" : "",
|
2005-04-17 05:20:36 +07:00
|
|
|
inode->i_mode,
|
|
|
|
IS_IMMUTABLE(inode)? " immut" : "",
|
|
|
|
IS_APPEND(inode)? " append" : "",
|
2008-02-16 05:37:56 +07:00
|
|
|
__mnt_is_readonly(exp->ex_path.mnt)? " ro" : "");
|
2005-04-17 05:20:36 +07:00
|
|
|
dprintk(" owner %d/%d user %d/%d\n",
|
2008-11-14 06:38:58 +07:00
|
|
|
inode->i_uid, inode->i_gid, current_fsuid(), current_fsgid());
|
2005-04-17 05:20:36 +07:00
|
|
|
#endif
|
|
|
|
|
|
|
|
/* Normally we reject any write/sattr etc access on a read-only file
|
|
|
|
* system. But if it is IRIX doing check on write-access for a
|
|
|
|
* device special file, we ignore rofs.
|
|
|
|
*/
|
2008-06-16 18:20:29 +07:00
|
|
|
if (!(acc & NFSD_MAY_LOCAL_ACCESS))
|
|
|
|
if (acc & (NFSD_MAY_WRITE | NFSD_MAY_SATTR | NFSD_MAY_TRUNC)) {
|
2008-02-16 05:37:56 +07:00
|
|
|
if (exp_rdonly(rqstp, exp) ||
|
|
|
|
__mnt_is_readonly(exp->ex_path.mnt))
|
2005-04-17 05:20:36 +07:00
|
|
|
return nfserr_rofs;
|
2008-06-16 18:20:29 +07:00
|
|
|
if (/* (acc & NFSD_MAY_WRITE) && */ IS_IMMUTABLE(inode))
|
2005-04-17 05:20:36 +07:00
|
|
|
return nfserr_perm;
|
|
|
|
}
|
2008-06-16 18:20:29 +07:00
|
|
|
if ((acc & NFSD_MAY_TRUNC) && IS_APPEND(inode))
|
2005-04-17 05:20:36 +07:00
|
|
|
return nfserr_perm;
|
|
|
|
|
2008-06-16 18:20:29 +07:00
|
|
|
if (acc & NFSD_MAY_LOCK) {
|
2005-04-17 05:20:36 +07:00
|
|
|
/* If we cannot rely on authentication in NLM requests,
|
|
|
|
* just allow locks, otherwise require read permission, or
|
|
|
|
* ownership
|
|
|
|
*/
|
|
|
|
if (exp->ex_flags & NFSEXP_NOAUTHNLM)
|
|
|
|
return 0;
|
|
|
|
else
|
2008-06-16 18:20:29 +07:00
|
|
|
acc = NFSD_MAY_READ | NFSD_MAY_OWNER_OVERRIDE;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
/*
|
|
|
|
* The file owner always gets access permission for accesses that
|
|
|
|
* would normally be checked at open time. This is to make
|
|
|
|
* file access work even when the client has done a fchmod(fd, 0).
|
|
|
|
*
|
|
|
|
* However, `cp foo bar' should fail nevertheless when bar is
|
|
|
|
* readonly. A sensible way to do this might be to reject all
|
|
|
|
* attempts to truncate a read-only file, because a creat() call
|
|
|
|
* always implies file truncation.
|
|
|
|
* ... but this isn't really fair. A process may reasonably call
|
|
|
|
* ftruncate on an open file descriptor on a file with perm 000.
|
|
|
|
* We must trust the client to do permission checking - using "ACCESS"
|
|
|
|
* with NFSv3.
|
|
|
|
*/
|
2008-06-16 18:20:29 +07:00
|
|
|
if ((acc & NFSD_MAY_OWNER_OVERRIDE) &&
|
2013-02-02 21:53:11 +07:00
|
|
|
uid_eq(inode->i_uid, current_fsuid()))
|
2005-04-17 05:20:36 +07:00
|
|
|
return 0;
|
|
|
|
|
2008-06-16 18:20:29 +07:00
|
|
|
/* This assumes NFSD_MAY_{READ,WRITE,EXEC} == MAY_{READ,WRITE,EXEC} */
|
2008-07-22 11:07:17 +07:00
|
|
|
err = inode_permission(inode, acc & (MAY_READ|MAY_WRITE|MAY_EXEC));
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
/* Allow read access to binaries even when mode 111 */
|
|
|
|
if (err == -EACCES && S_ISREG(inode->i_mode) &&
|
2011-08-25 21:48:39 +07:00
|
|
|
(acc == (NFSD_MAY_READ | NFSD_MAY_OWNER_OVERRIDE) ||
|
|
|
|
acc == (NFSD_MAY_READ | NFSD_MAY_READ_IF_EXEC)))
|
2008-07-22 11:07:17 +07:00
|
|
|
err = inode_permission(inode, MAY_EXEC);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
return err? nfserrno(err) : 0;
|
|
|
|
}
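The "allow read access to binaries even when mode 111" fallback at the end of nfsd_permission() can be illustrated in isolation. The sketch below is a user-space approximation under loud assumptions: permissions are reduced to a single rwx triple, ACC_OWNER_OVERRIDE is a made-up flag value, and mode_allows() and check_access() are hypothetical helpers standing in for inode_permission() and the surrounding logic. Only the READ | OWNER_OVERRIDE case of the retry is modelled.

```c
#include <assert.h>
#include <errno.h>

/* Same numeric values as the classic rwx mode bits. */
enum { ACC_EXEC = 1, ACC_WRITE = 2, ACC_READ = 4 };

#define ACC_OWNER_OVERRIDE 0x10	/* hypothetical flag value */

/* Stand-in for inode_permission(): every wanted bit must be set. */
static int mode_allows(unsigned int perm_bits, int want)
{
	return (perm_bits & (unsigned int)want) == (unsigned int)want
		? 0 : -EACCES;
}

/*
 * First try the requested rwx bits. If that fails for a regular file
 * and the request was exactly READ | OWNER_OVERRIDE, retry as an
 * execute check, so that an execute-only binary can still be read.
 */
static int check_access(unsigned int perm_bits, int is_regular, int acc)
{
	int err;

	assert((perm_bits & ~7u) == 0);
	err = mode_allows(perm_bits, acc & (ACC_READ | ACC_WRITE | ACC_EXEC));
	if (err == -EACCES && is_regular &&
	    acc == (ACC_READ | ACC_OWNER_OVERRIDE))
		err = mode_allows(perm_bits, ACC_EXEC);
	return err;
}
```

Note that the retry is keyed on an exact match of the access mask, so a plain read request without the override flag is still refused on an execute-only file.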
|
|
|
|
|
|
|
|
void
|
|
|
|
nfsd_racache_shutdown(void)
|
|
|
|
{
|
2008-08-14 09:03:27 +07:00
|
|
|
struct raparms *raparm, *last_raparm;
|
|
|
|
unsigned int i;
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
dprintk("nfsd: freeing readahead buffers.\n");
|
2008-08-14 09:03:27 +07:00
|
|
|
|
|
|
|
for (i = 0; i < RAPARM_HASH_SIZE; i++) {
|
|
|
|
raparm = raparm_hash[i].pb_head;
|
|
|
|
while(raparm) {
|
|
|
|
last_raparm = raparm;
|
|
|
|
raparm = raparm->p_next;
|
|
|
|
kfree(last_raparm);
|
|
|
|
}
|
|
|
|
raparm_hash[i].pb_head = NULL;
|
|
|
|
}
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
/*
|
|
|
|
* Initialize readahead param cache
|
|
|
|
*/
|
|
|
|
int
|
|
|
|
nfsd_racache_init(int cache_size)
|
|
|
|
{
|
|
|
|
int i;
|
2006-10-04 16:15:49 +07:00
|
|
|
int j = 0;
|
|
|
|
int nperbucket;
|
2008-08-14 09:03:27 +07:00
|
|
|
struct raparms **raparm = NULL;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2006-10-04 16:15:49 +07:00
|
|
|
|
2008-08-14 09:03:27 +07:00
|
|
|
if (raparm_hash[0].pb_head)
|
2005-04-17 05:20:36 +07:00
|
|
|
return 0;
|
2008-08-14 09:03:27 +07:00
|
|
|
nperbucket = DIV_ROUND_UP(cache_size, RAPARM_HASH_SIZE);
|
2014-06-10 17:08:19 +07:00
|
|
|
nperbucket = max(2, nperbucket);
|
2008-08-14 09:03:27 +07:00
|
|
|
cache_size = nperbucket * RAPARM_HASH_SIZE;
|
2006-12-08 17:39:41 +07:00
|
|
|
|
|
|
|
dprintk("nfsd: allocating %d readahead buffers.\n", cache_size);
|
2008-08-14 09:03:27 +07:00
|
|
|
|
|
|
|
for (i = 0; i < RAPARM_HASH_SIZE; i++) {
|
2006-12-08 17:39:41 +07:00
|
|
|
spin_lock_init(&raparm_hash[i].pb_lock);
|
2008-08-14 09:03:27 +07:00
|
|
|
|
|
|
|
raparm = &raparm_hash[i].pb_head;
|
|
|
|
for (j = 0; j < nperbucket; j++) {
|
|
|
|
*raparm = kzalloc(sizeof(struct raparms), GFP_KERNEL);
|
|
|
|
if (!*raparm)
|
|
|
|
goto out_nomem;
|
|
|
|
raparm = &(*raparm)->p_next;
|
|
|
|
}
|
|
|
|
*raparm = NULL;
|
2006-12-08 17:39:41 +07:00
|
|
|
}
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
nfsdstats.ra_size = cache_size;
|
|
|
|
return 0;
|
2008-08-14 09:03:27 +07:00
|
|
|
|
|
|
|
out_nomem:
|
|
|
|
dprintk("nfsd: kmalloc failed, freeing readahead buffers\n");
|
|
|
|
nfsd_racache_shutdown();
|
|
|
|
return -ENOMEM;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
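The sizing arithmetic at the top of nfsd_racache_init() is easy to check by hand. The following sketch assumes a bucket count of 16 purely for illustration (the value of RAPARM_HASH_SIZE is not shown in this file), and div_round_up(), per_bucket() and effective_cache_size() are hypothetical helper names. It shows how the requested size is rounded up to a whole number of entries per bucket, with a floor of two per bucket.

```c
#include <assert.h>

#define HASH_SIZE 16	/* assumed bucket count, for illustration only */

/* Integer division rounding up, like the kernel's DIV_ROUND_UP(). */
static int div_round_up(int n, int d)
{
	assert(d > 0 && n >= 0);
	return (n + d - 1) / d;
}

/* Entries allocated per hash bucket: at least two. */
static int per_bucket(int cache_size)
{
	int n = div_round_up(cache_size, HASH_SIZE);

	return n < 2 ? 2 : n;
}

/* Total entries actually allocated for a requested cache size. */
static int effective_cache_size(int cache_size)
{
	return per_bucket(cache_size) * HASH_SIZE;
}
```

Under these assumptions a request for 33 entries becomes 3 per bucket and 48 in total, and any request of 32 or fewer ends up as 32.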
|