Pull misc vfs cleanups from Al Viro:
"Assorted cleanups and fixes all over the place"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
coredump: only charge written data against RLIMIT_CORE
coredump: get rid of coredump_params->written
ecryptfs_lookup(): try either only encrypted or plaintext name
ecryptfs: avoid multiple aliases for directories
bpf: reject invalid names right in ->lookup()
__d_alloc(): treat NULL name as QSTR("/", 1)
mtd: switch ubi_open_volume_path() to vfs_stat()
mtd: switch open_mtd_by_chdev() to use of vfs_stat()
Pull parallel filesystem directory handling update from Al Viro.
This is the main parallel directory work by Al that makes the vfs layer
able to do lookup and readdir in parallel within a single directory.
That's a big change, since this used to be all protected by the
directory inode mutex.
The inode mutex is replaced by an rwsem, and serialization of lookups of
a single name is done by a "in-progress" dentry marker.
The series begins with xattr cleanups, and then ends with switching
filesystems over to actually doing the readdir in parallel (switching to
the "iterate_shared()" that only takes the read lock).
A more detailed explanation of the process from Al Viro:
"The xattr work starts with some acl fixes, then switches ->getxattr to
passing inode and dentry separately. This is the point where the
things start to get tricky - that got merged into the very beginning
of the -rc3-based #work.lookups, to allow untangling the
security_d_instantiate() mess. The xattr work itself proceeds to
switch a lot of filesystems to generic_...xattr(); no complications
there.
After that initial xattr work, the series then does the following:
- untangle security_d_instantiate()
- convert a bunch of open-coded lookup_one_len_unlocked() to calls of
that thing; one such place (in overlayfs) actually yields a trivial
conflict with overlayfs fixes later in the cycle - overlayfs ended
up switching to a variant of lookup_one_len_unlocked() sans the
permission checks. I would've dropped that commit (it gets
overridden on merge from #ovl-fixes in #for-next; proper resolution
is to use the variant in mainline fs/overlayfs/super.c), but I
didn't want to rebase the damn thing - it was fairly late in the
cycle...
- some filesystems had managed to depend on lookup/lookup exclusion
for *fs-internal* data structures in a way that would break if we
relaxed the VFS exclusion. Fixing hadn't been hard, fortunately.
- core of that series - parallel lookup machinery, replacing
->i_mutex with rwsem, making lookup_slow() take it only shared. At
that point lookups happen in parallel; lookups on the same name
wait for the in-progress one to be done with that dentry.
Surprisingly little code, at that - almost all of it is in
fs/dcache.c, with fs/namei.c changes limited to lookup_slow() -
making it use the new primitive and actually switching to locking
shared.
- parallel readdir stuff - first of all, we provide the exclusion on
per-struct file basis, same as we do for read() vs lseek() for
regular files. That takes care of most of the needed exclusion in
readdir/readdir; however, these guys are trickier than lookups, so
I went for switching them one-by-one. To do that, a new method
'->iterate_shared()' is added and filesystems are switched to it
as they are either confirmed to be OK with shared lock on directory
or fixed to be OK with that. I hope to kill the original method
come next cycle (almost all in-tree filesystems are switched
already), but it's still not quite finished.
- several filesystems get switched to parallel readdir. The
interesting part here is dealing with dcache preseeding by readdir;
that needs minor adjustment to be safe with directory locked only
shared.
Most of the filesystems doing that got switched to in those
commits. Important exception: NFS. Turns out that NFS folks, with
their, er, insistence on VFS getting the fuck out of the way of the
Smart Filesystem Code That Knows How And What To Lock(tm) have
grown the locking of their own. They had their own homegrown
rwsem, with lookup/readdir/atomic_open being *writers* (sillyunlink
is the reader there). Of course, with VFS getting the fuck out of
the way, as requested, the actual smarts of the smart filesystem
code etc. had become exposed...
- do_last/lookup_open/atomic_open cleanups. As the result, open()
without O_CREAT locks the directory only shared. Including the
->atomic_open() case. Backmerge from #for-linus in the middle of
that - atomic_open() fix got brought in.
- then comes NFS switch to saner (VFS-based ;-) locking, killing the
homegrown "lookup and readdir are writers" kinda-sorta rwsem. All
exclusion for sillyunlink/lookup is done by the parallel lookups
mechanism. Exclusion between sillyunlink and rmdir is a real rwsem
now - rmdir being the writer.
Result: NFS lookups/readdirs/O_CREAT-less opens happen in parallel
now.
- the rest of the series consists of switching a lot of filesystems
to parallel readdir; in a lot of cases ->llseek() gets simplified
as well. One backmerge in there (again, #for-linus - rockridge
fix)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (74 commits)
ext4: switch to ->iterate_shared()
hfs: switch to ->iterate_shared()
hfsplus: switch to ->iterate_shared()
hostfs: switch to ->iterate_shared()
hpfs: switch to ->iterate_shared()
hpfs: handle allocation failures in hpfs_add_pos()
gfs2: switch to ->iterate_shared()
f2fs: switch to ->iterate_shared()
afs: switch to ->iterate_shared()
befs: switch to ->iterate_shared()
befs: constify stuff a bit
isofs: switch to ->iterate_shared()
get_acorn_filename(): deobfuscate a bit
btrfs: switch to ->iterate_shared()
logfs: no need to lock directory in lseek
switch ecryptfs to ->iterate_shared
9p: switch to ->iterate_shared()
fat: switch to ->iterate_shared()
romfs, squashfs: switch to ->iterate_shared()
more trivial ->iterate_shared conversions
...
Pull crypto update from Herbert Xu:
"API:
- Crypto self tests can now be disabled at boot/run time.
- Add async support to algif_aead.
Algorithms:
- A large number of fixes to MPI from Nicolai Stange.
- Performance improvement for HMAC DRBG.
Drivers:
- Use generic crypto engine in omap-des.
- Merge ppc4xx-rng and crypto4xx drivers.
- Fix lockups in sun4i-ss driver by disabling IRQs.
- Add DMA engine support to ccp.
- Reenable talitos hash algorithms.
- Add support for Hisilicon SoC RNG.
- Add basic crypto driver for the MXC SCC.
Others:
- Do not allocate crypto hash tfm in NORECLAIM context in ecryptfs"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (77 commits)
crypto: qat - change the adf_ctl_stop_devices to void
crypto: caam - fix caam_jr_alloc() ret code
crypto: vmx - comply with ABIs that specify vrsave as reserved.
crypto: testmgr - Add a flag allowing the self-tests to be disabled at runtime.
crypto: ccp - constify ccp_actions structure
crypto: marvell/cesa - Use dma_pool_zalloc
crypto: qat - make adf_vf_isr.c dependant on IOV config
crypto: qat - Fix typo in comments
lib: asn1_decoder - add MODULE_LICENSE("GPL")
crypto: omap-sham - Use dma_request_chan() for requesting DMA channel
crypto: omap-des - Use dma_request_chan() for requesting DMA channel
crypto: omap-aes - Use dma_request_chan() for requesting DMA channel
crypto: omap-des - Integrate with the crypto engine framework
crypto: s5p-sss - fix incorrect usage of scatterlists api
crypto: s5p-sss - Fix missed interrupts when working with 8 kB blocks
crypto: s5p-sss - Use common BIT macro
crypto: mxc-scc - fix unwinding in mxc_scc_crypto_register()
crypto: mxc-scc - signedness bugs in mxc_scc_ablkcipher_req_init()
crypto: talitos - fix ahash algorithms registration
crypto: ccp - Ensure all dependencies are specified
...
You cannot allocate crypto tfm objects in NORECLAIM or NOFS contexts.
The ecryptfs code currently does exactly that for the MD5 tfm.
This patch fixes it by preallocating the MD5 tfm in a safe context.
The MD5 tfm is also reentrant so this patch removes the superfluous
cs_hash_tfm_mutex.
Reported-by: Nicolas Boichat <drinkcat@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Mostly direct substitution with occasional adjustment or removing
outdated comments.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ecryptfs_lookup_interpose should use d_splice_alias(), not d_add()
(and return struct dentry * rather than int). Get rid of
redundant dir_inode argument, while we are touching it...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull vfs updates from Al Viro:
- Preparations of parallel lookups (the remaining main obstacle is the
need to move security_d_instantiate(); once that becomes safe, the
rest will be a matter of rather short series local to fs/*.c
- preadv2/pwritev2 series from Christoph
- assorted fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits)
splice: handle zero nr_pages in splice_to_pipe()
vfs: show_vfsstat: do not ignore errors from show_devname method
dcache.c: new helper: __d_add()
don't bother with __d_instantiate(dentry, NULL)
untangle fsnotify_d_instantiate() a bit
uninline d_add()
replace d_add_unique() with saner primitive
quota: use lookup_one_len_unlocked()
cifs_get_root(): use lookup_one_len_unlocked()
nfs_lookup: don't bother with d_instantiate(dentry, NULL)
kill dentry_unhash()
ceph_fill_trace(): don't bother with d_instantiate(dn, NULL)
autofs4: don't bother with d_instantiate(dentry, NULL) in ->lookup()
configfs: move d_rehash() into configfs_create() for regular files
ceph: don't bother with d_rehash() in splice_dentry()
namei: teach lookup_slow() to skip revalidate
namei: massage lookup_slow() to be usable by lookup_one_len_unlocked()
lookup_one_len_unlocked(): use lookup_dcache()
namei: simplify invalidation logics in lookup_dcache()
namei: change calling conventions for lookup_{fast,slow} and follow_managed()
...
This patch replaces uses of ablkcipher and blkcipher with skcipher,
and the long obsolete hash interface with shash.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).
Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull misc vfs updates from Al Viro:
"All kinds of stuff. That probably should've been 5 or 6 separate
branches, but by the time I'd realized how large and mixed that bag
had become it had been too close to -final to play with rebasing.
Some fs/namei.c cleanups there, memdup_user_nul() introduction and
switching open-coded instances, burying long-dead code, whack-a-mole
of various kinds, several new helpers for ->llseek(), assorted
cleanups and fixes from various people, etc.
One piece probably deserves special mention - Neil's
lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets
called without ->i_mutex and tries to avoid ever taking it. That, of
course, means that it's not useful for any directory modifications,
but things like getting inode attributes in nfds readdirplus are fine
with that. I really should've asked for moratorium on lookup-related
changes this cycle, but since I hadn't done that early enough... I
*am* asking for that for the coming cycle, though - I'm going to try
and get conversion of i_mutex to rwsem with ->lookup() done under lock
taken shared.
There will be a patch closer to the end of the window, along the lines
of the one Linus had posted last May - mechanical conversion of
->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/
inode_is_locked()/inode_lock_nested(). To quote Linus back then:
-----
| This is an automated patch using
|
| sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/'
| sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/'
| sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/'
| sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/'
| sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/'
|
| with a very few manual fixups
-----
I'm going to send that once the ->i_mutex-affecting stuff in -next
gets mostly merged (or when Linus says he's about to stop taking
merges)"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
nfsd: don't hold i_mutex over userspace upcalls
fs:affs:Replace time_t with time64_t
fs/9p: use fscache mutex rather than spinlock
proc: add a reschedule point in proc_readfd_common()
logfs: constify logfs_block_ops structures
fcntl: allow to set O_DIRECT flag on pipe
fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE
fs: xattr: Use kvfree()
[s390] page_to_phys() always returns a multiple of PAGE_SIZE
nbd: use ->compat_ioctl()
fs: use block_device name vsprintf helper
lib/vsprintf: add %*pg format specifier
fs: use gendisk->disk_name where possible
poll: plug an unused argument to do_poll
amdkfd: don't open-code memdup_user()
cdrom: don't open-code memdup_user()
rsxx: don't open-code memdup_user()
mtip32xx: don't open-code memdup_user()
[um] mconsole: don't open-code memdup_user_nul()
[um] hostaudio: don't open-code memdup_user()
...
new method: ->get_link(); replacement of ->follow_link(). The differences
are:
* inode and dentry are passed separately
* might be called both in RCU and non-RCU mode;
the former is indicated by passing it a NULL dentry.
* when called that way it isn't allowed to block
and should return ERR_PTR(-ECHILD) if it needs to be called
in non-RCU mode.
It's a flagday change - the old method is gone, all in-tree instances
converted. Conversion isn't hard; said that, so far very few instances
do not immediately bail out when called in RCU mode. That'll change
in the next commits.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
IS_ERR(_OR_NULL) already contain an 'unlikely' compiler flag and there
is no need to do that again from its callers. Drop it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Jeff Layton <jlayton@poochiereds.net>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Steve French <smfrench@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
a) instead of storing the symlink body (via nd_set_link()) and returning
an opaque pointer later passed to ->put_link(), ->follow_link() _stores_
that opaque pointer (into void * passed by address by caller) and returns
the symlink body. Returning ERR_PTR() on error, NULL on jump (procfs magic
symlinks) and pointer to symlink body for normal symlinks. Stored pointer
is ignored in all cases except the last one.
Storing NULL for opaque pointer (or not storing it at all) means no call
of ->put_link().
b) the body used to be passed to ->put_link() implicitly (via nameidata).
Now only the opaque pointer is. In the cases when we used the symlink body
to free stuff, ->follow_link() now should store it as opaque pointer in addition
to returning it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
that's the bulk of filesystem drivers dealing with inodes of their own
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Convert the following where appropriate:
(1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).
(2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).
(3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
complicated than it appears as some calls should be converted to
d_can_lookup() instead. The difference is whether the directory in
question is a real dir with a ->lookup op or whether it's a fake dir with
a ->d_automount op.
In some circumstances, we can subsume checks for dentry->d_inode not being
NULL into this, provided we the code isn't in a filesystem that expects
d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
use d_inode() rather than d_backing_inode() to get the inode pointer).
Note that the dentry type field may be set to something other than
DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
manages the fall-through from a negative dentry to a lower layer. In such a
case, the dentry type of the negative union dentry is set to the same as the
type of the lower dentry.
However, if you know d_inode is not NULL at the call site, then you can use
the d_is_xxx() functions even in a filesystem.
There is one further complication: a 0,0 chardev dentry may be labelled
DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
intended for special directory entry types that don't have attached inodes.
The following perl+coccinelle script was used:
use strict;
my @callers;
open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
die "Can't grep for S_ISDIR and co. callers";
@callers = <$fd>;
close($fd);
unless (@callers) {
print "No matches\n";
exit(0);
}
my @cocci = (
'@@',
'expression E;',
'@@',
'',
'- S_ISLNK(E->d_inode->i_mode)',
'+ d_is_symlink(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISDIR(E->d_inode->i_mode)',
'+ d_is_dir(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISREG(E->d_inode->i_mode)',
'+ d_is_reg(E)' );
my $coccifile = "tmp.sp.cocci";
open($fd, ">$coccifile") || die $coccifile;
print($fd "$_\n") || die $coccifile foreach (@cocci);
close($fd);
foreach my $file (@callers) {
chomp $file;
print "Processing ", $file, "\n";
system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
die "spatch failed";
}
[AV: overlayfs parts skipped]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now that we never use the backing_dev_info pointer in struct address_space
we can simply remove it and save 4 to 8 bytes in every inode.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reviewed-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCgAGBQJUNtXXAAoJENaSAD2qAscKx1UP/jqt4gm1AYFBpkwBhVRUssIQ
wHckk8QPasIdEGeKvyCKXl88sUDLSsJwf/mUpl8pBfKm64LokP2fmUBU9Pkf9hVU
lcuaFNIEmHh8p1IqcfaFbnZOjuuHc9M2ULQLmo5ShoTHNu6JYAP2zRBMfFrEdcMR
vKh+RCARa5jr1CdwTHX+dH5vJQIQXW/qgRK5G5Z6KeBI766jK2BvZxYHivUgLWuC
dV6K4RzvHHJYEVoXjCUhrgGepGwHlDoEgx/Y0GK9vbFPG38IrfSlN6fgvzV6mMYE
ien8FsuHPv5oiBmFM2byRmWpJtWUOViCbMaMmKqY5Ix21E0lUafA7ixH3nSpOHNZ
b29dxmHnEDomCJXLAF0NQUE84yTw6ITLp2FldUR2o+sidnJsDx/hph/KvmsK6d6P
sDfEN/DtzPluZoXKY0jrRtoAhi0citNTgKfrujmx6baBstxRp7AfwGcP4skXJq7w
wkg0Seo449CUaBJK9A4s7nIjMBQ6/3hjvF/NVcry1aY+/RyhJDz6uFrtnOEqW3So
6khwJOOcRkmLLXrk+DzvthDRCVNJhWM80cB5/UBBfVG2Kk3PHa86Jfp5Y4teWugx
zyoacEWiitKcK4Po8skZpLd2vUwny/cD0qS43LT+SMvgRsYZ4XdUnceoY8FkOnLr
ooIFKHCR/+2XVnAaC101
=mjS7
-----END PGP SIGNATURE-----
Merge tag 'ecryptfs-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs
Pull eCryptfs updates from Tyler Hicks:
"Minor code cleanups and a fix for when eCryptfs metadata is stored in
xattrs"
* tag 'ecryptfs-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
ecryptfs: remove unneeded buggy code in ecryptfs_do_create()
ecryptfs: avoid to access NULL pointer when write metadata in xattr
ecryptfs: remove unnecessary break after goto
ecryptfs: Remove unnecessary include of syscall.h in keystore.c
fs/ecryptfs/messaging.c: remove null test before kfree
ecryptfs: Drop cast
Use %pd in eCryptFS
There is a bug in error handling of lock_parent() in ecryptfs_do_create():
lock_parent() acquries mutex even if dget_parent() fails, so mutex should be unlocked anyway.
But dget_parent() does not fail, so the patch just removes unneeded buggy code.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
This patch does away with cast on void * and the if as it is unnecessary.
The following Coccinelle semantic patch was used for making the change:
@r@
expression x;
void* e;
type T;
identifier f;
@@
(
*((T *)e)
|
((T *)x)[...]
|
((T *)x)->f
|
- (T *)
e
)
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Add new renameat2 syscall, which is the same as renameat with an added
flags argument.
Pass flags to vfs_rename() and to i_op->rename() as well.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reviewed-by: J. Bruce Fields <bfields@redhat.com>
If ecryptfs_readlink_lower() fails, buf remains an uninitialized
pointer and passing it nd_set_link() won't do anything good.
Fixed by switching ecryptfs_readlink_lower() to saner API - make it
return buf or ERR_PTR(...) and update callers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Use the new %pd printk() specifier in eCryptFS to replace passing of dentry
name or dentry name and name length * 2 with just passing the dentry.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: ecryptfs@vger.kernel.org
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
NFSv4 uses leases to guarantee that clients can cache metadata as well
as data.
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: David Howells <dhowells@redhat.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We need to break delegations on any operation that changes the set of
links pointing to an inode. Start with unlink.
Such operations also hold the i_mutex on a parent directory. Breaking a
delegation may require waiting for a timeout (by default 90 seconds) in
the case of a unresponsive NFS client. To avoid blocking all directory
operations, we therefore drop locks before waiting for the delegation.
The logic then looks like:
acquire locks
...
test for delegation; if found:
take reference on inode
release locks
wait for delegation break
drop reference on inode
retry
It is possible this could never terminate. (Even if we take precautions
to prevent another delegation being acquired on the same inode, we could
get a different inode on each retry.) But this seems very unlikely.
The initial test for a delegation happens after the lock on the target
inode is acquired, but the directory inode may have been acquired
further up the call stack. We therefore add a "struct inode **"
argument to any intervening functions, which we use to pass the inode
back up to the caller in the case it needs a delegation synchronously
broken.
Cc: David Howells <dhowells@redhat.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Dustin Kirkland <dustin.kirkland@gazzang.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The code cleanups fix up W=1 compiler warnings and some unnecessary checks. The
new Kconfig option, defaulting to N, allows the rarely used eCryptfs kernel to
userspace communication channel to be compiled out. This may be the first step
in it being eventually removed.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABCgAGBQJRN+Z7AAoJENaSAD2qAscKr00P/0/sNgen9e5zqe1q+CAj6hW0
ynzWY/ZNk905hU6tmYb/rHwt7DfaSmrZuzypZP9sGbu+q9RITLl65Hm9HGEJuvJA
fK0UHejcAMQmf+AGZiiMs0SB4B4z+eAzUQTWZsX22C1u+3zyI5xLs1NBquKwDyeq
5sNbcmzQYn4w04xag3yYVQEow0NeIjjuCUc8gNUPctDQldN9DdFTdwFTar5lvC0s
V4qPWqa61mS9xtegryWAw4DNKjUIrZZFFupWPqRYDVYK8N+RQRBL1RWGVRFCJ17j
Ho8yi2onPFGt2y/kW6MwsC41wWFk0Mxsfxf/ZaBMm3lpfYM8UbGQJ6+V9wQWOokU
kioUcTI0WvK999mRLxUNkXuVuNDv0OUysgtALy5bevfneWrfXxoSKq+MPbyNfC7+
mo2BCIyHLXn7BYhzPTU+XfksPfMneYUi5LWf4Km5XYXlZ8rwk3IKvJQFyVThEv8+
peVvwSwblUHaoQLnFhEVeu4olHO6AdVQtwr53HPgpMPaZj2/vaWQNA4+bu5HZHTG
wqBmdo4DH4jgd9D8xiMZMIJTik8j9aUmpntc4eR7RJEKSice4+X1fUXL4n4N4NfD
FkYjWCUZI6nkFUGhGDCokCjzZ3GTEzbe+4pNi3ycTnywcOXFSoq2Kx+tNzE4zXBs
FlWGJYrCub9UOLwoYV2C
=XwgS
-----END PGP SIGNATURE-----
Merge tag 'ecryptfs-3.9-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs
Pull ecryptfs fixes from Tyler Hicks:
"Minor code cleanups and new Kconfig option to disable /dev/ecryptfs
The code cleanups fix up W=1 compiler warnings and some unnecessary
checks. The new Kconfig option, defaulting to N, allows the rarely
used eCryptfs kernel to userspace communication channel to be compiled
out. This may be the first step in it being eventually removed."
Hmm. I'm not sure whether these should be called "fixes", and it
probably should have gone in the merge window. But I'll let it slide.
* tag 'ecryptfs-3.9-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
eCryptfs: allow userspace messaging to be disabled
eCryptfs: Fix redundant error check on ecryptfs_find_daemon_by_euid()
ecryptfs: ecryptfs_msg_ctx_alloc_to_free(): remove kfree() redundant null check
eCryptfs: decrypt_pki_encrypted_session_key(): remove kfree() redundant null check
eCryptfs: remove unneeded checks in virt_to_scatterlist()
eCryptfs: Fix -Wmissing-prototypes warnings
eCryptfs: Fix -Wunused-but-set-variable warnings
eCryptfs: initialize payload_len in keystore.c
After calling into the lower filesystem to do a rename, the lower target
inode's attributes were not copied up to the eCryptfs target inode. This
resulted in the eCryptfs target inode staying around, rather than being
evicted, because i_nlink was not updated for the eCryptfs inode. This
also meant that eCryptfs didn't do the final iput() on the lower target
inode so it stayed around, as well. This would result in a failure to
free up space occupied by the target file in the rename() operation.
Both target inodes would eventually be evicted when the eCryptfs
filesystem was unmounted.
This patch calls fsstack_copy_attr_all() after the lower filesystem
does its ->rename() so that important inode attributes, such as i_nlink,
are updated at the eCryptfs layer. ecryptfs_evict_inode() is now called
and eCryptfs can drop its final reference on the lower inode.
http://launchpad.net/bugs/561129
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Cc: <stable@vger.kernel.org> [2.6.39+]
eCryptfs mount options do not
- Cleanups in the messaging code
- Better handling of empty files in the lower filesystem to improve usability.
Failed file creations are now cleaned up and empty lower files are converted
into eCryptfs during open().
- The write-through cache changes are being reverted due to bugs that are not
easy to fix. Stability outweighs the performance enhancements here.
- Improvement to the mount code to catch unsupported ciphers specified in the
mount options
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABCgAGBQJQGhVWAAoJENaSAD2qAscKjvcQAMgF66sl8KjwzFCgojh30a4u
1hCXltxUCLjXxjXxyygUHb/pH0xdvC+Ss3FTtsPXAjgYm2lXjjoOmVG8WAvwHHx1
2ZjDo+8fQ4XA8Rl9kYuvt/abF0IssNRK3csTWplR7lpoQ8AWbkpkag1I4WZhibey
cgs/zECl8ACTJ5zQ+AyRGnrssq4jI7xZAKWLK0+KKk7S9yIRI7K/xdz1xK39jGK6
N09Dw3VWY/bMcMq77ZXBtyHdP7KR7wKUtCeQttmCvdf20Ocy6AXzr1FRKvUxionF
Sf31tJim0u9OO8hmy8cjCyWEy9LHnXnSd/5vn+Qd9ok9GvuiYmKw07rbXi/gjhBX
ai5PKtl05WiQgp80BybUYfIY1Hq71MsppNi6h9Zgiid5rEvWWvCBWBWP95G8DTmC
6TwLaCG0rh8uuZyeiVrs3xZQ2IG5Zmu0CX3XGyfsaLvqmQWhtT5ZQVMeMQEEBxyQ
ur9SSU2O/nC8ceLB7fzGmZPTLZUWOuYQnd24NJNK+7j0P+Km7pqmDYpCdwmFpx3C
CQ0gGaJGHeycvBF327bwxPmsPdO4fy+nmEL8vrEXPTyQ3ZVAPvxZK9t8Jpk4UFOl
JSTWFiK0mvgE+5dX2kB0nzN7iD7hcMgmozht50qQP3OUkI1kkBrEHgHVO3KzgUIA
+aRAljeLdelq44JBxjiB
=4vDd
-----END PGP SIGNATURE-----
Merge tag 'ecryptfs-3.6-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs
Pull ecryptfs fixes from Tyler Hicks:
- Fixes a bug when the lower filesystem mount options include 'acl',
but the eCryptfs mount options do not
- Cleanups in the messaging code
- Better handling of empty files in the lower filesystem to improve
usability. Failed file creations are now cleaned up and empty lower
files are converted into eCryptfs during open().
- The write-through cache changes are being reverted due to bugs that
are not easy to fix. Stability outweighs the performance
enhancements here.
- Improvement to the mount code to catch unsupported ciphers specified
in the mount options
* tag 'ecryptfs-3.6-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
eCryptfs: check for eCryptfs cipher support at mount
eCryptfs: Revert to a writethrough cache model
eCryptfs: Initialize empty lower files when opening them
eCryptfs: Unlink lower inode when ecryptfs_create() fails
eCryptfs: Make all miscdev functions use daemon ptr in file private_data
eCryptfs: Remove unused messaging declarations and function
eCryptfs: Copy up POSIX ACL and read-only flags from lower mount
* ->lookup() never gets hit with . or ..
* dentry it gets is unhashed, so unless we had gone and hashed it ourselves, there's
no need to d_drop() the sucker.
* wrong name printed in one of the printks (NULL, in fact)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>