linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-12 07:06:40 +07:00

Author	SHA1	Message	Date
Al Viro	31956502dd	namei: make may_follow_link() safe in RCU mode We can't call that audit garbage in RCU mode - it's doing a weird mix of allocations (GFP_NOFS, immediately followed by GFP_KERNEL) and I'm not touching that... thing again. So if this security sclero^Whardening feature gets triggered when we are in RCU mode, tough - we'll fail with -ECHILD and have everything restarted in non-RCU mode. Only to hit the same test and fail, this time with EACCES and with (oh, rapture) an audit spew produced. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:13:13 -04:00
Al Viro	6548fae2ec	namei: make put_link() RCU-safe very simple - just make path_put() conditional on !RCU. Note that right now it doesn't get called in RCU mode - we leave it before getting anything into stack. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:13:13 -04:00
Al Viro	5f2c4179e1	switch ->put_link() from dentry to inode only one instance looks at that argument at all; that sole exception wants inode rather than dentry. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:13:12 -04:00
NeilBrown	bda0be7ad9	security: make inode_follow_link RCU-walk aware inode_follow_link now takes an inode and rcu flag as well as the dentry. inode is used in preference to d_backing_inode(dentry), particularly in RCU-walk mode. selinux_inode_follow_link() gets dentry_has_perm() and inode_has_perm() open-coded into it so that it can call avc_has_perm_flags() in way that is safe if LOOKUP_RCU is set. Calling avc_has_perm_flags() with rcu_read_lock() held means that when avc_has_perm_noaudit calls avc_compute_av(), the attempt to rcu_read_unlock() before calling security_compute_av() will not actually drop the RCU read-lock. However as security_compute_av() is completely in a read_lock()ed region, it should be safe with the RCU read-lock held. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:13:11 -04:00
Al Viro	181548c051	namei: pick_link() callers already have inode no need to refetch (and once we move unlazy out of there, recheck ->d_seq). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:13:10 -04:00
David Howells	63afdfc781	VFS: Handle lower layer dentry/inode in pathwalk Make use of d_backing_inode() in pathwalk to gain access to an inode or dentry that's on a lower layer. Signed-off-by: David Howells <dhowells@redhat.com>	2015-05-11 08:13:10 -04:00
Al Viro	237d8b327a	namei: store inode in nd->stack[] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:13:09 -04:00
Al Viro	254cf58212	namei: don't mangle nd->seq in lookup_fast() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:13:09 -04:00
Al Viro	6e9918b7b3	namei: explicitly pass seq number to unlazy_walk() when dentry != NULL Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:13:09 -04:00
Al Viro	3595e2346c	link_path_walk: use explicit returns for failure exits Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:13:08 -04:00
Al Viro	deb106c632	namei: lift terminate_walk() all the way up Lift it from link_path_walk(), trailing_symlink(), lookup_last(), mountpoint_last(), complete_walk() and do_last(). A _lot_ of those suckers merge. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:13:08 -04:00
Al Viro	3bdba28b72	namei: lift link_path_walk() call out of trailing_symlink() Make trailing_symlink() return the pathname to traverse or ERR_PTR(-E...). A subtle point is that for "magic" symlinks it returns "" now - that leads to link_path_walk("", nd), which is immediately returning 0 and we are back to the treatment of the last component, at whereever the damn thing has left us. Reduces the stack footprint - link_path_walk() called on more shallow stack now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:12:57 -04:00
Al Viro	368ee9ba56	namei: path_init() calling conventions change * lift link_path_walk() into callers; moving it down into path_init() had been a mistake. Stack footprint, among other things... * do _not_ call path_cleanup() after path_init() failure; on all failure exits out of it we have nothing for path_cleanup() to do * have path_init() return pathname or ERR_PTR(-E...) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:10:41 -04:00
Al Viro	34a26b99b7	namei: get rid of nameidata->base we can do fdput() under rcu_read_lock() just fine; all we need to take care of is fetching nd->inode value first. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:05:05 -04:00
Al Viro	8bcb77fabd	namei: split off filename_lookupat() with LOOKUP_PARENT new functions: filename_parentat() and path_parentat() resp. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:20 -04:00
Al Viro	b5cd339762	namei: may_follow_link() - lift terminate_walk() on failures into caller Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:20 -04:00
Al Viro	ab10492345	namei: take increment of nd->depth into pick_link() Makes the situation much more regular - we avoid a strange state when the element just after the top of stack is used to store struct path of symlink, but isn't counted in nd->depth. This is much more regular, so the normal failure exits, etc., work fine. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:19 -04:00
Al Viro	1cf2665b5b	namei: kill nd->link Just store it in nd->stack[nd->depth].link right in pick_link(). Now that we make sure of stack expansion in pick_link(), we can do so... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:19 -04:00
Al Viro	fec2fa24e8	may_follow_link(): trim arguments Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:18 -04:00
Al Viro	cd179f4468	namei: move bumping the refcount of link->mnt into pick_link() update the failure cleanup in may_follow_link() to match that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:18 -04:00
Al Viro	e8bb73dfb0	namei: fold put_link() into the failure case of complete_walk() ... and don't open-code unlazy_walk() in there - the only reason for that is to avoid verfication of cached nd->root, which is trivially avoided by discarding said cached nd->root first. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:17 -04:00
Al Viro	fab51e8ab2	namei: take the treatment of absolute symlinks to get_link() rather than letting the callers handle the jump-to-root part of semantics, do it right in get_link() and return the rest of the body for the caller to deal with - at that point it's treated the same way as relative symlinks would be. And return NULL when there's no "rest of the body" - those are treated the same as pure jump symlink would be. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:17 -04:00
Al Viro	4f697a5e17	namei: simpler treatment of symlinks with nothing other that / in the body Instead of saving name and branching to OK:, where we'll immediately restore it, and call walk_component() with WALK_PUT\|WALK_GET and nd->last_type being LAST_BIND, which is equivalent to put_link(nd), err = 0, we can just treat that the same way we'd treat procfs-style "jump" symlinks - do put_link(nd) and move on. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:16 -04:00
Al Viro	6920a4405e	namei: simplify failure exits in get_link() when cookie is NULL, put_link() is equivalent to path_put(), so as soon as we'd set last->cookie to NULL, we can bump nd->depth and let the normal logics in terminate_walk() to take care of cleanups. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:16 -04:00
Al Viro	6e77137b36	don't pass nameidata to ->follow_link() its only use is getting passed to nd_jump_link(), which can obtain it from current->nameidata Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:15 -04:00
Al Viro	8402752ecf	namei: simplify the callers of follow_managed() now that it gets nameidata, no reason to have setting LOOKUP_JUMPED on mountpoint crossing and calling path_put_conditional() on failures done in every caller. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:15 -04:00
NeilBrown	756daf263e	VFS: replace {, total_}link_count in task_struct with pointer to nameidata task_struct currently contains two ad-hoc members for use by the VFS: link_count and total_link_count. These are only interesting to fs/namei.c, so exposing them explicitly is poor layering. Incidentally, link_count isn't used anymore, so it can just die. This patches replaces those with a single pointer to 'struct nameidata'. This structure represents the current filename lookup of which there can only be one per process, and is a natural place to store total_link_count. This will allow the current "nameidata" argument to all follow_link operations to be removed as current->nameidata can be used instead in the _very_ few instances that care about it at all. As there are occasional circumstances where pathname lookup can recurse, such as through kern_path_locked, we always save and old current->nameidata (if there is one) when setting a new value, and make sure any active link_counts are preserved. follow_mount and follow_automount now get a 'struct nameidata *' rather than 'int flags' so that they can directly access total_link_count, rather than going through 'current'. Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:14 -04:00
Al Viro	626de99676	namei: move link count check and stack allocation into pick_link() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:13 -04:00
Al Viro	d63ff28f0f	namei: make should_follow_link() store the link in nd->link ... if it decides to follow, that is. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:13 -04:00
Al Viro	4693a547cd	namei: new calling conventions for walk_component() instead of a single flag (!= 0 => we want to follow symlinks) pass two bits - WALK_GET (want to follow symlinks) and WALK_PUT (put_link() once we are done looking at the name). The latter matters only for success exits - on failure the caller will discard everything anyway. Suggestions for better variant are welcome; what this thing aims for is making sure that pending put_link() is done before walk_component() decides to pick a symlink up, rather than between picking it up and acting upon it. See the next commit for payoff. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:12 -04:00
Al Viro	8620c238ed	link_path_walk: move the OK: inside the loop fewer labels that way; in particular, resuming after the end of nested symlink is straight-line. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:12 -04:00
Al Viro	1543972678	namei: have terminate_walk() do put_link() on everything left All callers of terminate_walk() are followed by more or less open-coded eqiuvalent of "do put_link() on everything left in nd->stack". Better done in terminate_walk() itself, and when we go for RCU symlink traversal we'll have to do it there anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:11 -04:00
Al Viro	191d7f73e2	namei: take put_link() into {lookup,mountpoint,do}_last() rationale: we'll need to have terminate_walk() do put_link() on everything, which will mean that in some cases ..._last() will do put_link() anyway. Easier to have them do it in all cases. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:11 -04:00
Al Viro	1bc4b813e8	namei: lift (open-coded) terminate_walk() into callers of get_link() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:10 -04:00
Al Viro	f0a9ba7021	lift terminate_walk() into callers of walk_component() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:10 -04:00
Al Viro	70291aecc6	namei: lift (open-coded) terminate_walk() in follow_dotdot_rcu() into callers follow_dotdot_rcu() does an equivalent of terminate_walk() on failure; shifting it into callers makes for simpler rules and those callers already have terminate_walk() on other failure exits. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:09 -04:00
Al Viro	e269f2a73f	namei: we never need more than MAXSYMLINKS entries in nd->stack The only reason why we needed one more was that purely nested MAXSYMLINKS symlinks could lead to path_init() using that many entries in addition to nd->stack[0] which it left unused. That can't happen now - path_init() starts with entry 0 (and trailing_symlink() is called only when we'd already encountered one symlink, so no more than MAXSYMLINKS-1 are left). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:08 -04:00
Al Viro	8eff733a45	link_path_walk: end of nd->depth massage get rid of orig_depth - we only use it on error exit to tell whether to stop doing put_link() when depth reaches 0 (call from path_init()) or when it reaches 1 (call from trailing_symlink()). However, in the latter case the caller would immediately follow with one more put_link(). Just keep doing it until the depth reaches zero (and simplify trailing_symlink() as the result). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:08 -04:00
Al Viro	939724df56	link_path_walk: nd->depth massage, part 10 Get rid of orig_depth checks in OK: logics. If nd->depth is zero, we had been called from path_init() and we are done. If it is greater than 1, we are not done, whether we'd been called from path_init() or trailing_symlink(). And in case when it's 1, we might have been called from path_init() and reached the end of nested symlink (in which case nd->stack[0].name will point to the rest of pathname and we are not done) or from trailing_symlink(), in which case we are done. Just have trailing_symlink() leave NULL in nd->stack[0].name and use that to discriminate between those cases. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:06 -04:00
Al Viro	dc7af8dc05	link_path_walk: nd->depth massage, part 9 Make link_path_walk() work with any value of nd->depth on entry - memorize it and use it in tests instead of comparing with 1. Don't bother with increment/decrement in path_init(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:06 -04:00
Al Viro	21c3003d36	put_link: nd->depth massage, part 8 all calls are preceded by decrement of nd->depth; move it into put_link() itself. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:05 -04:00
Al Viro	9ea57b72bf	trailing_symlink: nd->depth massage, part 7 move decrement of nd->depth on successful returns into the callers. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:05 -04:00
Al Viro	0fd889d59e	get_link: nd->depth massage, part 6 make get_link() increment nd->depth on successful exit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:04 -04:00
Al Viro	f7df08ee05	trailing_symlink: nd->depth massage, part 5 move increment of ->depth to the point where we'd discovered that get_link() has not returned an error, adjust exits accordingly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:04 -04:00
Al Viro	ef1a3e7b96	link_path_walk: nd->depth massage, part 4 lift increment/decrement into link_path_walk() callers. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:03 -04:00
Al Viro	da4e0be04d	link_path_walk: nd->depth massage, part 3 remove decrement/increment surrounding nd_alloc_stack(), adjust the test in it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:03 -04:00
Al Viro	fd4620bbdf	link_path_walk: nd->depth massage, part 2 collapse adjacent increment/decrement pairs. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:02 -04:00
Al Viro	071bf50137	link_path_walk: nd->depth massage, part 1 nd->stack[0] is unused until the handling of trailing symlinks and we want to get rid of that. Having fucked that transformation up several times, I went for bloody pedantic series of provably equivalent transformations. Sorry. Step 1: keep nd->depth higher by one in link_path_walk() - increment upon entry, decrement on exits, adjust the arithmetics inside and surround the calls of functions that care about nd->depth value (nd_alloc_stack(), get_link(), put_link()) with decrement/increment pairs. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:02 -04:00
Al Viro	894bc8c466	namei: remove restrictions on nesting depth The only restriction is that on the total amount of symlinks crossed; how they are nested does not matter Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:01 -04:00
Al Viro	3b2e7f7539	namei: trim the arguments of get_link() same story as the previous commit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:01 -04:00
Al Viro	b9ff44293c	namei: trim redundant arguments of fs/namei.c:put_link() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:00 -04:00
Al Viro	1d8e03d359	namei: trim redundant arguments of trailing_symlink() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:00 -04:00
Al Viro	697fc6ca66	namei: move link/cookie pairs into nameidata Array of MAX_NESTED_LINKS + 1 elements put into nameidata; what used to be a local array in link_path_walk() occupies entries 1 .. MAX_NESTED_LINKS in it, link and cookie from the trailing symlink handling loops - entry 0. This is _not_ the final arrangement; just an easily verified incremental step. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:59 -04:00
Al Viro	9e18f10a30	link_path_walk: cleanup - turn goto start; into continue; Deal with skipping leading slashes before what used to be the recursive call. That way we can get rid of that goto completely. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:59 -04:00
Al Viro	07681481b8	link_path_walk: split "return from recursive call" path Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:58 -04:00
Al Viro	32cd74685c	link_path_walk: kill the recursion absolutely straightforward now - the only variables we need to preserve across the recursive call are name, link and cookie, and recursion depth is limited (and can is equal to nd->depth). So arrange an array of triples to hold instances of those and be done with that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:58 -04:00
Al Viro	bdf6cbf179	link_path_walk: final preparations to killing recursion reduce the number of returns in there - turn all places where it returns zero into goto OK and places where it returns non-zero into goto Err. The only non-trivial detail is that all breaks in the loop are guaranteed to be with non-zero err. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:57 -04:00
Al Viro	bb8603f8e1	link_path_walk: get rid of duplication What we do after the second walk_component() + put_link() + depth decrement in there is exactly equivalent to what's done right after the first walk_component(). Easy to verify and not at all surprising, seeing that there we have just walked the last component of nested symlink. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:57 -04:00
Al Viro	48c8b0c571	link_path_walk: massage a bit more Pull the block after the if-else in the end of what used to be do-while body into all branches there. We are almost done with the massage... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:56 -04:00
Al Viro	d40bcc09ab	link_path_walk: turn inner loop into explicit goto Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:56 -04:00
Al Viro	12b0957800	link_path_walk: don't bother with walk_component() after jumping link ... it does nothing if nd->last_type is LAST_BIND. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:55 -04:00
Al Viro	b0c24c3bdf	link_path_walk: handle get_link() returning ERR_PTR() immediately If we get ERR_PTR() from get_link(), we are guaranteed to get err != 0 when we break out of do-while, so we are going to hit if (err) return err; shortly after it. Pull that into the if (IS_ERR(s)) body. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:55 -04:00
Al Viro	95fa25d9f2	namei: rename follow_link to trailing_symlink, move it down Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:54 -04:00
Al Viro	21fef2176e	namei: move the calls of may_follow_link() into follow_link() All remaining callers of the former are preceded by the latter Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:53 -04:00
Al Viro	172a39a059	namei: expand the call of follow_link() in link_path_walk() ... and strip __always_inline from follow_link() - remaining callers don't need that. Now link_path_walk() recursion is a direct one. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:53 -04:00
Al Viro	5a460275ef	namei: expand nested_symlink() in its only caller Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:52 -04:00
Al Viro	896475d5bd	do_last: move path there from caller's stack frame We used to need it to feed to follow_link(). No more... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:52 -04:00
Al Viro	caa8563443	namei: introduce nameidata->link shares space with nameidata->next, walk_component() et.al. store the struct path of symlink instead of returning it into a variable passed by caller. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:51 -04:00
Al Viro	d4dee48bad	namei: don't bother with ->follow_link() if ->i_link is set with new calling conventions it's trivial Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Conflicts: fs/namei.c	2015-05-10 22:19:51 -04:00
Al Viro	0a959df54b	namei.c: separate the parts of follow_link() that find the link body Split a piece of fs/namei.c:follow_link() that does obtaining the link body into a separate function. follow_link() itself is converted to calling get_link() and then doing the body traversal (if any). The next step will expand follow_link() call in link_path_walk() and this helps to keep the size down... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:50 -04:00
Al Viro	680baacbca	new ->follow_link() and ->put_link() calling conventions a) instead of storing the symlink body (via nd_set_link()) and returning an opaque pointer later passed to ->put_link(), ->follow_link() _stores_ that opaque pointer (into void * passed by address by caller) and returns the symlink body. Returning ERR_PTR() on error, NULL on jump (procfs magic symlinks) and pointer to symlink body for normal symlinks. Stored pointer is ignored in all cases except the last one. Storing NULL for opaque pointer (or not storing it at all) means no call of ->put_link(). b) the body used to be passed to ->put_link() implicitly (via nameidata). Now only the opaque pointer is. In the cases when we used the symlink body to free stuff, ->follow_link() now should store it as opaque pointer in addition to returning it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:45 -04:00
Al Viro	46afd6f61c	namei: lift nameidata into filename_mountpoint() when we go for on-demand allocation of saved state in link_path_walk(), we'll want nameidata to stay around for all 3 calls of path_mountpoint(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:18:33 -04:00
Al Viro	f5beed755b	name: shift nameidata down into user_path_walk() that avoids having nameidata on stack during the calls of ->rmdir()/->unlink() and two of those during the calls of ->rename(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:18:32 -04:00
Al Viro	6a9f40d610	namei: get rid of lookup_hash() it's a convenient helper, but we'll want to shift nameidata down the call chain, so it won't be available there... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:18:32 -04:00
Al Viro	a5cfe2d5e1	do_last: regularize the logics around following symlinks With LOOKUP_FOLLOW we unlazy and return 1; without it we either fail with ELOOP or, for O_PATH opens, succeed. No need to mix those cases... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:18:31 -04:00
Al Viro	fd2805be23	do_last: kill symlink_ok When O_PATH is present, O_CREAT isn't, so symlink_ok is always equal to (open_flags & O_PATH) && !(nd->flags & LOOKUP_FOLLOW). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:18:30 -04:00
Al Viro	f488443d1d	namei: take O_NOFOLLOW treatment into do_last() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:18:30 -04:00
Al Viro	34b128f31c	uninline walk_component() seriously improves the stack and I-cache footprint... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:18:29 -04:00
NeilBrown	37882db054	SECURITY: remove nameidata arg from inode_follow_link. No ->inode_follow_link() methods use the nameidata arg, and it is about to become private to namei.c. So remove from all inode_follow_link() functions. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:18:29 -04:00
Al Viro	f15133df08	path_openat(): fix double fput() path_openat() jumps to the wrong place after do_tmpfile() - it has already done path_cleanup() (as part of path_lookupat() called by do_tmpfile()), so doing that again can lead to double fput(). Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-09 00:12:48 -04:00
Al Viro	766c4cbfac	namei: d_is_negative() should be checked before ->d_seq validation Fetching ->d_inode, verifying ->d_seq and finding d_is_negative() to be true does not mean that inode we'd fetched had been NULL - that holds only while ->d_seq is still unchanged. Shift d_is_negative() checks into lookup_fast() prior to ->d_seq verification. Reported-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-09 00:12:35 -04:00
Al Viro	3cab989afd	RCU pathwalk breakage when running into a symlink overmounting something Calling unlazy_walk() in walk_component() and do_last() when we find a symlink that needs to be followed doesn't acquire a reference to vfsmount. That's fine when the symlink is on the same vfsmount as the parent directory (which is almost always the case), but it's not always true - one _can_ manage to bind a symlink on top of something. And in such cases we end up with excessive mntput(). Cc: stable@vger.kernel.org # since 2.6.39 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-24 15:52:14 -04:00
David Howells	4bbcbd3b11	VFS: Make pathwalk use d_is_reg() rather than S_ISREG() Make pathwalk use d_is_reg() rather than S_ISREG() to determine whether to honour O_TRUNC. Since this occurs after complete_walk(), the dentry type field cannot change and the inode pointer cannot change as we hold a ref on the dentry, so this should be safe. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-15 15:05:30 -04:00
David Howells	698934df8b	VFS: Combine inode checks with d_is_negative() and d_is_positive() in pathwalk Where we have: if (!dentry->d_inode \|\| d_is_negative(dentry)) { type constructions in pathwalk we should be able to eliminate the check of d_inode and rely solely on the result of d_is_negative() or d_is_positive(). What we do have to take care to do is to read d_inode after calling a d_is_xxx() typecheck function to get the barriering right. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-15 15:05:29 -04:00
Al Viro	9e7543e939	remove incorrect comment in lookup_one_len() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:24:30 -04:00
Al Viro	74eb8cc5a5	namei.c: fold do_path_lookup() into both callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:24:30 -04:00
Al Viro	fd2f7cb5bc	kill struct filename.separate just make const char iname[] the last member and compare name->name with name->iname instead of checking name->separate We need to make sure that out-of-line name doesn't end up allocated adjacent to struct filename refering to it; fortunately, it's easy to achieve - just allocate that struct filename with one byte in ->iname[], so that ->iname[0] will be inside the same object and thus have an address different from that of out-of-line name [spotted by Boqun Feng <boqun.feng@gmail.com>] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:21:24 -04:00
Al Viro	6e8a1f8741	switch path_init() to struct filename Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-03-24 17:19:16 -04:00
Al Viro	668696dcbb	switch path_mountpoint() to struct filename Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-03-24 17:19:15 -04:00
Al Viro	5eb6b495c6	switch path_lookupat() to struct filename all callers were passing it ->name of some struct filename Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-03-24 17:19:15 -04:00
Al Viro	94b5d2621a	getname_flags(): clean up a bit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-03-24 17:19:14 -04:00
David Howells	e36cb0b89c	VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_(dentry) Convert the following where appropriate: (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry). (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry). (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more complicated than it appears as some calls should be converted to d_can_lookup() instead. The difference is whether the directory in question is a real dir with a ->lookup op or whether it's a fake dir with a ->d_automount op. In some circumstances, we can subsume checks for dentry->d_inode not being NULL into this, provided we the code isn't in a filesystem that expects d_inode to be NULL if the dirent really is* negative (ie. if we're going to use d_inode() rather than d_backing_inode() to get the inode pointer). Note that the dentry type field may be set to something other than DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS manages the fall-through from a negative dentry to a lower layer. In such a case, the dentry type of the negative union dentry is set to the same as the type of the lower dentry. However, if you know d_inode is not NULL at the call site, then you can use the d_is_xxx() functions even in a filesystem. There is one further complication: a 0,0 chardev dentry may be labelled DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was intended for special directory entry types that don't have attached inodes. The following perl+coccinelle script was used: use strict; my @callers; open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' \|') \|\| die "Can't grep for S_ISDIR and co. callers"; @callers = <$fd>; close($fd); unless (@callers) { print "No matches\n"; exit(0); } my @cocci = ( '@@', 'expression E;', '@@', '', '- S_ISLNK(E->d_inode->i_mode)', '+ d_is_symlink(E)', '', '@@', 'expression E;', '@@', '', '- S_ISDIR(E->d_inode->i_mode)', '+ d_is_dir(E)', '', '@@', 'expression E;', '@@', '', '- S_ISREG(E->d_inode->i_mode)', '+ d_is_reg(E)' ); my $coccifile = "tmp.sp.cocci"; open($fd, ">$coccifile") \|\| die $coccifile; print($fd "$_\n") \|\| die $coccifile foreach (@cocci); close($fd); foreach my $file (@callers) { chomp $file; print "Processing ", $file, "\n"; system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 \|\| die "spatch failed"; } [AV: overlayfs parts skipped] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-02-22 11:38:41 -05:00
Paul Moore	55422d0bd2	audit: replace getname()/putname() hacks with reference counters In order to ensure that filenames are not released before the audit subsystem is done with the strings there are a number of hacks built into the fs and audit subsystems around getname() and putname(). To say these hacks are "ugly" would be kind. This patch removes the filename hackery in favor of a more conventional reference count based approach. The diffstat below tells most of the story; lots of audit/fs specific code is replaced with a traditional reference count based approach that is easily understood, even by those not familiar with the audit and/or fs subsystems. CC: viro@zeniv.linux.org.uk CC: linux-fsdevel@vger.kernel.org Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-01-23 00:23:58 -05:00
Paul Moore	fd3522fdc8	audit: enable filename recording via getname_kernel() Enable recording of filenames in getname_kernel() and remove the kludgy workaround in __audit_inode() now that we have proper filename logging for kernel users. CC: viro@zeniv.linux.org.uk CC: linux-fsdevel@vger.kernel.org Signed-off-by: Paul Moore <pmoore@redhat.com> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-01-23 00:23:52 -05:00
Al Viro	cbaab2db91	simpler calling conventions for filename_mountpoint() a) make it accept ERR_PTR() as filename (and return its PTR_ERR() in that case) b) make it putname() the sucker in the end otherwise simplifies life for callers... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-01-23 00:22:21 -05:00
Paul Moore	5168910413	fs: create proper filename objects using getname_kernel() There are several areas in the kernel that create temporary filename objects using the following pattern: int func(const char name) { struct filename file = { .name = name }; ... return 0; } ... which for the most part works okay, but it causes havoc within the audit subsystem as the filename object does not persist beyond the lifetime of the function. This patch converts all of these temporary filename objects into proper filename objects using getname_kernel() and putname() which ensure that the filename object persists until the audit subsystem is finished with it. Also, a special thanks to Al Viro, Guenter Roeck, and Sabrina Dubroca for helping resolve a difficult kernel panic on boot related to a use-after-free problem in kern_path_create(); the thread can be seen at the link below: * https://lkml.org/lkml/2015/1/20/710 This patch includes code that was either based on, or directly written by Al in the above thread. CC: viro@zeniv.linux.org.uk CC: linux@roeck-us.net CC: sd@queasysnail.net CC: linux-fsdevel@vger.kernel.org Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-01-23 00:22:20 -05:00
Paul Moore	0851854972	fs: rework getname_kernel to handle up to PATH_MAX sized filenames In preparation for expanded use in the kernel, make getname_kernel() more useful by allowing it to handle any legal filename length. Thanks to Guenter Roeck for his suggestion to substitute memcpy() for strlcpy(). CC: linux@roeck-us.net CC: viro@zeniv.linux.org.uk CC: linux-fsdevel@vger.kernel.org Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-01-23 00:22:20 -05:00
Al Viro	fa14a0b8d2	cut down the number of do_path_lookup() callers ... and don't bother with new struct filename when we already have one Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-01-23 00:22:19 -05:00
Linus Torvalds	603ba7e41b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile #2 from Al Viro: "Next pile (and there'll be one or two more). The large piece in this one is getting rid of /proc//ns/ weirdness; among other things, it allows to (finally) make nameidata completely opaque outside of fs/namei.c, making for easier further cleanups in there" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: coda_venus_readdir(): use file_inode() fs/namei.c: fold link_path_walk() call into path_init() path_init(): don't bother with LOOKUP_PARENT in argument fs/namei.c: new helper (path_cleanup()) path_init(): store the "base" pointer to file in nameidata itself make default ->i_fop have ->open() fail with ENXIO make nameidata completely opaque outside of fs/namei.c kill proc_ns completely take the targets of /proc//ns/ symlinks to separate fs bury struct proc_ns in fs/proc copy address of proc_ns_ops into ns_common new helpers: ns_alloc_inum/ns_free_inum make proc_ns_operations work with struct ns_common * instead of void * switch the rest of proc_ns_operations to working with &...->ns netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns common object embedded into various struct ....ns	2014-12-16 15:53:03 -08:00
David Drysdale	51f39a1f0c	syscalls: implement execveat() system call This patchset adds execveat(2) for x86, and is derived from Meredydd Luff's patch from Sept 2012 (https://lkml.org/lkml/2012/9/11/528). The primary aim of adding an execveat syscall is to allow an implementation of fexecve(3) that does not rely on the /proc filesystem, at least for executables (rather than scripts). The current glibc version of fexecve(3) is implemented via /proc, which causes problems in sandboxed or otherwise restricted environments. Given the desire for a /proc-free fexecve() implementation, HPA suggested (https://lkml.org/lkml/2006/7/11/556) that an execveat(2) syscall would be an appropriate generalization. Also, having a new syscall means that it can take a flags argument without back-compatibility concerns. The current implementation just defines the AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW flags, but other flags could be added in future -- for example, flags for new namespaces (as suggested at https://lkml.org/lkml/2006/7/11/474). Related history: - https://lkml.org/lkml/2006/12/27/123 is an example of someone realizing that fexecve() is likely to fail in a chroot environment. - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=514043 covered documenting the /proc requirement of fexecve(3) in its manpage, to "prevent other people from wasting their time". - https://bugzilla.redhat.com/show_bug.cgi?id=241609 described a problem where a process that did setuid() could not fexecve() because it no longer had access to /proc/self/fd; this has since been fixed. This patch (of 4): Add a new execveat(2) system call. execveat() is to execve() as openat() is to open(): it takes a file descriptor that refers to a directory, and resolves the filename relative to that. In addition, if the filename is empty and AT_EMPTY_PATH is specified, execveat() executes the file to which the file descriptor refers. This replicates the functionality of fexecve(), which is a system call in other UNIXen, but in Linux glibc it depends on opening "/proc/self/fd/<fd>" (and so relies on /proc being mounted). The filename fed to the executed program as argv[0] (or the name of the script fed to a script interpreter) will be of the form "/dev/fd/<fd>" (for an empty filename) or "/dev/fd/<fd>/<filename>", effectively reflecting how the executable was found. This does however mean that execution of a script in a /proc-less environment won't work; also, script execution via an O_CLOEXEC file descriptor fails (as the file will not be accessible after exec). Based on patches by Meredydd Luff. Signed-off-by: David Drysdale <drysdale@google.com> Cc: Meredydd Luff <meredydd@senatehouse.org> Cc: Shuah Khan <shuah.kh@samsung.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rich Felker <dalias@aerifal.cx> Cc: Christoph Hellwig <hch@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-12-13 12:42:51 -08:00

1 2 3 4 5 ...

844 Commits