linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-24 08:50:52 +07:00

Author	SHA1	Message	Date
Arnd Bergmann	f37b9c142e	security: commoncap: fix -Wstringop-overread warning commit 82e5d8cc768b0c7b03c551a9ab1f8f3f68d5f83f upstream. gcc-11 introdces a harmless warning for cap_inode_getsecurity: security/commoncap.c: In function ‘cap_inode_getsecurity’: security/commoncap.c:440:33: error: ‘memcpy’ reading 16 bytes from a region of size 0 [-Werror=stringop-overread] 440 \| memcpy(&nscap->data, &cap->data, sizeof(__le32) * 2 * VFS_CAP_U32); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The problem here is that tmpbuf is initialized to NULL, so gcc assumes it is not accessible unless it gets set by vfs_getxattr_alloc(). This is a legitimate warning as far as I can tell, but the code is correct since it correctly handles the error when that function fails. Add a separate NULL check to tell gcc about it as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: James Morris <jamorris@linux.microsoft.com> Cc: Andrey Zhizhikin <andrey.z@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-11 14:47:36 +02:00
Eric W. Biederman	5d5422a294	Revert 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities") commit 3b0c2d3eaa83da259d7726192cf55a137769012f upstream. It turns out that there are in fact userspace implementations that care and this recent change caused a regression. https://github.com/containers/buildah/issues/3071 As the motivation for the original change was future development, and the impact is existing real world code just revert this change and allow the ambiguity in v3 file caps. Cc: stable@vger.kernel.org Fixes: 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities") Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:06:27 +01:00
Eric W. Biederman	54b4e5df95	capabilities: Don't allow writing ambiguous v3 file capabilities [ Upstream commit 95ebabde382c371572297915b104e55403674e73 ] The v3 file capabilities have a uid field that records the filesystem uid of the root user of the user namespace the file capabilities are valid in. When someone is silly enough to have the same underlying uid as the root uid of multiple nested containers a v3 filesystem capability can be ambiguous. In the spirit of don't do that then, forbid writing a v3 filesystem capability if it is ambiguous. Fixes: `8db6c34f1d` ("Introduce v3 namespaced file capabilities") Reviewed-by: Andrew G. Morgan <morgan@kernel.org> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-04 11:37:52 +01:00
Miklos Szeredi	02dee03d48	cap: fix conversions on getxattr [ Upstream commit f2b00be488730522d0fb7a8a5de663febdcefe0a ] If a capability is stored on disk in v2 format cap_inode_getsecurity() will currently return in v2 format unconditionally. This is wrong: v2 cap should be equivalent to a v3 cap with zero rootid, and so the same conversions performed on it. If the rootid cannot be mapped, v3 is returned unconverted. Fix this so that both v2 and v3 return -EOVERFLOW if the rootid (or the owner of the fs user namespace in case of v2) cannot be mapped into the current user namespace. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-02-17 11:02:22 +01:00
Eric W. Biederman	56305aa9b6	exec: Compute file based creds only once Move the computation of creds from prepare_binfmt into begin_new_exec so that the creds need only be computed once. This is just code reorganization no semantic changes of any kind are made. Moving the computation is safe. I have looked through the kernel and verified none of the binfmts look at bprm->cred directly, and that there are no helpers that look at bprm->cred indirectly. Which means that it is not a problem to compute the bprm->cred later in the execution flow as it is not used until it becomes current->cred. A new function bprm_creds_from_file is added to contain the work that needs to be done. bprm_creds_from_file first computes which file bprm->executable or most likely bprm->file that the bprm->creds will be computed from. The funciton bprm_fill_uid is updated to receive the file instead of accessing bprm->file. The now unnecessary work needed to reset the bprm->cred->euid, and bprm->cred->egid is removed from brpm_fill_uid. A small comment to document that bprm_fill_uid now only deals with the work to handle suid and sgid files. The default case is already heandled by prepare_exec_creds. The function security_bprm_repopulate_creds is renamed security_bprm_creds_from_file and now is explicitly passed the file from which to compute the creds. The documentation of the bprm_creds_from_file security hook is updated to explain when the hook is called and what it needs to do. The file is passed from cap_bprm_creds_from_file into get_file_caps so that the caps are computed for the appropriate file. The now unnecessary work in cap_bprm_creds_from_file to reset the ambient capabilites has been removed. A small comment to document that the work of cap_bprm_creds_from_file is to read capabilities from the files secureity attribute and derive capabilities from the fact the user had uid 0 has been added. Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2020-05-29 22:00:54 -05:00
Eric W. Biederman	a7868323c2	exec: Add a per bprm->file version of per_clear There is a small bug in the code that recomputes parts of bprm->cred for every bprm->file. The code never recomputes the part of clear_dangerous_personality_flags it is responsible for. Which means that in practice if someone creates a sgid script the interpreter will not be able to use any of: READ_IMPLIES_EXEC ADDR_NO_RANDOMIZE ADDR_COMPAT_LAYOUT MMAP_PAGE_ZERO. This accentially clearing of personality flags probably does not matter in practice because no one has complained but it does make the code more difficult to understand. Further remaining bug compatible prevents the recomputation from being removed and replaced by simply computing bprm->cred once from the final bprm->file. Making this change removes the last behavior difference between computing bprm->creds from the final file and recomputing bprm->cred several times. Which allows this behavior change to be justified for it's own reasons, and for any but hunts looking into why the behavior changed to wind up here instead of in the code that will follow that computes bprm->cred from the final bprm->file. This small logic bug appears to have existed since the code started clearing dangerous personality bits. History Tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Fixes: 1bb0fa189c6a ("[PATCH] NX: clean up legacy binary support") Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2020-05-29 21:06:48 -05:00
Eric W. Biederman	e32f887901	Merge commit `a4ae32c71f` ("exec: Always set cap_ambient in cap_bprm_set_creds") This is a bug fix and one of two places where I have found that the result of calling security_bprm_repopulate_creds more than once on different bprm->files depends on all of the bprm->files not just the file bprm->file. I intend to fix both of those cases and then modify the code to only call security_bprm_repopulate_creds on the final bprm file. So merge this change in so I hopefully reduce conflicts for others and I make it possible to build on top of this change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2020-05-27 22:37:33 -05:00
Eric W. Biederman	a4ae32c71f	exec: Always set cap_ambient in cap_bprm_set_creds An invariant of cap_bprm_set_creds is that every field in the new cred structure that cap_bprm_set_creds might set, needs to be set every time to ensure the fields does not get a stale value. The field cap_ambient is not set every time cap_bprm_set_creds is called, which means that if there is a suid or sgid script with an interpreter that has neither the suid nor the sgid bits set the interpreter should be able to accept ambient credentials. Unfortuantely because cap_ambient is not reset to it's original value the interpreter can not accept ambient credentials. Given that the ambient capability set is expected to be controlled by the caller, I don't think this is particularly serious. But it is definitely worth fixing so the code works correctly. I have tested to verify my reading of the code is correct and the interpreter of a sgid can receive ambient capabilities with this change and cannot receive ambient capabilities without this change. Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@kernel.org> Fixes: `58319057b7` ("capabilities: ambient capabilities") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2020-05-26 13:11:00 -05:00
Eric W. Biederman	112b714759	exec: Convert security_bprm_set_creds into security_bprm_repopulate_creds Rename bprm->cap_elevated to bprm->active_secureexec and initialize it in prepare_binprm instead of in cap_bprm_set_creds. Initializing bprm->active_secureexec in prepare_binprm allows multiple implementations of security_bprm_repopulate_creds to play nicely with each other. Rename security_bprm_set_creds to security_bprm_reopulate_creds to emphasize that this path recomputes part of bprm->cred. This recomputation avoids the time of check vs time of use problems that are inherent in unix #! interpreters. In short two renames and a move in the location of initializing bprm->active_secureexec. Link: https://lkml.kernel.org/r/87o8qkzrxp.fsf_-_@x220.int.ebiederm.org Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2020-05-21 10:16:50 -05:00
Linus Torvalds	9d22167f34	Merge branch 'next-lsm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull capabilities update from James Morris: "Minor fixes for capabilities: - Update the commoncap.c code to utilize XATTR_SECURITY_PREFIX_LEN, from Carmeli tamir. - Make the capability hooks static, from Yue Haibing" * 'next-lsm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: security/commoncap: Use xattr security prefix len security: Make capability_hooks static	2019-07-09 12:24:21 -07:00
Carmeli Tamir	c5eaab1d13	security/commoncap: Use xattr security prefix len Using the existing defined XATTR_SECURITY_PREFIX_LEN instead of sizeof(XATTR_SECURITY_PREFIX) - 1. Pretty simple cleanup. Signed-off-by: Carmeli Tamir <carmeli.tamir@gmail.com> Signed-off-by: James Morris <jamorris@linux.microsoft.com>	2019-07-07 14:55:54 +12:00
YueHaibing	d1c5947ec6	security: Make capability_hooks static Fix sparse warning: security/commoncap.c:1347:27: warning: symbol 'capability_hooks' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: James Morris <jamorris@linux.microsoft.com>	2019-06-11 14:05:16 -07:00
Thomas Gleixner	2874c5fd28	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:26:32 -07:00
Linus Torvalds	be37f21a08	audit/stable-5.1 PR 20190305 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAlx+8ZgUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOlDhAAiGlirQ9syyG2fYzaARZZ2QoU/GGD PSAeiNmP3jvJzXArCvugRCw+YSNDdQOBM3SrLQC+cM0MAIDRYXN0NdcrsbTchlMA 51Fx1egZ9Fyj+Ehgida3muh2lRUy7DQwMCL6tAVqwz7vYkSTGDUf+MlYqOqXDka5 74pEExOS3Jdi7560BsE8b6QoW9JIJqEJnirXGkG9o2qC0oFHCR6PKxIyQ7TJrLR1 F23aFTqLTH1nbPUQjnox2PTf13iQVh4j2gwzd+9c9KBfxoGSge3dmxId7BJHy2aG M27fPdCYTNZAGWpPVujsCPAh1WPQ9NQqg3mA9+g14PEbiLqPcqU+kWmnDU7T7bEw Qx0kt6Y8GiknwCqq8pDbKYclgRmOjSGdfutzd0z8uDpbaeunS4/NqnDb/FUaDVcr jA4d6ep7qEgHpYbL8KgOeZCexfaTfz6mcwRWNq3Uu9cLZbZqSSQ7PXolMADHvoRs LS7VH2jcP7q4p4GWmdfjv67xyUUo9HG5HHX74h5pLfQSYXiBWo4ht0UOAzX/6EcE CJNHAFHv+OanI5Rg/6JQ8b3/bJYxzAJVyLZpCuMtlKk6lYBGNeADk9BezEDIYsm8 tSe4/GqqyR9+Qz8rSdpAZ0KKkfqS535IcHUPUJau7Bzg1xqSEP5gzZN6QsjdXg0+ 5wFFfdFICTfJFXo= =57/1 -----END PGP SIGNATURE----- Merge tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "A lucky 13 audit patches for v5.1. Despite the rather large diffstat, most of the changes are from two bug fix patches that move code from one Kconfig option to another. Beyond that bit of churn, the remaining changes are largely cleanups and bug-fixes as we slowly march towards container auditing. It isn't all boring though, we do have a couple of new things: file capabilities v3 support, and expanded support for filtering on filesystems to solve problems with remote filesystems. All changes pass the audit-testsuite. Please merge for v5.1" * tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: mark expected switch fall-through audit: hide auditsc_get_stamp and audit_serial prototypes audit: join tty records to their syscall audit: remove audit_context when CONFIG_ AUDIT and not AUDITSYSCALL audit: remove unused actx param from audit_rule_match audit: ignore fcaps on umount audit: clean up AUDITSYSCALL prototypes and stubs audit: more filter PATH records keyed on filesystem magic audit: add support for fcaps v3 audit: move loginuid and sessionid from CONFIG_AUDITSYSCALL to CONFIG_AUDIT audit: add syscall information to CONFIG_CHANGE records audit: hand taken context to audit_kill_trees for syscall logging audit: give a clue what CONFIG_CHANGE op was involved	2019-03-07 12:20:11 -08:00
Micah Morton	e88ed488af	LSM: Update function documentation for cap_capable This should have gone in with commit `c1a85a00ea`. Signed-off-by: Micah Morton <mortonm@chromium.org> Signed-off-by: James Morris <james.morris@microsoft.com>	2019-02-25 15:16:25 -08:00
Richard Guy Briggs	2fec30e245	audit: add support for fcaps v3 V3 namespaced file capabilities were introduced in commit `8db6c34f1d` ("Introduce v3 namespaced file capabilities") Add support for these by adding the "frootid" field to the existing fcaps fields in the NAME and BPRM_FCAPS records. Please see github issue https://github.com/linux-audit/audit-kernel/issues/103 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Serge Hallyn <serge@hallyn.com> [PM: comment tweak to fit an 80 char line width] Signed-off-by: Paul Moore <paul@paul-moore.com>	2019-01-25 13:31:23 -05:00
Micah Morton	c1a85a00ea	LSM: generalize flag passing to security_capable This patch provides a general mechanism for passing flags to the security_capable LSM hook. It replaces the specific 'audit' flag that is used to tell security_capable whether it should log an audit message for the given capability check. The reason for generalizing this flag passing is so we can add an additional flag that signifies whether security_capable is being called by a setid syscall (which is needed by the proposed SafeSetID LSM). Signed-off-by: Micah Morton <mortonm@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.morris@microsoft.com>	2019-01-10 14:16:06 -08:00
Kees Cook	d117a154e6	capability: Initialize as LSM_ORDER_FIRST This converts capabilities to use the new LSM_ORDER_FIRST position. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>	2019-01-08 13:18:44 -08:00
Paul Gortmaker	876979c930	security: audit and remove any unnecessary uses of module.h Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in removing such instances is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h might have been the implicit source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each instance for the presence of either and replace as needed. Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: John Johansen <john.johansen@canonical.com> Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: linux-security-module@vger.kernel.org Cc: linux-integrity@vger.kernel.org Cc: keyrings@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: James Morris <james.morris@microsoft.com>	2018-12-12 14:58:51 -08:00
James Morris	e42f6f9be4	Linux 4.19-rc2 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAluMWCMeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGZjoH+wbEQcuzTDQGRkFG zP8VTf4tWkRc9cBAaoWfcO7vvnpFshUj4Ebdk2Nhd4NXELXQmuJBTva25tN92wzu B4cUSnTWaSkGnJHLS0V+Z1zTCPXlUgmi0mcg+dQkoHiMPTb/HVaFocHObhQdvGpx 3mJisvm/MvsrT8HX92BhSe6N6rqJ4kVdRxd5TQtVTrrJtKjbWIfkfQoNiJPeapTV 14J+c6sGyVlUUnOQ5NcV/MBxvChn+AoyB0mp22L7t1IPVwv6Spz7ZPAzzyQ/WUB/ qAAvClc3cifE8KdQj4MkxThnAfuC3Q5Ifu2+EtXmBRPFFnj/nO4gJDkADj6WRxKT 7SJE5RI= =P+RK -----END PGP SIGNATURE----- Merge tag 'v4.19-rc2' into next-general Sync to Linux 4.19-rc2 for downstream developers.	2018-09-04 11:35:54 -07:00
Christian Brauner	4408e300a6	security/capabilities: remove check for -EINVAL bprm_caps_from_vfs_caps() never returned -EINVAL so remove the rc == -EINVAL check. Signed-off-by: Christian Brauner <christian@brauner.io> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: James Morris <james.morris@microsoft.com>	2018-08-29 09:05:28 -07:00
Eddie.Horng	355139a8db	cap_inode_getsecurity: use d_find_any_alias() instead of d_find_alias() The code in cap_inode_getsecurity(), introduced by commit `8db6c34f1d` ("Introduce v3 namespaced file capabilities"), should use d_find_any_alias() instead of d_find_alias() do handle unhashed dentry correctly. This is needed, for example, if execveat() is called with an open but unlinked overlayfs file, because overlayfs unhashes dentry on unlink. This is a regression of real life application, first reported at https://www.spinics.net/lists/linux-unionfs/msg05363.html Below reproducer and setup can reproduce the case. const char* exec="echo"; const char newargv[] = { "echo", "hello", NULL}; const char newenviron[] = { NULL }; int fd, err; fd = open(exec, O_PATH); unlink(exec); err = syscall(322/SYS_execveat/, fd, "", newargv, newenviron, AT_EMPTY_PATH); if(err<0) fprintf(stderr, "execveat: %s\n", strerror(errno)); gcc compile into ~/test/a.out mount -t overlay -orw,lowerdir=/mnt/l,upperdir=/mnt/u,workdir=/mnt/w none /mnt/m cd /mnt/m cp /bin/echo . ~/test/a.out Expected result: hello Actually result: execveat: Invalid argument dmesg: Invalid argument reading file caps for /dev/fd/3 The 2nd reproducer and setup emulates similar case but for regular filesystem: const char* exec="echo"; int fd, err; char buf[256]; fd = open(exec, O_RDONLY); unlink(exec); err = fgetxattr(fd, "security.capability", buf, 256); if(err<0) fprintf(stderr, "fgetxattr: %s\n", strerror(errno)); gcc compile into ~/test_fgetxattr cd /tmp cp /bin/echo . ~/test_fgetxattr Result: fgetxattr: Invalid argument On regular filesystem, for example, ext4 read xattr from disk and return to execveat(), will not trigger this issue, however, the overlay attr handler pass real dentry to vfs_getxattr() will. This reproducer calls fgetxattr() with an unlinked fd, involkes vfs_getxattr() then reproduced the case that d_find_alias() in cap_inode_getsecurity() can't find the unlinked dentry. Suggested-by: Amir Goldstein <amir73il@gmail.com> Acked-by: Amir Goldstein <amir73il@gmail.com> Acked-by: Serge E. Hallyn <serge@hallyn.com> Fixes: `8db6c34f1d` ("Introduce v3 namespaced file capabilities") Cc: <stable@vger.kernel.org> # v4.14 Signed-off-by: Eddie Horng <eddie.horng@mediatek.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2018-08-11 02:05:53 -05:00
Eric W. Biederman	b1d749c5c3	capabilities: Allow privileged user in s_user_ns to set security.* xattrs A privileged user in s_user_ns will generally have the ability to manipulate the backing store and insert security.* xattrs into the filesystem directly. Therefore the kernel must be prepared to handle these xattrs from unprivileged mounts, and it makes little sense for commoncap to prevent writing these xattrs to the filesystem. The capability and LSM code have already been updated to appropriately handle xattrs from unprivileged mounts, so it is safe to loosen this restriction on setting xattrs. The exception to this logic is that writing xattrs to a mounted filesystem may also cause the LSM inode_post_setxattr or inode_setsecurity callbacks to be invoked. SELinux will deny the xattr update by virtue of applying mountpoint labeling to unprivileged userns mounts, and Smack will deny the writes for any user without global CAP_MAC_ADMIN, so loosening the capability check in commoncap is safe in this respect as well. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge@hallyn.com> Acked-by: Christian Brauner <christian@brauner.io> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2018-05-24 12:03:31 -05:00
Tetsuo Handa	1f5781725d	commoncap: Handle memory allocation failure. syzbot is reporting NULL pointer dereference at xattr_getsecurity() [1], for cap_inode_getsecurity() is returning sizeof(struct vfs_cap_data) when memory allocation failed. Return -ENOMEM if memory allocation failed. [1] https://syzkaller.appspot.com/bug?id=a55ba438506fe68649a5f50d2d82d56b365e0107 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: `8db6c34f1d` ("Introduce v3 namespaced file capabilities") Reported-by: syzbot <syzbot+9369930ca44f29e60e2d@syzkaller.appspotmail.com> Cc: stable <stable@vger.kernel.org> # 4.14+ Acked-by: Serge E. Hallyn <serge@hallyn.com> Acked-by: James Morris <james.morris@microsoft.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2018-04-10 19:17:41 -05:00
Eric Biggers	dc32b5c3e6	capabilities: fix buffer overread on very short xattr If userspace attempted to set a "security.capability" xattr shorter than 4 bytes (e.g. 'setfattr -n security.capability -v x file'), then cap_convert_nscap() read past the end of the buffer containing the xattr value because it accessed the ->magic_etc field without verifying that the xattr value is long enough to contain that field. Fix it by validating the xattr value size first. This bug was found using syzkaller with KASAN. The KASAN report was as follows (cleaned up slightly): BUG: KASAN: slab-out-of-bounds in cap_convert_nscap+0x514/0x630 security/commoncap.c:498 Read of size 4 at addr ffff88002d8741c0 by task syz-executor1/2852 CPU: 0 PID: 2852 Comm: syz-executor1 Not tainted 4.15.0-rc6-00200-gcc0aac99d977 #253 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0xe3/0x195 lib/dump_stack.c:53 print_address_description+0x73/0x260 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x235/0x350 mm/kasan/report.c:409 cap_convert_nscap+0x514/0x630 security/commoncap.c:498 setxattr+0x2bd/0x350 fs/xattr.c:446 path_setxattr+0x168/0x1b0 fs/xattr.c:472 SYSC_setxattr fs/xattr.c:487 [inline] SyS_setxattr+0x36/0x50 fs/xattr.c:483 entry_SYSCALL_64_fastpath+0x18/0x85 Fixes: `8db6c34f1d` ("Introduce v3 namespaced file capabilities") Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2018-01-02 20:49:13 +11:00
Linus Torvalds	55b3a0cb5a	Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull general security subsystem updates from James Morris: "TPM (from Jarkko): - essential clean up for tpm_crb so that ARM64 and x86 versions do not distract each other as much as before - /dev/tpm0 rejects now too short writes (shorter buffer than specified in the command header - use DMA-safe buffer in tpm_tis_spi - otherwise mostly minor fixes. Smack: - base support for overlafs Capabilities: - BPRM_FCAPS fixes, from Richard Guy Briggs: The audit subsystem is adding a BPRM_FCAPS record when auditing setuid application execution (SYSCALL execve). This is not expected as it was supposed to be limited to when the file system actually had capabilities in an extended attribute. It lists all capabilities making the event really ugly to parse what is happening. The PATH record correctly records the setuid bit and owner. Suppress the BPRM_FCAPS record on setid. TOMOYO: - Y2038 timestamping fixes" 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (28 commits) MAINTAINERS: update the IMA, EVM, trusted-keys, encrypted-keys entries Smack: Base support for overlayfs MAINTAINERS: remove David Safford as maintainer for encrypted+trusted keys tomoyo: fix timestamping for y2038 capabilities: audit log other surprising conditions capabilities: fix logic for effective root or real root capabilities: invert logic for clarity capabilities: remove a layer of conditional logic capabilities: move audit log decision to function capabilities: use intuitive names for id changes capabilities: use root_priveleged inline to clarify logic capabilities: rename has_cap to has_fcap capabilities: intuitive names for cap gain status capabilities: factor out cap_bprm_set_creds privileged root tpm, tpm_tis: use ARRAY_SIZE() to define TPM_HID_USR_IDX tpm: fix duplicate inline declaration specifier tpm: fix type of a local variables in tpm_tis_spi.c tpm: fix type of a local variable in tpm2_map_command() tpm: fix type of a local variable in tpm2_get_cc_attrs_tbl() tpm-dev-common: Reject too short writes ...	2017-11-13 10:30:44 -08:00
Richard Guy Briggs	dbbbe1105e	capabilities: audit log other surprising conditions The existing condition tested for process effective capabilities set by file attributes but intended to ignore the change if the result was unsurprisingly an effective full set in the case root is special with a setuid root executable file and we are root. Stated again: - When you execute a setuid root application, it is no surprise and expected that it got all capabilities, so we do not want capabilities recorded. if (pE_grew && !(pE_fullset && (eff_root \|\| real_root) && root_priveleged) ) Now make sure we cover other cases: - If something prevented a setuid root app getting all capabilities and it wound up with one capability only, then it is a surprise and should be logged. When it is a setuid root file, we only want capabilities when the process does not get full capabilities.. root_priveleged && setuid_root && !pE_fullset - Similarly if a non-setuid program does pick up capabilities due to file system based capabilities, then we want to know what capabilities were picked up. When it has file system based capabilities we want the capabilities. !is_setuid && (has_fcap && pP_gained) - If it is a non-setuid file and it gets ambient capabilities, we want the capabilities. !is_setuid && pA_gained - These last two are combined into one due to the common first parameter. Related: https://github.com/linux-audit/audit-kernel/issues/16 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-10-20 15:22:46 +11:00
Richard Guy Briggs	588fb2c7e2	capabilities: fix logic for effective root or real root Now that the logic is inverted, it is much easier to see that both real root and effective root conditions had to be met to avoid printing the BPRM_FCAPS record with audit syscalls. This meant that any setuid root applications would print a full BPRM_FCAPS record when it wasn't necessary, cluttering the event output, since the SYSCALL and PATH records indicated the presence of the setuid bit and effective root user id. Require only one of effective root or real root to avoid printing the unnecessary record. Ref: commit `3fc689e96c` ("Add audit_log_bprm_fcaps/AUDIT_BPRM_FCAPS") See: https://github.com/linux-audit/audit-kernel/issues/16 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-10-20 15:22:45 +11:00
Richard Guy Briggs	c0d1adefe0	capabilities: invert logic for clarity The way the logic was presented, it was awkward to read and verify. Invert the logic using DeMorgan's Law to be more easily able to read and understand. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Okay-ished-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-10-20 15:22:45 +11:00
Richard Guy Briggs	02ebbaf48c	capabilities: remove a layer of conditional logic Remove a layer of conditional logic to make the use of conditions easier to read and analyse. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Okay-ished-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-10-20 15:22:45 +11:00
Richard Guy Briggs	9fbc2c7964	capabilities: move audit log decision to function Move the audit log decision logic to its own function to isolate the complexity in one place. Suggested-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Okay-ished-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-10-20 15:22:44 +11:00
Richard Guy Briggs	81a6a01299	capabilities: use intuitive names for id changes Introduce a number of inlines to make the use of the negation of uid_eq() easier to read and analyse. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Okay-ished-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-10-20 15:22:44 +11:00
Richard Guy Briggs	9304b46c91	capabilities: use root_priveleged inline to clarify logic Introduce inline root_privileged() to make use of SECURE_NONROOT easier to read. Suggested-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Okay-ished-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-10-20 15:22:44 +11:00
Richard Guy Briggs	fc7eadf768	capabilities: rename has_cap to has_fcap Rename has_cap to has_fcap to clarify it applies to file capabilities since the entire source file is about capabilities. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Okay-ished-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-10-20 15:22:44 +11:00
Richard Guy Briggs	4c7e715fc8	capabilities: intuitive names for cap gain status Introduce macros cap_gained, cap_grew, cap_full to make the use of the negation of is_subset() easier to read and analyse. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Okay-ished-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-10-20 15:22:43 +11:00
Richard Guy Briggs	db1a8922cf	capabilities: factor out cap_bprm_set_creds privileged root Factor out the case of privileged root from the function cap_bprm_set_creds() to make the latter easier to read and analyse. Suggested-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Okay-ished-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-10-20 15:22:43 +11:00
Colin Ian King	76ba89c76f	commoncap: move assignment of fs_ns to avoid null pointer dereference The pointer fs_ns is assigned from inode->i_ib->s_user_ns before a null pointer check on inode, hence if inode is actually null we will get a null pointer dereference on this assignment. Fix this by only dereferencing inode after the null pointer check on inode. Detected by CoverityScan CID#1455328 ("Dereference before null check") Fixes: `8db6c34f1d` ("Introduce v3 namespaced file capabilities") Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: stable@vger.kernel.org Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-10-19 13:09:33 +11:00
Linus Torvalds	a302824782	Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull misc security layer update from James Morris: "This is the remaining 'general' change in the security tree for v4.14, following the direct merging of SELinux (+ TOMOYO), AppArmor, and seccomp. That's everything now for the security tree except IMA, which will follow shortly (I've been traveling for the past week with patchy internet)" * 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: security: fix description of values returned by cap_inode_need_killpriv	2017-09-24 11:40:41 -07:00
Stefan Berger	ab5348c9c2	security: fix description of values returned by cap_inode_need_killpriv cap_inode_need_killpriv returns 1 if security.capability exists and has a value and inode_killpriv() is required, 0 otherwise. Fix the description of the return value to reflect this. Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-09-23 21:15:41 -07:00
Linus Torvalds	dd198ce714	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace updates from Eric Biederman: "Life has been busy and I have not gotten half as much done this round as I would have liked. I delayed it so that a minor conflict resolution with the mips tree could spend a little time in linux-next before I sent this pull request. This includes two long delayed user namespace changes from Kirill Tkhai. It also includes a very useful change from Serge Hallyn that allows the security capability attribute to be used inside of user namespaces. The practical effect of this is people can now untar tarballs and install rpms in user namespaces. It had been suggested to generalize this and encode some of the namespace information information in the xattr name. Upon close inspection that makes the things that should be hard easy and the things that should be easy more expensive. Then there is my bugfix/cleanup for signal injection that removes the magic encoding of the siginfo union member from the kernel internal si_code. The mips folks reported the case where I had used FPE_FIXME me is impossible so I have remove FPE_FIXME from mips, while at the same time including a return statement in that case to keep gcc from complaining about unitialized variables. I almost finished the work to get make copy_siginfo_to_user a trivial copy to user. The code is available at: git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git neuter-copy_siginfo_to_user-v3 But I did not have time/energy to get the code posted and reviewed before the merge window opened. I was able to see that the security excuse for just copying fields that we know are initialized doesn't work in practice there are buggy initializations that don't initialize the proper fields in siginfo. So we still sometimes copy unitialized data to userspace" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: Introduce v3 namespaced file capabilities mips/signal: In force_fcr31_sig return in the impossible case signal: Remove kernel interal si_code magic fcntl: Don't use ambiguous SIG_POLL si_codes prctl: Allow local CAP_SYS_ADMIN changing exe_file security: Use user_namespace::level to avoid redundant iterations in cap_capable() userns,pidns: Verify the userns for new pid namespaces signal/testing: Don't look for __SI_FAULT in userspace signal/mips: Document a conflict with SI_USER with SIGFPE signal/sparc: Document a conflict with SI_USER with SIGFPE signal/ia64: Document a conflict with SI_USER with SIGFPE signal/alpha: Document a conflict with SI_USER for SIGTRAP	2017-09-11 18:34:47 -07:00
Serge E. Hallyn	8db6c34f1d	Introduce v3 namespaced file capabilities Root in a non-initial user ns cannot be trusted to write a traditional security.capability xattr. If it were allowed to do so, then any unprivileged user on the host could map his own uid to root in a private namespace, write the xattr, and execute the file with privilege on the host. However supporting file capabilities in a user namespace is very desirable. Not doing so means that any programs designed to run with limited privilege must continue to support other methods of gaining and dropping privilege. For instance a program installer must detect whether file capabilities can be assigned, and assign them if so but set setuid-root otherwise. The program in turn must know how to drop partial capabilities, and do so only if setuid-root. This patch introduces v3 of the security.capability xattr. It builds a vfs_ns_cap_data struct by appending a uid_t rootid to struct vfs_cap_data. This is the absolute uid_t (that is, the uid_t in user namespace which mounted the filesystem, usually init_user_ns) of the root id in whose namespaces the file capabilities may take effect. When a task asks to write a v2 security.capability xattr, if it is privileged with respect to the userns which mounted the filesystem, then nothing should change. Otherwise, the kernel will transparently rewrite the xattr as a v3 with the appropriate rootid. This is done during the execution of setxattr() to catch user-space-initiated capability writes. Subsequently, any task executing the file which has the noted kuid as its root uid, or which is in a descendent user_ns of such a user_ns, will run the file with capabilities. Similarly when asking to read file capabilities, a v3 capability will be presented as v2 if it applies to the caller's namespace. If a task writes a v3 security.capability, then it can provide a uid for the xattr so long as the uid is valid in its own user namespace, and it is privileged with CAP_SETFCAP over its namespace. The kernel will translate that rootid to an absolute uid, and write that to disk. After this, a task in the writer's namespace will not be able to use those capabilities (unless rootid was 0), but a task in a namespace where the given uid is root will. Only a single security.capability xattr may exist at a time for a given file. A task may overwrite an existing xattr so long as it is privileged over the inode. Note this is a departure from previous semantics, which required privilege to remove a security.capability xattr. This check can be re-added if deemed useful. This allows a simple setxattr to work, allows tar/untar to work, and allows us to tar in one namespace and untar in another while preserving the capability, without risking leaking privilege into a parent namespace. Example using tar: $ cp /bin/sleep sleepx $ mkdir b1 b2 $ lxc-usernsexec -m b:0:100000:1 -m b:1:$(id -u):1 -- chown 0:0 b1 $ lxc-usernsexec -m b:0:100001:1 -m b:1:$(id -u):1 -- chown 0:0 b2 $ lxc-usernsexec -m b:0:100000:1000 -- tar --xattrs-include=security.capability --xattrs -cf b1/sleepx.tar sleepx $ lxc-usernsexec -m b:0:100001:1000 -- tar --xattrs-include=security.capability --xattrs -C b2 -xf b1/sleepx.tar $ lxc-usernsexec -m b:0:100001:1000 -- getcap b2/sleepx b2/sleepx = cap_sys_admin+ep # /opt/ltp/testcases/bin/getv3xattr b2/sleepx v3 xattr, rootid is 100001 A patch to linux-test-project adding a new set of tests for this functionality is in the nsfscaps branch at github.com/hallyn/ltp Changelog: Nov 02 2016: fix invalid check at refuse_fcap_overwrite() Nov 07 2016: convert rootid from and to fs user_ns (From ebiederm: mar 28 2017) commoncap.c: fix typos - s/v4/v3 get_vfs_caps_from_disk: clarify the fs_ns root access check nsfscaps: change the code split for cap_inode_setxattr() Apr 09 2017: don't return v3 cap for caps owned by current root. return a v2 cap for a true v2 cap in non-init ns Apr 18 2017: . Change the flow of fscap writing to support s_user_ns writing. . Remove refuse_fcap_overwrite(). The value of the previous xattr doesn't matter. Apr 24 2017: . incorporate Eric's incremental diff . move cap_convert_nscap to setxattr and simplify its usage May 8, 2017: . fix leaking dentry refcount in cap_inode_getsecurity Signed-off-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2017-09-01 14:57:15 -05:00
Kees Cook	ee67ae7ef6	commoncap: Move cap_elevated calculation into bprm_set_creds Instead of a separate function, open-code the cap_elevated test, which lets us entirely remove bprm->cap_effective (to use the local "effective" variable instead), and more accurately examine euid/egid changes via the existing local "is_setid". The following LTP tests were run to validate the changes: # ./runltp -f syscalls -s cap # ./runltp -f securebits # ./runltp -f cap_bounds # ./runltp -f filecaps All kernel selftests for capabilities and exec continue to pass as well. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Andy Lutomirski <luto@kernel.org>	2017-08-01 12:03:09 -07:00
Kees Cook	46d98eb4e1	commoncap: Refactor to remove bprm_secureexec hook The commoncap implementation of the bprm_secureexec hook is the only LSM that depends on the final call to its bprm_set_creds hook (since it may be called for multiple files, it ignores bprm->called_set_creds). As a result, it cannot safely _clear_ bprm->secureexec since other LSMs may have set it. Instead, remove the bprm_secureexec hook by introducing a new flag to bprm specific to commoncap: cap_elevated. This is similar to cap_effective, but that is used for a specific subset of elevated privileges, and exists solely to track state from bprm_set_creds to bprm_secureexec. As such, it will be removed in the next patch. Here, set the new bprm->cap_elevated flag when setuid/setgid has happened from bprm_fill_uid() or fscapabilities have been prepared. This temporarily moves the bprm_secureexec hook to a static inline. The helper will be removed in the next patch; this makes the step easier to review and bisect, since this does not introduce any changes to inputs nor outputs to the "elevated privileges" calculation. The new flag is merged with the bprm->secureexec flag in setup_new_exec() since this marks the end of any further prepare_binprm() calls. Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andy Lutomirski <luto@kernel.org> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge@hallyn.com>	2017-08-01 12:03:08 -07:00
Kirill Tkhai	64db4c7f4c	security: Use user_namespace::level to avoid redundant iterations in cap_capable() When ns->level is not larger then cred->user_ns->level, then ns can't be cred->user_ns's descendant, and there is no a sense to search in parents. So, break the cycle earlier and skip needless iterations. v2: Change comment on suggested by Andy Lutomirski. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2017-07-20 07:46:06 -05:00
James Morris	ca97d939db	security: mark LSM hooks as __ro_after_init Mark all of the registration hooks as __ro_after_init (via the __lsm_ro_after_init macro). Signed-off-by: James Morris <james.l.morris@oracle.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Kees Cook <keescook@chromium.org>	2017-03-06 11:00:15 +11:00
Linus Torvalds	f1ef09fde1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace updates from Eric Biederman: "There is a lot here. A lot of these changes result in subtle user visible differences in kernel behavior. I don't expect anything will care but I will revert/fix things immediately if any regressions show up. From Seth Forshee there is a continuation of the work to make the vfs ready for unpriviled mounts. We had thought the previous changes prevented the creation of files outside of s_user_ns of a filesystem, but it turns we missed the O_CREAT path. Ooops. Pavel Tikhomirov and Oleg Nesterov worked together to fix a long standing bug in the implemenation of PR_SET_CHILD_SUBREAPER where only children that are forked after the prctl are considered and not children forked before the prctl. The only known user of this prctl systemd forks all children after the prctl. So no userspace regressions will occur. Holding earlier forked children to the same rules as later forked children creates a semantic that is sane enough to allow checkpoing of processes that use this feature. There is a long delayed change by Nikolay Borisov to limit inotify instances inside a user namespace. Michael Kerrisk extends the API for files used to maniuplate namespaces with two new trivial ioctls to allow discovery of the hierachy and properties of namespaces. Konstantin Khlebnikov with the help of Al Viro adds code that when a network namespace exits purges it's sysctl entries from the dcache. As in some circumstances this could use a lot of memory. Vivek Goyal fixed a bug with stacked filesystems where the permissions on the wrong inode were being checked. I continue previous work on ptracing across exec. Allowing a file to be setuid across exec while being ptraced if the tracer has enough credentials in the user namespace, and if the process has CAP_SETUID in it's own namespace. Proc files for setuid or otherwise undumpable executables are now owned by the root in the user namespace of their mm. Allowing debugging of setuid applications in containers to work better. A bug I introduced with permission checking and automount is now fixed. The big change is to mark the mounts that the kernel initiates as a result of an automount. This allows the permission checks in sget to be safely suppressed for this kind of mount. As the permission check happened when the original filesystem was mounted. Finally a special case in the mount namespace is removed preventing unbounded chains in the mount hash table, and making the semantics simpler which benefits CRIU. The vfs fix along with related work in ima and evm I believe makes us ready to finish developing and merge fully unprivileged mounts of the fuse filesystem. The cleanups of the mount namespace makes discussing how to fix the worst case complexity of umount. The stacked filesystem fixes pave the way for adding multiple mappings for the filesystem uids so that efficient and safer containers can be implemented" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: proc/sysctl: Don't grab i_lock under sysctl_lock. vfs: Use upper filesystem inode in bprm_fill_uid() proc/sysctl: prune stale dentries during unregistering mnt: Tuck mounts under others instead of creating shadow/side mounts. prctl: propagate has_child_subreaper flag to every descendant introduce the walk_process_tree() helper nsfs: Add an ioctl() to return owner UID of a userns fs: Better permission checking for submounts exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction vfs: open() with O_CREAT should not create inodes with unknown ids nsfs: Add an ioctl() to return the namespace type proc: Better ownership of files for non-dumpable tasks in user namespaces exec: Remove LSM_UNSAFE_PTRACE_CAP exec: Test the ptracer's saved cred to see if the tracee can gain caps exec: Don't reset euid and egid when the tracee has CAP_SETUID inotify: Convert to using per-namespace limits	2017-02-23 20:33:51 -08:00
Eric W. Biederman	9227dd2a84	exec: Remove LSM_UNSAFE_PTRACE_CAP With previous changes every location that tests for LSM_UNSAFE_PTRACE_CAP also tests for LSM_UNSAFE_PTRACE making the LSM_UNSAFE_PTRACE_CAP redundant, so remove it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2017-01-24 12:03:08 +13:00
Eric W. Biederman	20523132ec	exec: Test the ptracer's saved cred to see if the tracee can gain caps Now that we have user namespaces and non-global capabilities verify the tracer has capabilities in the relevant user namespace instead of in the current_user_ns(). As the test for setting LSM_UNSAFE_PTRACE_CAP is currently ptracer_capable(p, current_user_ns()) and the new task credentials are in current_user_ns() this change does not have any user visible change and simply moves the test to where it is used, making the code easier to read. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2017-01-24 12:03:08 +13:00
Eric W. Biederman	70169420f5	exec: Don't reset euid and egid when the tracee has CAP_SETUID Don't reset euid and egid when the tracee has CAP_SETUID in it's user namespace. I punted on relaxing this permission check long ago but now that I have read this code closely it is clear it is safe to test against CAP_SETUID in the user namespace. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2017-01-24 12:03:07 +13:00
Casey Schaufler	d69dece5f5	LSM: Add /sys/kernel/security/lsm I am still tired of having to find indirect ways to determine what security modules are active on a system. I have added /sys/kernel/security/lsm, which contains a comma separated list of the active security modules. No more groping around in /proc/filesystems or other clever hacks. Unchanged from previous versions except for being updated to the latest security next branch. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2017-01-19 13:18:29 +11:00

1 2 3 4

162 Commits