linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-23 23:20:49 +07:00

Author	SHA1	Message	Date
THOBY Simon	7b2e93207c	IMA: remove the dependency on CRYPTO_MD5 commit 8510505d55e194d3f6c9644c9f9d12c4f6b0395a upstream. MD5 is a weak digest algorithm that shouldn't be used for cryptographic operation. It hinders the efficiency of a patch set that aims to limit the digests allowed for the extended file attribute namely security.ima. MD5 is no longer a requirement for IMA, nor should it be used there. The sole place where we still use the MD5 algorithm inside IMA is setting the ima_hash algorithm to MD5, if the user supplies 'ima_hash=md5' parameter on the command line. With commit `ab60368ab6` ("ima: Fallback to the builtin hash algorithm"), setting "ima_hash=md5" fails gracefully when CRYPTO_MD5 is not set: ima: Can not allocate md5 (reason: -2) ima: Allocating md5 failed, going to use default hash algorithm sha256 Remove the CRYPTO_MD5 dependency for IMA. Signed-off-by: THOBY Simon <Simon.THOBY@viveris.fr> Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> [zohar@linux.ibm.com: include commit number in patch description for stable.] Cc: stable@vger.kernel.org # 4.17 Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-07-05 19:13:55 +02:00
Austin Kim	6dfbee5d38	IMA: remove -Wmissing-prototypes warning commit a32ad90426a9c8eb3915eed26e08ce133bd9e0da upstream. With W=1 build, the compiler throws warning message as below: security/integrity/ima/ima_mok.c:24:12: warning: no previous prototype for ‘ima_mok_init’ [-Wmissing-prototypes] __init int ima_mok_init(void) Silence the warning by adding static keyword to ima_mok_init(). Signed-off-by: Austin Kim <austin.kim@lge.com> Fixes: `41c89b64d7` ("IMA: create machine owner and blacklist keyrings") Cc: stable@vger.kernel.org Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-07-05 19:13:54 +02:00
Daniel Borkmann	25af1ec4c5	bpf: Add lockdown check for probe_write_user helper commit 51e1bb9eeaf7868db56e58f47848e364ab4c4129 upstream. Back then, commit `96ae522795` ("bpf: Add bpf_probe_write_user BPF helper to be called in tracers") added the bpf_probe_write_user() helper in order to allow to override user space memory. Its original goal was to have a facility to "debug, divert, and manipulate execution of semi-cooperative processes" under CAP_SYS_ADMIN. Write to kernel was explicitly disallowed since it would otherwise tamper with its integrity. One use case was shown in `cf9b1199de` ("samples/bpf: Add test/example of using bpf_probe_write_user bpf helper") where the program DNATs traffic at the time of connect(2) syscall, meaning, it rewrites the arguments to a syscall while they're still in userspace, and before the syscall has a chance to copy the argument into kernel space. These days we have better mechanisms in BPF for achieving the same (e.g. for load-balancers), but without having to write to userspace memory. Of course the bpf_probe_write_user() helper can also be used to abuse many other things for both good or bad purpose. Outside of BPF, there is a similar mechanism for ptrace(2) such as PTRACE_PEEK{TEXT,DATA} and PTRACE_POKE{TEXT,DATA}, but would likely require some more effort. Commit `96ae522795` explicitly dedicated the helper for experimentation purpose only. Thus, move the helper's availability behind a newly added LOCKDOWN_BPF_WRITE_USER lockdown knob so that the helper is disabled under the "integrity" mode. More fine-grained control can be implemented also from LSM side with this change. Fixes: `96ae522795` ("bpf: Add bpf_probe_write_user BPF helper to be called in tracers") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-07-05 18:53:10 +02:00
Xiu Jianfeng	05af1b5cdd	selinux: correct the return value when loads initial sids commit 4c156084daa8ee70978e4b150b5eb5fc7b1f15be upstream. It should not return 0 when SID 0 is assigned to isids. This patch fixes it. Cc: stable@vger.kernel.org Fixes: `e3e0b582c3` ("selinux: remove unused initial SIDs and improve handling") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> [PM: remove changelog from description] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-07-05 18:52:30 +02:00
AuxXxilium	5fa3ea047a	init: add dsm gpl source Signed-off-by: AuxXxilium <info@auxxxilium.tech>	2024-07-05 18:00:04 +02:00
Tetsuo Handa	3780348c1a	smackfs: restrict bytes count in smk_set_cipso() commit 49ec114a6e62d8d320037ce71c1aaf9650b3cafd upstream. Oops, I failed to update subject line. From 07571157c91b98ce1a4aa70967531e64b78e8346 Mon Sep 17 00:00:00 2001 From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Date: Mon, 12 Apr 2021 22:25:06 +0900 Subject: smackfs: restrict bytes count in smk_set_cipso() Commit 7ef4c19d245f3dc2 ("smackfs: restrict bytes count in smackfs write functions") missed that count > SMK_CIPSOMAX check applies to only format == SMK_FIXED24_FMT case. Reported-by: syzbot <syzbot+77c53db50c9fff774e8e@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-07-19 09:45:03 +02:00
Minchan Kim	f38371821c	selinux: use __GFP_NOWARN with GFP_NOWAIT in the AVC [ Upstream commit 648f2c6100cfa18e7dfe43bc0b9c3b73560d623c ] In the field, we have seen lots of allocation failure from the call path below. 06-03 13:29:12.999 1010315 31557 31557 W Binder : 31542_2: page allocation failure: order:0, mode:0x800(GFP_NOWAIT), nodemask=(null),cpuset=background,mems_allowed=0 ... ... 06-03 13:29:12.999 1010315 31557 31557 W Call trace: 06-03 13:29:12.999 1010315 31557 31557 W : dump_backtrace.cfi_jt+0x0/0x8 06-03 13:29:12.999 1010315 31557 31557 W : dump_stack+0xc8/0x14c 06-03 13:29:12.999 1010315 31557 31557 W : warn_alloc+0x158/0x1c8 06-03 13:29:12.999 1010315 31557 31557 W : __alloc_pages_slowpath+0x9d8/0xb80 06-03 13:29:12.999 1010315 31557 31557 W : __alloc_pages_nodemask+0x1c4/0x430 06-03 13:29:12.999 1010315 31557 31557 W : allocate_slab+0xb4/0x390 06-03 13:29:12.999 1010315 31557 31557 W : ___slab_alloc+0x12c/0x3a4 06-03 13:29:12.999 1010315 31557 31557 W : kmem_cache_alloc+0x358/0x5e4 06-03 13:29:12.999 1010315 31557 31557 W : avc_alloc_node+0x30/0x184 06-03 13:29:12.999 1010315 31557 31557 W : avc_update_node+0x54/0x4f0 06-03 13:29:12.999 1010315 31557 31557 W : avc_has_extended_perms+0x1a4/0x460 06-03 13:29:12.999 1010315 31557 31557 W : selinux_file_ioctl+0x320/0x3d0 06-03 13:29:12.999 1010315 31557 31557 W : __arm64_sys_ioctl+0xec/0x1fc 06-03 13:29:12.999 1010315 31557 31557 W : el0_svc_common+0xc0/0x24c 06-03 13:29:12.999 1010315 31557 31557 W : el0_svc+0x28/0x88 06-03 13:29:12.999 1010315 31557 31557 W : el0_sync_handler+0x8c/0xf0 06-03 13:29:12.999 1010315 31557 31557 W : el0_sync+0x1a4/0x1c0 .. .. 06-03 13:29:12.999 1010315 31557 31557 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:12.999 1010315 31557 31557 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:12.999 1010315 31557 31557 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:12.999 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:12.999 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:12.999 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:12.999 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:12.999 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:12.999 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:12.999 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:12.999 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:12.999 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:13.000 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:13.000 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:13.000 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:13.000 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:13.000 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:13.000 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:13.000 1010161 10686 10686 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:13.000 1010161 10686 10686 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:13.000 1010161 10686 10686 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:13.000 10230 30892 30892 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:13.000 10230 30892 30892 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 06-03 13:29:13.000 10230 30892 30892 W node 0 : slabs: 57, objs: 2907, free: 0 06-03 13:29:13.000 10230 30892 30892 W SLUB : Unable to allocate memory on node -1, gfp=0x900(GFP_NOWAIT\|__GFP_ZERO) 06-03 13:29:13.000 10230 30892 30892 W cache : avc_node, object size: 72, buffer size: 80, default order: 0, min order: 0 Based on [1], selinux is tolerate for failure of memory allocation. Then, use __GFP_NOWARN together. [1] `476accbe2f` ("selinux: use GFP_NOWAIT in the AVC kmem_caches") Signed-off-by: Minchan Kim <minchan@kernel.org> [PM: subj fix, line wraps, normalized commit refs] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-07-19 09:44:49 +02:00
Mimi Zohar	912d16a2d7	evm: fix writing <securityfs>/evm overflow [ Upstream commit 49219d9b8785ba712575c40e48ce0f7461254626 ] EVM_SETUP_COMPLETE is defined as 0x80000000, which is larger than INT_MAX. The "-fno-strict-overflow" compiler option properly prevents signaling EVM that the EVM policy setup is complete. Define and read an unsigned int. Fixes: `f00d797507` ("EVM: Allow userspace to signal an RSA key has been loaded") Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-07-14 16:56:04 +02:00
Roberto Sassu	53124265fc	evm: Refuse EVM_ALLOW_METADATA_WRITES only if an HMAC key is loaded commit 9acc89d31f0c94c8e573ed61f3e4340bbd526d0c upstream. EVM_ALLOW_METADATA_WRITES is an EVM initialization flag that can be set to temporarily disable metadata verification until all xattrs/attrs necessary to verify an EVM portable signature are copied to the file. This flag is cleared when EVM is initialized with an HMAC key, to avoid that the HMAC is calculated on unverified xattrs/attrs. Currently EVM unnecessarily denies setting this flag if EVM is initialized with a public key, which is not a concern as it cannot be used to trust xattrs/attrs updates. This patch removes this limitation. Fixes: `ae1ba1676b` ("EVM: Allow userland to permit modification of EVM-protected metadata") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Cc: stable@vger.kernel.org # 4.16.x Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-07-14 16:55:46 +02:00
Roberto Sassu	7b84c7d7e2	evm: Execute evm_inode_init_security() only when an HMAC key is loaded commit 9eea2904292c2d8fa98df141d3bf7c41ec9dc1b5 upstream. evm_inode_init_security() requires an HMAC key to calculate the HMAC on initial xattrs provided by LSMs. However, it checks generically whether a key has been loaded, including also public keys, which is not correct as public keys are not suitable to calculate the HMAC. Originally, support for signature verification was introduced to verify a possibly immutable initial ram disk, when no new files are created, and to switch to HMAC for the root filesystem. By that time, an HMAC key should have been loaded and usable to calculate HMACs for new files. More recently support for requiring an HMAC key was removed from the kernel, so that signature verification can be used alone. Since this is a legitimate use case, evm_inode_init_security() should not return an error when no HMAC key has been loaded. This patch fixes this problem by replacing the evm_key_loaded() check with a check of the EVM_INIT_HMAC flag in evm_initialized. Fixes: `26ddabfe96` ("evm: enable EVM when X509 certificate is loaded") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Cc: stable@vger.kernel.org # 4.5.x Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-07-14 16:55:46 +02:00
Eric Snowberg	1573d595e2	integrity: Load mokx variables into the blacklist keyring [ Upstream commit ebd9c2ae369a45bdd9f8615484db09be58fc242b ] During boot the Secure Boot Forbidden Signature Database, dbx, is loaded into the blacklist keyring. Systems booted with shim have an equivalent Forbidden Signature Database called mokx. Currently mokx is only used by shim and grub, the contents are ignored by the kernel. Add the ability to load mokx into the blacklist keyring during boot. Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Suggested-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> cc: keyrings@vger.kernel.org Link: https://lore.kernel.org/r/c33c8e3839a41e9654f41cc92c7231104931b1d7.camel@HansenPartnership.com/ Link: https://lore.kernel.org/r/20210122181054.32635-5-eric.snowberg@oracle.com/ # v5 Link: https://lore.kernel.org/r/161428674320.677100.12637282414018170743.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/161433313205.902181.2502803393898221637.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161529607422.163428.13530426573612578854.stgit@warthog.procyon.org.uk/ # v3 Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-30 08:47:30 -04:00
Eric Snowberg	45109066f6	certs: Add EFI_CERT_X509_GUID support for dbx entries [ Upstream commit 56c5812623f95313f6a46fbf0beee7fa17c68bbf ] This fixes CVE-2020-26541. The Secure Boot Forbidden Signature Database, dbx, contains a list of now revoked signatures and keys previously approved to boot with UEFI Secure Boot enabled. The dbx is capable of containing any number of EFI_CERT_X509_SHA256_GUID, EFI_CERT_SHA256_GUID, and EFI_CERT_X509_GUID entries. Currently when EFI_CERT_X509_GUID are contained in the dbx, the entries are skipped. Add support for EFI_CERT_X509_GUID dbx entries. When a EFI_CERT_X509_GUID is found, it is added as an asymmetrical key to the .blacklist keyring. Anytime the .platform keyring is used, the keys in the .blacklist keyring are referenced, if a matching key is found, the key will be rejected. [DH: Made the following changes: - Added to have a config option to enable the facility. This allows a Kconfig solution to make sure that pkcs7_validate_trust() is enabled.[1][2] - Moved the functions out from the middle of the blacklist functions. - Added kerneldoc comments.] Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> cc: Randy Dunlap <rdunlap@infradead.org> cc: Mickaël Salaün <mic@digikod.net> cc: Arnd Bergmann <arnd@kernel.org> cc: keyrings@vger.kernel.org Link: https://lore.kernel.org/r/20200901165143.10295-1-eric.snowberg@oracle.com/ # rfc Link: https://lore.kernel.org/r/20200909172736.73003-1-eric.snowberg@oracle.com/ # v2 Link: https://lore.kernel.org/r/20200911182230.62266-1-eric.snowberg@oracle.com/ # v3 Link: https://lore.kernel.org/r/20200916004927.64276-1-eric.snowberg@oracle.com/ # v4 Link: https://lore.kernel.org/r/20210122181054.32635-2-eric.snowberg@oracle.com/ # v5 Link: https://lore.kernel.org/r/161428672051.677100.11064981943343605138.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/161433310942.902181.4901864302675874242.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161529605075.163428.14625520893961300757.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/bc2c24e3-ed68-2521-0bf4-a1f6be4a895d@infradead.org/ [1] Link: https://lore.kernel.org/r/20210225125638.1841436-1-arnd@kernel.org/ [2] Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-30 08:47:30 -04:00
Colin Ian King	31c9a4b24d	KEYS: trusted: Fix memory leak on object td commit 83a775d5f9bfda95b1c295f95a3a041a40c7f321 upstream. Two error return paths are neglecting to free allocated object td, causing a memory leak. Fix this by returning via the error return path that securely kfree's td. Fixes clang scan-build warning: security/keys/trusted-keys/trusted_tpm1.c:496:10: warning: Potential memory leak [unix.Malloc] Cc: stable@vger.kernel.org Fixes: 5df16caada3f ("KEYS: trusted: Fix incorrect handling of tpm_get_random()") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-19 10:12:50 +02:00
Li Huafei	6b4b3b8404	ima: Fix the error code for restoring the PCR value [ Upstream commit 7990ccafaa37dc6d8bb095d4d7cd997e8903fd10 ] In ima_restore_measurement_list(), hdr[HDR_PCR].data is pointing to a buffer of type u8, which contains the dumped 32-bit pcr value. Currently, only the least significant byte is used to restore the pcr value. We should convert hdr[HDR_PCR].data to a pointer of type u32 before fetching the value to restore the correct pcr value. Fixes: `47fdee60b4` ("ima: use ima_parse_buf() to parse measurements headers") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-05-14 09:50:30 +02:00
James Bottomley	09a119a2d4	security: keys: trusted: fix TPM2 authorizations [ Upstream commit de66514d934d70ce73c302ce0644b54970fc7196 ] In TPM 1.2 an authorization was a 20 byte number. The spec actually recommended you to hash variable length passwords and use the sha1 hash as the authorization. Because the spec doesn't require this hashing, the current authorization for trusted keys is a 40 digit hex number. For TPM 2.0 the spec allows the passing in of variable length passwords and passphrases directly, so we should allow that in trusted keys for ease of use. Update the 'blobauth' parameter to take this into account, so we can now use plain text passwords for the keys. so before keyctl add trusted kmk "new 32 blobauth=f572d396fae9206628714fb2ce00f72e94f2258fkeyhandle=81000001" @u after we will accept both the old hex sha1 form as well as a new directly supplied password: keyctl add trusted kmk "new 32 blobauth=hello keyhandle=81000001" @u Since a sha1 hex code must be exactly 40 bytes long and a direct password must be 20 or less, we use the length as the discriminator for which form is input. Note this is both and enhancement and a potential bug fix. The TPM 2.0 spec requires us to strip leading zeros, meaning empyty authorization is a zero length HMAC whereas we're currently passing in 20 bytes of zeros. A lot of TPMs simply accept this as OK, but the Microsoft TPM emulator rejects it with TPM_RC_BAD_AUTH, so this patch makes the Microsoft TPM emulator work with trusted keys. Fixes: `0fe5480303` ("keys, trusted: seal/unseal with TPM 2.0 chips") Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-05-14 09:50:20 +02:00
Paul Moore	4c0ddc8712	selinux: add proper NULL termination to the secclass_map permissions commit e4c82eafb609c2badc56f4e11bc50fcf44b8e9eb upstream. This patch adds the missing NULL termination to the "bpf" and "perf_event" object class permission lists. This missing NULL termination should really only affect the tools under scripts/selinux, with the most important being genheaders.c, although in practice this has not been an issue on any of my dev/test systems. If the problem were to manifest itself it would likely result in bogus permissions added to the end of the object class; thankfully with no access control checks using these bogus permissions and no policies defining these permissions the impact would likely be limited to some noise about undefined permissions during policy load. Cc: stable@vger.kernel.org Fixes: `ec27c3568a` ("selinux: bpf: Add selinux check for eBPF syscall operations") Fixes: `da97e18458` ("perf_event: Add support for LSM and SELinux checks") Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-14 09:49:59 +02:00
Arnd Bergmann	f37b9c142e	security: commoncap: fix -Wstringop-overread warning commit 82e5d8cc768b0c7b03c551a9ab1f8f3f68d5f83f upstream. gcc-11 introdces a harmless warning for cap_inode_getsecurity: security/commoncap.c: In function ‘cap_inode_getsecurity’: security/commoncap.c:440:33: error: ‘memcpy’ reading 16 bytes from a region of size 0 [-Werror=stringop-overread] 440 \| memcpy(&nscap->data, &cap->data, sizeof(__le32) * 2 * VFS_CAP_U32); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The problem here is that tmpbuf is initialized to NULL, so gcc assumes it is not accessible unless it gets set by vfs_getxattr_alloc(). This is a legitimate warning as far as I can tell, but the code is correct since it correctly handles the error when that function fails. Add a separate NULL check to tell gcc about it as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: James Morris <jamorris@linux.microsoft.com> Cc: Andrey Zhizhikin <andrey.z@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-11 14:47:36 +02:00
James Bottomley	bf84ef2dd2	KEYS: trusted: Fix TPM reservation for seal/unseal [ Upstream commit 9d5171eab462a63e2fbebfccf6026e92be018f20 ] The original patch 8c657a0590de ("KEYS: trusted: Reserve TPM for seal and unseal operations") was correct on the mailing list: https://lore.kernel.org/linux-integrity/20210128235621.127925-4-jarkko@kernel.org/ But somehow got rebased so that the tpm_try_get_ops() in tpm2_seal_trusted() got lost. This causes an imbalanced put of the TPM ops and causes oopses on TIS based hardware. This fix puts back the lost tpm_try_get_ops() Fixes: 8c657a0590de ("KEYS: trusted: Reserve TPM for seal and unseal operations") Reported-by: Mimi Zohar <zohar@linux.ibm.com> Acked-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-04-28 13:39:59 +02:00
Ondrej Mosnacek	a28124e8ad	selinux: fix race between old and new sidtab commit 9ad6e9cb39c66366bf7b9aece114aca277981a1f upstream. Since commit `1b8b31a2e6` ("selinux: convert policy read-write lock to RCU"), there is a small window during policy load where the new policy pointer has already been installed, but some threads may still be holding the old policy pointer in their read-side RCU critical sections. This means that there may be conflicting attempts to add a new SID entry to both tables via sidtab_context_to_sid(). See also (and the rest of the thread): https://lore.kernel.org/selinux/CAFqZXNvfux46_f8gnvVvRYMKoes24nwm2n3sPbMjrB8vKTW00g@mail.gmail.com/ Fix this by installing the new policy pointer under the old sidtab's spinlock along with marking the old sidtab as "frozen". Then, if an attempt to add new entry to a "frozen" sidtab is detected, make sidtab_context_to_sid() return -ESTALE to indicate that a new policy has been installed and that the caller will have to abort the policy transaction and try again after re-taking the policy pointer (which is guaranteed to be a newer policy). This requires adding a retry-on-ESTALE logic to all callers of sidtab_context_to_sid(), but fortunately these are easy to determine and aren't that many. This seems to be the simplest solution for this problem, even if it looks somewhat ugly. Note that other places in the kernel (e.g. do_mknodat() in fs/namei.c) use similar stale-retry patterns, so I think it's reasonable. Cc: stable@vger.kernel.org Fixes: `1b8b31a2e6` ("selinux: convert policy read-write lock to RCU") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:57 +02:00
Ondrej Mosnacek	fd75d73aa2	selinux: fix cond_list corruption when changing booleans commit d8f5f0ea5b86300390b026b6c6e7836b7150814a upstream. Currently, duplicate_policydb_cond_list() first copies the whole conditional avtab and then tries to link to the correct entries in cond_dup_av_list() using avtab_search(). However, since the conditional avtab may contain multiple entries with the same key, this approach often fails to find the right entry, potentially leading to wrong rules being activated/deactivated when booleans are changed. To fix this, instead start with an empty conditional avtab and add the individual entries one-by-one while building the new av_lists. This approach leads to the correct result, since each entry is present in the av_lists exactly once. The issue can be reproduced with Fedora policy as follows: # sesearch -s ftpd_t -t public_content_rw_t -c dir -p create -A allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True allow ftpd_t public_content_rw_t:dir { add_name create link remove_name rename reparent rmdir setattr unlink watch watch_reads write }; [ ftpd_anon_write ]:True # setsebool ftpd_anon_write=off ftpd_connect_all_unreserved=off ftpd_connect_db=off ftpd_full_access=off On fixed kernels, the sesearch output is the same after the setsebool command: # sesearch -s ftpd_t -t public_content_rw_t -c dir -p create -A allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True allow ftpd_t public_content_rw_t:dir { add_name create link remove_name rename reparent rmdir setattr unlink watch watch_reads write }; [ ftpd_anon_write ]:True While on the broken kernels, it will be different: # sesearch -s ftpd_t -t public_content_rw_t -c dir -p create -A allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True While there, also simplify the computation of nslots. This changes the nslots values for nrules 2 or 3 to just two slots instead of 4, which makes the sequence more consistent. Cc: stable@vger.kernel.org Fixes: `c7c556f1e8` ("selinux: refactor changing booleans") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:57 +02:00
Ondrej Mosnacek	4f29b08e23	selinux: make nslot handling in avtab more robust commit 442dc00f82a9727dc0c48c44f792c168f593c6df upstream. 1. Make sure all fileds are initialized in avtab_init(). 2. Slightly refactor avtab_alloc() to use the above fact. 3. Use h->nslot == 0 as a sentinel in the access functions to prevent dereferencing h->htable when it's not allocated. Cc: stable@vger.kernel.org Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-14 08:41:57 +02:00
Mimi Zohar	546f7fcc45	integrity: double check iint_cache was initialized commit 92063f3ca73aab794bd5408d3361fd5b5ea33079 upstream. The kernel may be built with multiple LSMs, but only a subset may be enabled on the boot command line by specifying "lsm=". Not including "integrity" on the ordered LSM list may result in a NULL deref. As reported by Dmitry Vyukov: in qemu: qemu-system-x86_64 -enable-kvm -machine q35,nvdimm -cpu max,migratable=off -smp 4 -m 4G,slots=4,maxmem=16G -hda wheezy.img -kernel arch/x86/boot/bzImage -nographic -vga std -soundhw all -usb -usbdevice tablet -bt hci -bt device:keyboard -net user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic,model=virtio-net-pci -object memory-backend-file,id=pmem1,share=off,mem-path=/dev/zero,size=64M -device nvdimm,id=nvdimm1,memdev=pmem1 -append "console=ttyS0 root=/dev/sda earlyprintk=serial rodata=n oops=panic panic_on_warn=1 panic=86400 lsm=smack numa=fake=2 nopcid dummy_hcd.num=8" -pidfile vm_pid -m 2G -cpu host But it crashes on NULL deref in integrity_inode_get during boot: Run /sbin/init as init process BUG: kernel NULL pointer dereference, address: 000000000000001c PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc2+ #97 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-44-g88ab0c15525c-prebuilt.qemu.org 04/01/2014 RIP: 0010:kmem_cache_alloc+0x2b/0x370 mm/slub.c:2920 Code: 57 41 56 41 55 41 54 41 89 f4 55 48 89 fd 53 48 83 ec 10 44 8b 3d d9 1f 90 0b 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <8b> 5f 1c 4cf RSP: 0000:ffffc9000032f9d8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888017fc4f00 RCX: 0000000000000000 RDX: ffff888040220000 RSI: 0000000000000c40 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: ffff888019263627 R10: ffffffff83937cd1 R11: 0000000000000000 R12: 0000000000000c40 R13: ffff888019263538 R14: 0000000000000000 R15: 0000000000ffffff FS: 0000000000000000(0000) GS:ffff88802d180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000001c CR3: 000000000b48e000 CR4: 0000000000750ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: integrity_inode_get+0x47/0x260 security/integrity/iint.c:105 process_measurement+0x33d/0x17e0 security/integrity/ima/ima_main.c:237 ima_bprm_check+0xde/0x210 security/integrity/ima/ima_main.c:474 security_bprm_check+0x7d/0xa0 security/security.c:845 search_binary_handler fs/exec.c:1708 [inline] exec_binprm fs/exec.c:1761 [inline] bprm_execve fs/exec.c:1830 [inline] bprm_execve+0x764/0x19a0 fs/exec.c:1792 kernel_execve+0x370/0x460 fs/exec.c:1973 try_to_run_init_process+0x14/0x4e init/main.c:1366 kernel_init+0x11d/0x1b8 init/main.c:1477 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 Modules linked in: CR2: 000000000000001c ---[ end trace 22d601a500de7d79 ]--- Since LSMs and IMA may be configured at build time, but not enabled at run time, panic the system if "integrity" was not initialized before use. Reported-by: Dmitry Vyukov <dvyukov@google.com> Fixes: `79f7865d84` ("LSM: Introduce "lsm=" for boottime LSM selection") Cc: stable@vger.kernel.org Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-30 14:31:55 +02:00
Ondrej Mosnacek	19c9967e49	selinux: fix variable scope issue in live sidtab conversion commit 6406887a12ee5dcdaffff1a8508d91113d545559 upstream. Commit `02a52c5c8c` ("selinux: move policy commit after updating selinuxfs") moved the selinux_policy_commit() call out of security_load_policy() into sel_write_load(), which caused a subtle yet rather serious bug. The problem is that security_load_policy() passes a reference to the convert_params local variable to sidtab_convert(), which stores it in the sidtab, where it may be accessed until the policy is swapped over and RCU synchronized. Before `02a52c5c8c`, selinux_policy_commit() was called directly from security_load_policy(), so the convert_params pointer remained valid all the way until the old sidtab was destroyed, but now that's no longer the case and calls to sidtab_context_to_sid() on the old sidtab after security_load_policy() returns may cause invalid memory accesses. This can be easily triggered using the stress test from commit `ee1a84fdfe` ("selinux: overhaul sidtab to fix bug and improve performance"): ``` function rand_cat() { echo $(( $RANDOM % 1024 )) } function do_work() { while true; do echo -n "system_u:system_r:kernel_t:s0:c$(rand_cat),c$(rand_cat)" \ >/sys/fs/selinux/context 2>/dev/null \|\| true done } do_work >/dev/null & do_work >/dev/null & do_work >/dev/null & while load_policy; do echo -n .; sleep 0.1; done kill %1 kill %2 kill %3 ``` Fix this by allocating the temporary sidtab convert structures dynamically and passing them among the selinux_policy_{load,cancel,commit} functions. Fixes: `02a52c5c8c` ("selinux: move policy commit after updating selinuxfs") Cc: stable@vger.kernel.org Tested-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> [PM: merge fuzz in security.h and services.c] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-30 14:31:53 +02:00
Ondrej Mosnacek	9731e08a33	selinux: don't log MAC_POLICY_LOAD record on failed policy load commit 519dad3bcd809dc1523bf80ab0310ddb3bf00ade upstream. If sel_make_policy_nodes() fails, we should jump to 'out', not 'out1', as the latter would incorrectly log an MAC_POLICY_LOAD audit record, even though the policy hasn't actually been reloaded. The 'out1' jump label now becomes unused and can be removed. Fixes: `02a52c5c8c` ("selinux: move policy commit after updating selinuxfs") Cc: stable@vger.kernel.org Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-30 14:31:53 +02:00
Eric W. Biederman	5d5422a294	Revert 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities") commit 3b0c2d3eaa83da259d7726192cf55a137769012f upstream. It turns out that there are in fact userspace implementations that care and this recent change caused a regression. https://github.com/containers/buildah/issues/3071 As the motivation for the original change was future development, and the impact is existing real world code just revert this change and allow the ambiguity in v3 file caps. Cc: stable@vger.kernel.org Fixes: 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities") Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:06:27 +01:00
Tetsuo Handa	aa40f5e33c	tomoyo: recognize kernel threads correctly commit 9c83465f3245c2faa82ffeb7016f40f02bfaa0ad upstream. Commit `db68ce10c4` ("new helper: uaccess_kernel()") replaced segment_eq(get_fs(), KERNEL_DS) with uaccess_kernel(). But the correct method for tomoyo to check whether current is a kernel thread in order to assume that kernel threads are privileged for socket operations was (current->flags & PF_KTHREAD). Now that uaccess_kernel() became 0 on x86, tomoyo has to fix this problem. Do like commit 942cb357ae7d9249 ("Smack: Handle io_uring kernel thread privileges") does. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-09 11:11:15 +01:00
Tetsuo Handa	e00420943a	tomoyo: ignore data race while checking quota commit 5797e861e402fff2bedce4ec8b7c89f4248b6073 upstream. syzbot is reporting that tomoyo's quota check is racy [1]. But this check is tolerant of some degree of inaccuracy. Thus, teach KCSAN to ignore this data race. [1] https://syzkaller.appspot.com/bug?id=999533deec7ba6337f8aa25d8bd1a4d5f7e50476 Reported-by: syzbot <syzbot+0789a72b46fd91431bd8@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-07 12:34:05 +01:00
Sabyrzhan Tasbolatov	fa5b656092	smackfs: restrict bytes count in smackfs write functions commit 7ef4c19d245f3dc233fd4be5acea436edd1d83d8 upstream. syzbot found WARNINGs in several smackfs write operations where bytes count is passed to memdup_user_nul which exceeds GFP MAX_ORDER. Check count size if bigger than PAGE_SIZE. Per smackfs doc, smk_write_net4addr accepts any label or -CIPSO, smk_write_net6addr accepts any label or -DELETE. I couldn't find any general rule for other label lengths except SMK_LABELLEN, SMK_LONGLABEL, SMK_CIPSOMAX which are documented. Let's constrain, in general, smackfs label lengths for PAGE_SIZE. Although fuzzer crashes write to smackfs/netlabel on 0x400000 length. Here is a quick way to reproduce the WARNING: python -c "print('A' * 0x400000)" > /sys/fs/smackfs/netlabel Reported-by: syzbot+a71a442385a0b2815497@syzkaller.appspotmail.com Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-07 12:34:05 +01:00
Jarkko Sakkinen	67118bb78d	KEYS: trusted: Reserve TPM for seal and unseal operations commit 8c657a0590de585b1115847c17b34a58025f2f4b upstream. When TPM 2.0 trusted keys code was moved to the trusted keys subsystem, the operations were unwrapped from tpm_try_get_ops() and tpm_put_ops(), which are used to take temporarily the ownership of the TPM chip. The ownership is only taken inside tpm_send(), but this is not sufficient, as in the key load TPM2_CC_LOAD, TPM2_CC_UNSEAL and TPM2_FLUSH_CONTEXT need to be done as a one single atom. Take the TPM chip ownership before sending anything with tpm_try_get_ops() and tpm_put_ops(), and use tpm_transmit_cmd() to send TPM commands instead of tpm_send(), reverting back to the old behaviour. Fixes: `2e19e10131` ("KEYS: trusted: Move TPM2 trusted keys code") Reported-by: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: stable@vger.kernel.org Cc: David Howells <dhowells@redhat.com> Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: Sumit Garg <sumit.garg@linaro.org> Acked-by Sumit Garg <sumit.garg@linaro.org> Tested-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-04 11:38:29 +01:00
Jarkko Sakkinen	54c527c18e	KEYS: trusted: Fix migratable=1 failing commit 8da7520c80468c48f981f0b81fc1be6599e3b0ad upstream. Consider the following transcript: $ keyctl add trusted kmk "new 32 blobauth=helloworld keyhandle=80000000 migratable=1" @u add_key: Invalid argument The documentation has the following description: migratable= 0\|1 indicating permission to reseal to new PCR values, default 1 (resealing allowed) The consequence is that "migratable=1" should succeed. Fix this by allowing this condition to pass instead of return -EINVAL. [*] Documentation/security/keys/trusted-encrypted.rst Cc: stable@vger.kernel.org Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: David Howells <dhowells@redhat.com> Fixes: `d00a1c72f7` ("keys: add new trusted key-type") Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-04 11:38:29 +01:00
Jarkko Sakkinen	9d83cc1a1e	KEYS: trusted: Fix incorrect handling of tpm_get_random() commit 5df16caada3fba3b21cb09b85cdedf99507f4ec1 upstream. When tpm_get_random() was introduced, it defined the following API for the return value: 1. A positive value tells how many bytes of random data was generated. 2. A negative value on error. However, in the call sites the API was used incorrectly, i.e. as it would only return negative values and otherwise zero. Returning he positive read counts to the user space does not make any possible sense. Fix this by returning -EIO when tpm_get_random() returns a positive value. Fixes: `41ab999c80` ("tpm: Move tpm_get_random api into the TPM device driver") Cc: stable@vger.kernel.org Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: David Howells <dhowells@redhat.com> Cc: Kent Yoder <key@linux.vnet.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-04 11:38:29 +01:00
Amir Goldstein	2fe9215301	selinux: fix inconsistency between inode_getxattr and inode_listsecurity commit a9ffe682c58aaff643764547f5420e978b6e0830 upstream. When inode has no listxattr op of its own (e.g. squashfs) vfs_listxattr calls the LSM inode_listsecurity hooks to list the xattrs that LSMs will intercept in inode_getxattr hooks. When selinux LSM is installed but not initialized, it will list the security.selinux xattr in inode_listsecurity, but will not intercept it in inode_getxattr. This results in -ENODATA for a getxattr call for an xattr returned by listxattr. This situation was manifested as overlayfs failure to copy up lower files from squashfs when selinux is built-in but not initialized, because ovl_copy_xattr() iterates the lower inode xattrs by vfs_listxattr() and vfs_getxattr(). Match the logic of inode_listsecurity to that of inode_getxattr and do not list the security.selinux xattr if selinux is not initialized. Reported-by: Michael Labriola <michael.d.labriola@gmail.com> Tested-by: Michael Labriola <michael.d.labriola@gmail.com> Link: https://lore.kernel.org/linux-unionfs/2nv9d47zt7.fsf@aldarion.sourceruckus.org/ Fixes: `c8e222616c` ("selinux: allow reading labels before policy is loaded") Cc: stable@vger.kernel.org#v5.9+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-04 11:38:28 +01:00
David Howells	d7b0efadc3	certs: Fix blacklist flag type confusion [ Upstream commit 4993e1f9479a4161fd7d93e2b8b30b438f00cb0f ] KEY_FLAG_KEEP is not meant to be passed to keyring_alloc() or key_alloc(), as these only take KEY_ALLOC_* flags. KEY_FLAG_KEEP has the same value as KEY_ALLOC_BYPASS_RESTRICTION, but fortunately only key_create_or_update() uses it. LSMs using the key_alloc hook don't check that flag. KEY_FLAG_KEEP is then ignored but fortunately (again) the root user cannot write to the blacklist keyring, so it is not possible to remove a key/hash from it. Fix this by adding a KEY_ALLOC_SET_KEEP flag that tells key_alloc() to set KEY_FLAG_KEEP on the new key. blacklist_init() can then, correctly, pass this to keyring_alloc(). We can also use this in ima_mok_init() rather than setting the flag manually. Note that this doesn't fix an observable bug with the current implementation but it is required to allow addition of new hashes to the blacklist in the future without making it possible for them to be removed. Fixes: `734114f878` ("KEYS: Add a system blacklist keyring") Reported-by: Mickaël Salaün <mic@linux.microsoft.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Mickaël Salaün <mic@linux.microsoft.com> cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-04 11:37:59 +01:00
Gabriel Krisman Bertazi	6e223a3d90	watch_queue: Drop references to /dev/watch_queue [ Upstream commit 8fe62e0c0e2efa5437f3ee81b65d69e70a45ecd2 ] The merged API doesn't use a watch_queue device, but instead relies on pipes, so let the documentation reflect that. Fixes: `f7e47677e3` ("watch_queue: Add a key/keyring notification facility") Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Ben Boeckel <mathstuf@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-04 11:37:59 +01:00
Eric W. Biederman	54b4e5df95	capabilities: Don't allow writing ambiguous v3 file capabilities [ Upstream commit 95ebabde382c371572297915b104e55403674e73 ] The v3 file capabilities have a uid field that records the filesystem uid of the root user of the user namespace the file capabilities are valid in. When someone is silly enough to have the same underlying uid as the root uid of multiple nested containers a v3 filesystem capability can be ambiguous. In the spirit of don't do that then, forbid writing a v3 filesystem capability if it is ambiguous. Fixes: `8db6c34f1d` ("Introduce v3 namespaced file capabilities") Reviewed-by: Andrew G. Morgan <morgan@kernel.org> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-04 11:37:52 +01:00
Lakshmi Ramasubramanian	c365d333e9	ima: Free IMA measurement buffer after kexec syscall [ Upstream commit f31e3386a4e92ba6eda7328cb508462956c94c64 ] IMA allocates kernel virtual memory to carry forward the measurement list, from the current kernel to the next kernel on kexec system call, in ima_add_kexec_buffer() function. This buffer is not freed before completing the kexec system call resulting in memory leak. Add ima_buffer field in "struct kimage" to store the virtual address of the buffer allocated for the IMA measurement list. Free the memory allocated for the IMA measurement list in kimage_file_post_load_cleanup() function. Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Suggested-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Fixes: `7b8589cc29` ("ima: on soft reboot, save the measurement list") Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-04 11:37:50 +01:00
Lakshmi Ramasubramanian	1facf2415b	ima: Free IMA measurement buffer on error [ Upstream commit 6d14c6517885fa68524238787420511b87d671df ] IMA allocates kernel virtual memory to carry forward the measurement list, from the current kernel to the next kernel on kexec system call, in ima_add_kexec_buffer() function. In error code paths this memory is not freed resulting in memory leak. Free the memory allocated for the IMA measurement list in the error code paths in ima_add_kexec_buffer() function. Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Suggested-by: Tyler Hicks <tyhicks@linux.microsoft.com> Fixes: `7b8589cc29` ("ima: on soft reboot, save the measurement list") Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-04 11:37:50 +01:00
Dinghao Liu	494e9ec12c	evm: Fix memleak in init_desc [ Upstream commit ccf11dbaa07b328fa469415c362d33459c140a37 ] tmp_tfm is allocated, but not freed on subsequent kmalloc failure, which leads to a memory leak. Free tmp_tfm. Fixes: `d46eb36995` ("evm: crypto hash replaced by shash") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> [zohar@linux.ibm.com: formatted/reworded patch description] Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-04 11:37:40 +01:00
Miklos Szeredi	02dee03d48	cap: fix conversions on getxattr [ Upstream commit f2b00be488730522d0fb7a8a5de663febdcefe0a ] If a capability is stored on disk in v2 format cap_inode_getsecurity() will currently return in v2 format unconditionally. This is wrong: v2 cap should be equivalent to a v3 cap with zero rootid, and so the same conversions performed on it. If the rootid cannot be mapped, v3 is returned unconverted. Fix this so that both v2 and v3 return -EOVERFLOW if the rootid (or the owner of the fs user namespace in case of v2) cannot be mapped into the current user namespace. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-02-17 11:02:22 +01:00
Al Viro	a3fddad7af	dump_common_audit_data(): fix racy accesses to ->d_name commit d36a1dd9f77ae1e72da48f4123ed35627848507d upstream. We are not guaranteed the locking environment that would prevent dentry getting renamed right under us. And it's possible for old long name to be freed after rename, leading to UAF here. Cc: stable@kernel.org # v2.6.2+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-19 18:27:29 +01:00
Roberto Sassu	0f2206e3d9	ima: Don't modify file descriptor mode on the fly commit 207cdd565dfc95a0a5185263a567817b7ebf5467 upstream. Commit `a408e4a86b` ("ima: open a new file instance if no read permissions") already introduced a second open to measure a file when the original file descriptor does not allow it. However, it didn't remove the existing method of changing the mode of the original file descriptor, which is still necessary if the current process does not have enough privileges to open a new one. Changing the mode isn't really an option, as the filesystem might need to do preliminary steps to make the read possible. Thus, this patch removes the code and keeps the second open as the only option to measure a file when it is unreadable with the original file descriptor. Cc: <stable@vger.kernel.org> # 4.20.x: `0014cc04e8` ima: Set file->f_mode Fixes: `2fe5d6def1` ("ima: integrity appraisal extension") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-12-30 11:54:17 +01:00
Casey Schaufler	8f939abd81	Smack: Handle io_uring kernel thread privileges [ Upstream commit 942cb357ae7d9249088e3687ee6a00ed2745a0c7 ] Smack assumes that kernel threads are privileged for smackfs operations. This was necessary because the credential of the kernel thread was not related to a user operation. With io_uring the credential does reflect a user's rights and can be used. Suggested-by: Jens Axboe <axboe@kernel.dk> Acked-by: Jens Axboe <axboe@kernel.dk> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:54:02 +01:00
Paul Moore	6e5ea342fc	selinux: fix inode_doinit_with_dentry() LABEL_INVALID error handling [ Upstream commit 200ea5a2292dc444a818b096ae6a32ba3caa51b9 ] A previous fix, commit 83370b31a915 ("selinux: fix error initialization in inode_doinit_with_dentry()"), changed how failures were handled before a SELinux policy was loaded. Unfortunately that patch was potentially problematic for two reasons: it set the isec->initialized state without holding a lock, and it didn't set the inode's SELinux label to the "default" for the particular filesystem. The later can be a problem if/when a later attempt to revalidate the inode fails and SELinux reverts to the existing inode label. This patch should restore the default inode labeling that existed before the original fix, without affecting the LABEL_INVALID marking such that revalidation will still be attempted in the future. Fixes: 83370b31a915 ("selinux: fix error initialization in inode_doinit_with_dentry()") Reported-by: Sven Schnelle <svens@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:03 +01:00
Tianyue Ren	f0d7de0926	selinux: fix error initialization in inode_doinit_with_dentry() [ Upstream commit 83370b31a915493231e5b9addc72e4bef69f8d31 ] Mark the inode security label as invalid if we cannot find a dentry so that we will retry later rather than marking it initialized with the unlabeled SID. Fixes: `9287aed2ad` ("selinux: Convert isec->lock into a spinlock") Signed-off-by: Tianyue Ren <rentianyue@kylinos.cn> [PM: minor comment tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:52:58 +01:00
Linus Torvalds	30636a59f4	selinux/stable-5.10 PR 20201113 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl+vD0MUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOHZw//SbZ4c32LGxANDJrASdRWGRYzUsSm cQZrLR373TMMmawZDenvRK0Wfa8vj+BK5V5knBXXTDFq5M+TVtlzhRhED4ZH40RP CmaL+XC24ed1IPsYiZZ0lytXrDnb7EgmGaSjHP7RttBdLG/2ABamm9FNZFpiFJjV y9txgNyJ9NvrdFCO+UoLEm9BFq4x/Ddzu+BAEqu1iSGaKXufiBlq9uQ+3qommJlx Bv+i6HeqCruTj+y8p8tkR7aLma9jsbqCpBlQkG1ja4pgxMXwC2uRbNzz//dlxWZ5 pKtz8HJqi+myla/rwNZAA82uZlkFnEOp7Yy404Deu73WznMdrQ1oJqlGDrXdafKn BJwpcUQnJ7NP2MK1JrMbxdVllUKsZF2ICP0t3CFTuuyJ2CakOuaSW+ERBvbO6NYa G7h9Ll8AO4ADbG3W+/c6nybDt8AkjBLJTg7uXWO3LIjjZ/UJwJxO4YwzYbaMdg6Q 7zVNj7j36YLHhv9Hc60vhCjO53KXYo60Z6TCblkeyClifvM8Gx109JkOwPXWFGIY Bh7KbKtGr2dZoFxU2yS5oQgtbzJhsfJQGYsF8uANHcj76hQ/rh+Ff1NKKofHaRSl umQSgNbekaQjnmq08otVlIHtjdaG+or1PFW2Yep37OwsCXwKrFG4YZRY9FdpwQeG Cjx3JAJTibvjMqg= =qpR+ -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20201113' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "One small SELinux patch to make sure we return an error code when an allocation fails. It passes all of our tests, but given the nature of the patch that isn't surprising" * tag 'selinux-pr-20201113' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: Fix error return code in sel_ib_pkey_sid_slow()	2020-11-14 12:04:02 -08:00
Chen Zhou	c350f8bea2	selinux: Fix error return code in sel_ib_pkey_sid_slow() Fix to return a negative error code from the error handling case instead of 0 in function sel_ib_pkey_sid_slow(), as done elsewhere in this function. Cc: stable@vger.kernel.org Fixes: `409dcf3153` ("selinux: Add a cache for quicker retreival of PKey SIDs") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2020-11-12 20:16:09 -05:00
Gustavo A. R. Silva	4739eeafb9	ima: Replace zero-length array with flexible-array member There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2020-10-29 17:22:59 -05:00
Linus Torvalds	81ecf91eab	SafeSetID changes for v5.10 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEgvWslnM+qUy+sgVg5n2WYw6TPBAFAl+Ifu8ACgkQ5n2WYw6T PBCoxA/+Pn0XvwYa6V773lPNjon+Oa94Aq7Wl6YryDMJakiGDJFSJa0tEI8TmRkJ z21kjww2Us9gEvfmNoc0t4oDJ98UNAXERjc98fOZgxH1d1urpGUI7qdQ07YCo0xZ CDOvqXk/PobGF6p9BpF5QWqEJNq6G8xAKpA8nLa6OUPcjofHroWCgIs86Rl3CtTc DwjcOvCgUoTxFm9Vpvm04njFFkVuGUwmXuhyV3Xjh2vNhHvfpP/ibTPmmv1sx4dO 9WE8BjW0HL5VMzms/BE/mnXmbu2BdPs+PW9/RjQfebbAH8DM3Noqr9f3Db8eqp7t TiqU8AO06TEVZa011+V3aywgz9rnH+XJ17TfutB28Z7lG3s4XPZYDgzubJxb1X8M 4d2mCL3N/ao5otx6FqpgJ2oK0ZceB/voY9qyyfErEBhRumxifl7AQCHxt3LumH6m fvvNY+UcN/n7hZPJ7sgZVi/hnnwvO0e1eX0L9ZdNsDjR1bgzBQCdkY53XNxam+rM z7tmT3jlDpNtPzOzFCZeiJuTgWYMDdJFqekPLess/Vqaswzc4PPT2lyQ6N81NR5H +mzYf/PNIg5fqN8QlMQEkMTv2fnC19dHJT83NPgy4dQObpXzUqYGWAmdKcBxLpnG du8wDpPHusChRFMZKRMTXztdMvMAuNqY+KJ6bFojG0Z+qgR7oQk= =/anB -----END PGP SIGNATURE----- Merge tag 'safesetid-5.10' of git://github.com/micah-morton/linux Pull SafeSetID updates from Micah Morton: "The changes are mostly contained to within the SafeSetID LSM, with the exception of a few 1-line changes to change some ns_capable() calls to ns_capable_setid() -- causing a flag (CAP_OPT_INSETID) to be set that is examined by SafeSetID code and nothing else in the kernel. The changes to SafeSetID internally allow for setting up GID transition security policies, as already existed for UIDs" * tag 'safesetid-5.10' of git://github.com/micah-morton/linux: LSM: SafeSetID: Fix warnings reported by test bot LSM: SafeSetID: Add GID security policy handling LSM: Signal to SafeSetID when setting group IDs	2020-10-25 10:45:26 -07:00
Jens Axboe	91989c7078	task_work: cleanup notification modes A previous commit changed the notification mode from true/false to an int, allowing notify-no, notify-yes, or signal-notify. This was backwards compatible in the sense that any existing true/false user would translate to either 0 (on notification sent) or 1, the latter which mapped to TWA_RESUME. TWA_SIGNAL was assigned a value of 2. Clean this up properly, and define a proper enum for the notification mode. Now we have: - TWA_NONE. This is 0, same as before the original change, meaning no notification requested. - TWA_RESUME. This is 1, same as before the original change, meaning that we use TIF_NOTIFY_RESUME. - TWA_SIGNAL. This uses TIF_SIGPENDING/JOBCTL_TASK_WORK for the notification. Clean up all the callers, switching their 0/1/false/true to using the appropriate TWA_* mode for notifications. Fixes: `e91b481623` ("task_work: teach task_work_add() to do signal_wake_up()") Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-10-17 15:05:30 -06:00
Linus Torvalds	9ff9b0d392	networking changes for the 5.10 merge window Add redirect_neigh() BPF packet redirect helper, allowing to limit stack traversal in common container configs and improving TCP back-pressure. Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain. Expand netlink policy support and improve policy export to user space. (Ge)netlink core performs request validation according to declared policies. Expand the expressiveness of those policies (min/max length and bitmasks). Allow dumping policies for particular commands. This is used for feature discovery by user space (instead of kernel version parsing or trial and error). Support IGMPv3/MLDv2 multicast listener discovery protocols in bridge. Allow more than 255 IPv4 multicast interfaces. Add support for Type of Service (ToS) reflection in SYN/SYN-ACK packets of TCPv6. In Multi-patch TCP (MPTCP) support concurrent transmission of data on multiple subflows in a load balancing scenario. Enhance advertising addresses via the RM_ADDR/ADD_ADDR options. Support SMC-Dv2 version of SMC, which enables multi-subnet deployments. Allow more calls to same peer in RxRPC. Support two new Controller Area Network (CAN) protocols - CAN-FD and ISO 15765-2:2016. Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit kernel problem. Add TC actions for implementing MPLS L2 VPNs. Improve nexthop code - e.g. handle various corner cases when nexthop objects are removed from groups better, skip unnecessary notifications and make it easier to offload nexthops into HW by converting to a blocking notifier. Support adding and consuming TCP header options by BPF programs, opening the doors for easy experimental and deployment-specific TCP option use. Reorganize TCP congestion control (CC) initialization to simplify life of TCP CC implemented in BPF. Add support for shipping BPF programs with the kernel and loading them early on boot via the User Mode Driver mechanism, hence reusing all the user space infra we have. Support sleepable BPF programs, initially targeting LSM and tracing. Add bpf_d_path() helper for returning full path for given 'struct path'. Make bpf_tail_call compatible with bpf-to-bpf calls. Allow BPF programs to call map_update_elem on sockmaps. Add BPF Type Format (BTF) support for type and enum discovery, as well as support for using BTF within the kernel itself (current use is for pretty printing structures). Support listing and getting information about bpf_links via the bpf syscall. Enhance kernel interfaces around NIC firmware update. Allow specifying overwrite mask to control if settings etc. are reset during update; report expected max time operation may take to users; support firmware activation without machine reboot incl. limits of how much impact reset may have (e.g. dropping link or not). Extend ethtool configuration interface to report IEEE-standard counters, to limit the need for per-vendor logic in user space. Adopt or extend devlink use for debug, monitoring, fw update in many drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx, dpaa2-eth). In mlxsw expose critical and emergency SFP module temperature alarms. Refactor port buffer handling to make the defaults more suitable and support setting these values explicitly via the DCBNL interface. Add XDP support for Intel's igb driver. Support offloading TC flower classification and filtering rules to mscc_ocelot switches. Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as fixed interval period pulse generator and one-step timestamping in dpaa-eth. Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3) offload. Add Lynx PHY/PCS MDIO module, and convert various drivers which have this HW to use it. Convert mvpp2 to split PCS. Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as 7-port Mediatek MT7531 IP. Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver, and wcn3680 support in wcn36xx. Improve performance for packets which don't require much offloads on recent Mellanox NICs by 20% by making multiple packets share a descriptor entry. Move chelsio inline crypto drivers (for TLS and IPsec) from the crypto subtree to drivers/net. Move MDIO drivers out of the phy directory. Clean up a lot of W=1 warnings, reportedly the actively developed subsections of networking drivers should now build W=1 warning free. Make sure drivers don't use in_interrupt() to dynamically adapt their code. Convert tasklets to use new tasklet_setup API (sadly this conversion is not yet complete). Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl+ItRwACgkQMUZtbf5S IrtTMg//UxpdR/MirT1DatBU0K/UGAZY82hV7F/UC8tPgjfHZeHvWlDFxfi3YP81 PtPKbhRZ7DhwBXefUp6nY3UdvjftrJK2lJm8prJUPSsZRye8Wlcb7y65q7/P2y2U Efucyopg6RUrmrM0DUsIGYGJgylQLHnMYUl/keCsD4t5Bp4ksyi9R2t5eitGoWzh r3QGdbSa0AuWx4iu0i+tqp6Tj0ekMBMXLVb35dtU1t0joj2KTNEnSgABN3prOa8E iWYf2erOau68Ogp3yU3miCy0ZU4p/7qGHTtzbcp677692P/ekak6+zmfHLT9/Pjy 2Stq2z6GoKuVxdktr91D9pA3jxG4LxSJmr0TImcGnXbvkMP3Ez3g9RrpV5fn8j6F mZCH8TKZAoD5aJrAJAMkhZmLYE1pvDa7KolSk8WogXrbCnTEb5Nv8FHTS1Qnk3yl wSKXuvutFVNLMEHCnWQLtODbTST9DI/aOi6EctPpuOA/ZyL1v3pl+gfp37S+LUTe owMnT/7TdvKaTD0+gIyU53M6rAWTtr5YyRQorX9awIu/4Ha0F0gYD7BJZQUGtegp HzKt59NiSrFdbSH7UdyemdBF4LuCgIhS7rgfeoUXMXmuPHq7eHXyHZt5dzPPa/xP 81P0MAvdpFVwg8ij2yp2sHS7sISIRKq17fd1tIewUabxQbjXqPc= =bc1U -----END PGP SIGNATURE----- Merge tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: - Add redirect_neigh() BPF packet redirect helper, allowing to limit stack traversal in common container configs and improving TCP back-pressure. Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain. - Expand netlink policy support and improve policy export to user space. (Ge)netlink core performs request validation according to declared policies. Expand the expressiveness of those policies (min/max length and bitmasks). Allow dumping policies for particular commands. This is used for feature discovery by user space (instead of kernel version parsing or trial and error). - Support IGMPv3/MLDv2 multicast listener discovery protocols in bridge. - Allow more than 255 IPv4 multicast interfaces. - Add support for Type of Service (ToS) reflection in SYN/SYN-ACK packets of TCPv6. - In Multi-patch TCP (MPTCP) support concurrent transmission of data on multiple subflows in a load balancing scenario. Enhance advertising addresses via the RM_ADDR/ADD_ADDR options. - Support SMC-Dv2 version of SMC, which enables multi-subnet deployments. - Allow more calls to same peer in RxRPC. - Support two new Controller Area Network (CAN) protocols - CAN-FD and ISO 15765-2:2016. - Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit kernel problem. - Add TC actions for implementing MPLS L2 VPNs. - Improve nexthop code - e.g. handle various corner cases when nexthop objects are removed from groups better, skip unnecessary notifications and make it easier to offload nexthops into HW by converting to a blocking notifier. - Support adding and consuming TCP header options by BPF programs, opening the doors for easy experimental and deployment-specific TCP option use. - Reorganize TCP congestion control (CC) initialization to simplify life of TCP CC implemented in BPF. - Add support for shipping BPF programs with the kernel and loading them early on boot via the User Mode Driver mechanism, hence reusing all the user space infra we have. - Support sleepable BPF programs, initially targeting LSM and tracing. - Add bpf_d_path() helper for returning full path for given 'struct path'. - Make bpf_tail_call compatible with bpf-to-bpf calls. - Allow BPF programs to call map_update_elem on sockmaps. - Add BPF Type Format (BTF) support for type and enum discovery, as well as support for using BTF within the kernel itself (current use is for pretty printing structures). - Support listing and getting information about bpf_links via the bpf syscall. - Enhance kernel interfaces around NIC firmware update. Allow specifying overwrite mask to control if settings etc. are reset during update; report expected max time operation may take to users; support firmware activation without machine reboot incl. limits of how much impact reset may have (e.g. dropping link or not). - Extend ethtool configuration interface to report IEEE-standard counters, to limit the need for per-vendor logic in user space. - Adopt or extend devlink use for debug, monitoring, fw update in many drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx, dpaa2-eth). - In mlxsw expose critical and emergency SFP module temperature alarms. Refactor port buffer handling to make the defaults more suitable and support setting these values explicitly via the DCBNL interface. - Add XDP support for Intel's igb driver. - Support offloading TC flower classification and filtering rules to mscc_ocelot switches. - Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as fixed interval period pulse generator and one-step timestamping in dpaa-eth. - Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3) offload. - Add Lynx PHY/PCS MDIO module, and convert various drivers which have this HW to use it. Convert mvpp2 to split PCS. - Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as 7-port Mediatek MT7531 IP. - Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver, and wcn3680 support in wcn36xx. - Improve performance for packets which don't require much offloads on recent Mellanox NICs by 20% by making multiple packets share a descriptor entry. - Move chelsio inline crypto drivers (for TLS and IPsec) from the crypto subtree to drivers/net. Move MDIO drivers out of the phy directory. - Clean up a lot of W=1 warnings, reportedly the actively developed subsections of networking drivers should now build W=1 warning free. - Make sure drivers don't use in_interrupt() to dynamically adapt their code. Convert tasklets to use new tasklet_setup API (sadly this conversion is not yet complete). * tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2583 commits) Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH" net, sockmap: Don't call bpf_prog_put() on NULL pointer bpf, selftest: Fix flaky tcp_hdr_options test when adding addr to lo bpf, sockmap: Add locking annotations to iterator netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements net: fix pos incrementment in ipv6_route_seq_next net/smc: fix invalid return code in smcd_new_buf_create() net/smc: fix valid DMBE buffer sizes net/smc: fix use-after-free of delayed events bpfilter: Fix build error with CONFIG_BPFILTER_UMH cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info bpf: Fix register equivalence tracking. rxrpc: Fix loss of final ack on shutdown rxrpc: Fix bundle counting for exclusive connections netfilter: restore NF_INET_NUMHOOKS ibmveth: Identify ingress large send packets. ibmveth: Switch order of ibmveth_helper calls. cxgb4: handle 4-tuple PEDIT to NAT mode translation selftests: Add VRF route leaking tests ...	2020-10-15 18:42:13 -07:00

1 2 3 4 5 ...

4794 Commits