linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-16 21:57:31 +07:00

Author	SHA1	Message	Date
Linus Torvalds	dafa5f6577	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Fix dcache flushing crash in skcipher. - Add hash finup self-tests. - Reschedule during speed tests. Algorithms: - Remove insecure vmac and replace it with vmac64. - Add public key verification for DH/ECDH. Drivers: - Decrease priority of sha-mb on x86. - Improve NEON latency/throughput on ARM64. - Add md5/sha384/sha512/des/3des to inside-secure. - Support eip197d in inside-secure. - Only register algorithms supported by the host in virtio. - Add cts and remove incompatible cts1 from ccree. - Add hisilicon SEC security accelerator driver. - Replace msm hwrng driver with qcom pseudo rng driver. Misc: - Centralize CRC polynomials" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (121 commits) crypto: arm64/ghash-ce - implement 4-way aggregation crypto: arm64/ghash-ce - replace NEON yield check with block limit crypto: hisilicon - sec_send_request() can be static lib/mpi: remove redundant variable esign crypto: arm64/aes-ce-gcm - don't reload key schedule if avoidable crypto: arm64/aes-ce-gcm - implement 2-way aggregation crypto: arm64/aes-ce-gcm - operate on two input blocks at a time crypto: dh - make crypto_dh_encode_key() make robust crypto: dh - fix calculating encoded key size crypto: ccp - Check for NULL PSP pointer at module unload crypto: arm/chacha20 - always use vrev for 16-bit rotates crypto: ccree - allow bigger than sector XTS op crypto: ccree - zero all of request ctx before use crypto: ccree - remove cipher ivgen left overs crypto: ccree - drop useless type flag during reg crypto: ablkcipher - fix crash flushing dcache in error path crypto: blkcipher - fix crash flushing dcache in error path crypto: skcipher - fix crash flushing dcache in error path crypto: skcipher - remove unnecessary setting of walk->nbytes crypto: scatterwalk - remove scatterwalk_samebuf() ...	2018-08-15 16:01:47 -07:00
Linus Torvalds	f24d6f2654	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Thomas Gleixner: "The lowlevel and ASM code updates for x86: - Make stack trace unwinding more reliable - ASM instruction updates for better code generation - Various cleanups" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry/64: Add two more instruction suffixes x86/asm/64: Use 32-bit XOR to zero registers x86/build/vdso: Simplify 'cmd_vdso2c' x86/build/vdso: Remove unused vdso-syms.lds x86/stacktrace: Enable HAVE_RELIABLE_STACKTRACE for the ORC unwinder x86/unwind/orc: Detect the end of the stack x86/stacktrace: Do not fail for ORC with regs on stack x86/stacktrace: Clarify the reliable success paths x86/stacktrace: Remove STACKTRACE_DUMP_ONCE x86/stacktrace: Do not unwind after user regs x86/asm: Use CC_SET/CC_OUT in percpu_cmpxchg8b_double() to micro-optimize code generation	2018-08-13 13:35:26 -07:00
Ondrej Mosnacek	877ccce7cb	crypto: x86/aegis,morus - Fix and simplify CPUID checks It turns out I had misunderstood how the x86_match_cpu() function works. It evaluates a logical OR of the matching conditions, not logical AND. This caused the CPU feature checks for AEGIS to pass even if only SSE2 (but not AES-NI) was supported (or vice versa), leading to potential crashes if something tried to use the registered algs. This patch switches the checks to a simpler method that is used e.g. in the Camellia x86 code. The patch also removes the MODULE_DEVICE_TABLE declarations which actually seem to cause the modules to be auto-loaded at boot, which is not desired. The crypto API on-demand module loading is sufficient. Fixes: `1d373d4e8e` ("crypto: x86 - Add optimized AEGIS implementations") Fixes: `6ecc9d9ff9` ("crypto: x86 - Add optimized MORUS implementations") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Tested-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-08-07 17:51:15 +08:00
Herbert Xu	c5f5aeef9b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux Merge mainline to pick up `c7513c2a27` ("crypto/arm64: aes-ce-gcm - add missing kernel_neon_begin/end pair").	2018-08-03 17:55:12 +08:00
Eric Biggers	c87a405e3b	crypto: ahash - remove useless setting of cra_type Some ahash algorithms set .cra_type = &crypto_ahash_type. But this is redundant with the C structure type ('struct ahash_alg'), and crypto_register_ahash() already sets the .cra_type automatically. Apparently the useless assignment has just been copy+pasted around. So, remove the useless assignment from all the ahash algorithms. This patch shouldn't change any actual behavior. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-07-09 00:30:26 +08:00
Eric Biggers	6a38f62245	crypto: ahash - remove useless setting of type flags Many ahash algorithms set .cra_flags = CRYPTO_ALG_TYPE_AHASH. But this is redundant with the C structure type ('struct ahash_alg'), and crypto_register_ahash() already sets the type flag automatically, clearing any type flag that was already there. Apparently the useless assignment has just been copy+pasted around. So, remove the useless assignment from all the ahash algorithms. This patch shouldn't change any actual behavior. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-07-09 00:30:25 +08:00
Eric Biggers	e50944e219	crypto: shash - remove useless setting of type flags Many shash algorithms set .cra_flags = CRYPTO_ALG_TYPE_SHASH. But this is redundant with the C structure type ('struct shash_alg'), and crypto_register_shash() already sets the type flag automatically, clearing any type flag that was already there. Apparently the useless assignment has just been copy+pasted around. So, remove the useless assignment from all the shash algorithms. This patch shouldn't change any actual behavior. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-07-09 00:30:24 +08:00
Eric Biggers	8aeef492fe	crypto: x86/sha-mb - decrease priority of multibuffer algorithms With all the crypto modules enabled on x86, and with a CPU that supports AVX-2 but not SHA-NI instructions (e.g. Haswell, Broadwell, Skylake), the "multibuffer" implementations of SHA-1, SHA-256, and SHA-512 are the highest priority. However, these implementations only perform well when many hash requests are being submitted concurrently, filling all 8 AVX-2 lanes. Otherwise, they are incredibly slow, as they waste time waiting for more requests to arrive before proceeding to execute each request. For example, here are the speeds I see hashing 4096-byte buffers with a single thread on a Haswell-based processor: generic avx2 mb (multibuffer) ------- -------- ---------------- sha1 602 MB/s 997 MB/s 0.61 MB/s sha256 228 MB/s 412 MB/s 0.61 MB/s sha512 312 MB/s 559 MB/s 0.61 MB/s So, the multibuffer implementation is 500 to 1000 times slower than the other implementations. Note that with smaller buffers or more update()s per digest, the difference would be even greater. I believe the vast majority of people are in the boat where the multibuffer code is much slower, and only a small minority are doing the highly parallel, hashing-intensive, latency-flexible workloads (maybe IPsec on servers?) where the multibuffer code may be beneficial. Yet, people often aren't familiar with all the crypto config options and so the multibuffer code may inadvertently be built into the kernel. Also the multibuffer code apparently hasn't been very well tested, seeing as it was sometimes computing the wrong SHA-256 digest. So, let's make the multibuffer algorithms low priority. Users who want to use them can either request them explicitly by driver name, or use NETLINK_CRYPTO (crypto_user) to increase their priority at runtime. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-07-09 00:30:21 +08:00
Eric Biggers	af839b4e54	crypto: x86/sha256-mb - fix digest copy in sha256_mb_mgr_get_comp_job_avx2() There is a copy-paste error where sha256_mb_mgr_get_comp_job_avx2() copies the SHA-256 digest state from sha256_mb_mgr::args::digest to job_sha256::result_digest. Consequently, the sha256_mb algorithm sometimes calculates the wrong digest. Fix it. Reproducer using AF_ALG: #include <assert.h> #include <linux/if_alg.h> #include <stdio.h> #include <string.h> #include <sys/socket.h> #include <unistd.h> static const __u8 expected[32] = "\xad\x7f\xac\xb2\x58\x6f\xc6\xe9\x66\xc0\x04\xd7\xd1\xd1\x6b\x02" "\x4f\x58\x05\xff\x7c\xb4\x7c\x7a\x85\xda\xbd\x8b\x48\x89\x2c\xa7"; int main() { int fd; struct sockaddr_alg addr = { .salg_type = "hash", .salg_name = "sha256_mb", }; __u8 data[4096] = { 0 }; __u8 digest[32]; int ret; int i; fd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(fd, (void *)&addr, sizeof(addr)); fork(); fd = accept(fd, 0, 0); do { ret = write(fd, data, 4096); assert(ret == 4096); ret = read(fd, digest, 32); assert(ret == 32); } while (memcmp(digest, expected, 32) == 0); printf("wrong digest: "); for (i = 0; i < 32; i++) printf("%02x", digest[i]); printf("\n"); } Output was: wrong digest: ad7facb2000000000000000000000000ffffffef7cb47c7a85dabd8b48892ca7 Fixes: `172b1d6b5a` ("crypto: sha256-mb - fix ctx pointer and digest copy") Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-07-09 00:30:19 +08:00
Jan Beulich	a7bea83089	x86/asm/64: Use 32-bit XOR to zero registers Some Intel CPUs don't recognize 64-bit XORs as zeroing idioms. Zeroing idioms don't require execution bandwidth, as they're being taken care of in the frontend (through register renaming). Use 32-bit XORs instead. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: herbert@gondor.apana.org.au Cc: pavel@ucw.cz Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/5B39FF1A02000078001CFB54@prv1-mh.provo.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-07-03 09:59:29 +02:00
Borislav Petkov	221e00d1fc	crypto: x86 - Add missing RETs Add explicit RETs to the tail calls of AEGIS and MORUS crypto algorithms otherwise they run into INT3 padding due to ("x86/asm: Pad assembly functions with INT3 instructions") leading to spurious debug exceptions. Mike Galbraith <efault@gmx.de> took care of all the remaining callsites. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Ondrej Mosnacek <omosnacek@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-07-01 23:33:20 +08:00
Eric Biggers	b7b73cd5d7	crypto: x86/salsa20 - remove x86 salsa20 implementations The x86 assembly implementations of Salsa20 use the frame base pointer register (%ebp or %rbp), which breaks frame pointer convention and breaks stack traces when unwinding from an interrupt in the crypto code. Recent (v4.10+) kernels will warn about this, e.g. WARNING: kernel stack regs at 00000000a8291e69 in syzkaller047086:4677 has bad 'bp' value 000000001077994c [...] But after looking into it, I believe there's very little reason to still retain the x86 Salsa20 code. First, these are not vectorized (SSE2/SSSE3/AVX2) implementations, which would be needed to get anywhere close to the best Salsa20 performance on any remotely modern x86 processor; they're just regular x86 assembly. Second, it's still unclear that anyone is actually using the kernel's Salsa20 at all, especially given that now ChaCha20 is supported too, and with much more efficient SSSE3 and AVX2 implementations. Finally, in benchmarks I did on both Intel and AMD processors with both gcc 8.1.0 and gcc 4.9.4, the x86_64 salsa20-asm is actually slightly slower than salsa20-generic (~3% slower on Skylake, ~10% slower on Zen), while the i686 salsa20-asm is only slightly faster than salsa20-generic (~15% faster on Skylake, ~20% faster on Zen). The gcc version made little difference. So, the x86_64 salsa20-asm is pretty clearly useless. That leaves just the i686 salsa20-asm, which based on my tests provides a 15-20% speed boost. But that's without updating the code to not use %ebp. And given the maintenance cost, the small speed difference vs. salsa20-generic, the fact that few people still use i686 kernels, the doubt that anyone is even using the kernel's Salsa20 at all, and the fact that a SSE2 implementation would almost certainly be much faster on any remotely modern x86 processor yet no one has cared enough to add one yet, I don't think it's worthwhile to keep. Thus, just remove both the x86_64 and i686 salsa20-asm implementations. Reported-by: syzbot+ffa3a158337bbc01ff09@syzkaller.appspotmail.com Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-05-31 00:13:57 +08:00
Ondrej Mosnacek	2808f17319	crypto: morus - Mark MORUS SIMD glue as x86-specific Commit `56e8e57fc3` ("crypto: morus - Add common SIMD glue code for MORUS") accidetally consiedered the glue code to be usable by different architectures, but it seems to be only usable on x86. This patch moves it under arch/x86/crypto and adds 'depends on X86' to the Kconfig options and also removes the prompt to hide these internal options from the user. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-05-31 00:13:41 +08:00
Ondrej Mosnacek	dd09f58ce0	crypto: x86/aegis256 - Fix wrong key buffer size AEGIS-256 key is two blocks, not one. Fixes: `1d373d4e8e` ("crypto: x86 - Add optimized AEGIS implementations") Reported-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-05-27 00:12:12 +08:00
Ondrej Mosnacek	6ecc9d9ff9	crypto: x86 - Add optimized MORUS implementations This patch adds optimized implementations of MORUS-640 and MORUS-1280, utilizing the SSE2 and AVX2 x86 extensions. For MORUS-1280 (which operates on 256-bit blocks) we provide both AVX2 and SSE2 implementation. Although SSE2 MORUS-1280 is slower than AVX2 MORUS-1280, it is comparable in speed to the SSE2 MORUS-640. Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-05-19 00:15:35 +08:00
Ondrej Mosnacek	1d373d4e8e	crypto: x86 - Add optimized AEGIS implementations This patch adds optimized implementations of AEGIS-128, AEGIS-128L, and AEGIS-256, utilizing the AES-NI and SSE2 x86 extensions. Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-05-19 00:14:00 +08:00
Colin Ian King	158b52ff14	crypto: ghash-clmulni - fix spelling mistake: "acclerated" -> "accelerated" Trivial fix to spelling mistake in module description text Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-05-05 14:52:58 +08:00
Wu Fengguang	9cc16b4d32	crypto: x86/des3_ede - des3_ede_skciphers[] can be static Fixes: `09c0f03bf8` ("crypto: x86/des3_ede - convert to skcipher interface") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-09 22:45:53 +08:00
Eric Biggers	75d8a5532f	crypto: x86/glue_helper - rename glue_skwalk_fpu_begin() There are no users of the original glue_fpu_begin() anymore, so rename glue_skwalk_fpu_begin() to glue_fpu_begin() so that it matches glue_fpu_end() again. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:35 +08:00
Eric Biggers	0d87d0f425	crypto: x86/glue_helper - remove blkcipher_walk functions Now that all glue_helper users have been switched from the blkcipher interface over to the skcipher interface, remove the versions of the glue_helper functions that handled the blkcipher interface. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:34 +08:00
Eric Biggers	217afccf65	crypto: lrw - remove lrw_crypt() Now that all users of lrw_crypt() have been removed in favor of the LRW template wrapping an ECB mode algorithm, remove lrw_crypt(). Also remove crypto/lrw.h as that is no longer needed either; and fold 'struct lrw_table_ctx' into 'struct priv', lrw_init_table() into setkey(), and lrw_free_table() into exit_tfm(). Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:34 +08:00
Eric Biggers	44893bc296	crypto: x86/camellia-aesni-avx, avx2 - convert to skcipher interface Convert the AESNI AVX and AESNI AVX2 implementations of Camellia from the (deprecated) ablkcipher and blkcipher interfaces over to the skcipher interface. Note that this includes replacing the use of ablk_helper with crypto_simd. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:32 +08:00
Eric Biggers	1af6d03710	crypto: x86/camellia - convert to skcipher interface Convert the x86 asm implementation of Camellia from the (deprecated) blkcipher interface over to the skcipher interface. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:32 +08:00
Eric Biggers	451cc49324	crypto: x86/camellia - remove XTS algorithm The XTS template now wraps an ECB mode algorithm rather than the block cipher directly. Therefore it is now redundant for crypto modules to wrap their ECB code with generic XTS code themselves via xts_crypt(). Remove the xts-camellia-asm algorithm which did this. Users who request xts(camellia) and previously would have gotten xts-camellia-asm will now get xts(ecb-camellia-asm) instead, which is just as fast. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:32 +08:00
Eric Biggers	6043d341f0	crypto: x86/camellia - remove LRW algorithm The LRW template now wraps an ECB mode algorithm rather than the block cipher directly. Therefore it is now redundant for crypto modules to wrap their ECB code with generic LRW code themselves via lrw_crypt(). Remove the lrw-camellia-asm algorithm which did this. Users who request lrw(camellia) and previously would have gotten lrw-camellia-asm will now get lrw(ecb-camellia-asm) instead, which is just as fast. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:31 +08:00
Eric Biggers	44c9b75409	crypto: x86/camellia-aesni-avx2 - remove LRW algorithm The LRW template now wraps an ECB mode algorithm rather than the block cipher directly. Therefore it is now redundant for crypto modules to wrap their ECB code with generic LRW code themselves via lrw_crypt(). Remove the lrw-camellia-aesni-avx2 algorithm which did this. Users who request lrw(camellia) and previously would have gotten lrw-camellia-aesni-avx2 will now get lrw(ecb-camellia-aesni-avx2) instead, which is just as fast. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:31 +08:00
Eric Biggers	6fcb81b562	crypto: x86/camellia-aesni-avx - remove LRW algorithm The LRW template now wraps an ECB mode algorithm rather than the block cipher directly. Therefore it is now redundant for crypto modules to wrap their ECB code with generic LRW code themselves via lrw_crypt(). Remove the lrw-camellia-aesni algorithm which did this. Users who request lrw(camellia) and previously would have gotten lrw-camellia-aesni will now get lrw(ecb-camellia-aesni) instead, which is just as fast. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:30 +08:00
Eric Biggers	09c0f03bf8	crypto: x86/des3_ede - convert to skcipher interface Convert the x86 asm implementation of Triple DES from the (deprecated) blkcipher interface over to the skcipher interface. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:30 +08:00
Eric Biggers	c1679171c6	crypto: x86/blowfish: convert to skcipher interface Convert the x86 asm implementation of Blowfish from the (deprecated) blkcipher interface over to the skcipher interface. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:29 +08:00
Eric Biggers	4bd9692431	crypto: x86/cast6-avx - convert to skcipher interface Convert the AVX implementation of CAST6 from the (deprecated) ablkcipher and blkcipher interfaces over to the skcipher interface. Note that this includes replacing the use of ablk_helper with crypto_simd. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:28 +08:00
Eric Biggers	f51a1fa439	crypto: x86/cast6-avx - remove LRW algorithm The LRW template now wraps an ECB mode algorithm rather than the block cipher directly. Therefore it is now redundant for crypto modules to wrap their ECB code with generic LRW code themselves via lrw_crypt(). Remove the lrw-cast6-avx algorithm which did this. Users who request lrw(cast6) and previously would have gotten lrw-cast6-avx will now get lrw(ecb-cast6-avx) instead, which is just as fast. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:27 +08:00
Eric Biggers	1e63183a20	crypto: x86/cast5-avx - convert to skcipher interface Convert the AVX implementation of CAST5 from the (deprecated) ablkcipher and blkcipher interfaces over to the skcipher interface. Note that this includes replacing the use of ablk_helper with crypto_simd. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:26 +08:00
Eric Biggers	8f461b1e02	crypto: x86/cast5-avx - fix ECB encryption when long sg follows short one With ecb-cast5-avx, if a 128+ byte scatterlist element followed a shorter one, then the algorithm accidentally encrypted/decrypted only 8 bytes instead of the expected 128 bytes. Fix it by setting the encryption/decryption 'fn' correctly. Fixes: `c12ab20b16` ("crypto: cast5/avx - avoid using temporary stack buffers") Cc: <stable@vger.kernel.org> # v3.8+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:25 +08:00
Eric Biggers	0e6ab46dad	crypto: x86/twofish-avx - convert to skcipher interface Convert the AVX implementation of Twofish from the (deprecated) ablkcipher and blkcipher interfaces over to the skcipher interface. Note that this includes replacing the use of ablk_helper with crypto_simd. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:25 +08:00
Eric Biggers	876e9f0c12	crypto: x86/twofish-avx - remove LRW algorithm The LRW template now wraps an ECB mode algorithm rather than the block cipher directly. Therefore it is now redundant for crypto modules to wrap their ECB code with generic LRW code themselves via lrw_crypt(). Remove the lrw-twofish-avx algorithm which did this. Users who request lrw(twofish) and previously would have gotten lrw-twofish-avx will now get lrw(ecb-twofish-avx) instead, which is just as fast. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:25 +08:00
Eric Biggers	37992fa47f	crypto: x86/twofish-3way - convert to skcipher interface Convert the 3-way implementation of Twofish from the (deprecated) blkcipher interface over to the skcipher interface. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:24 +08:00
Eric Biggers	ebeea983dd	crypto: x86/twofish-3way - remove XTS algorithm The XTS template now wraps an ECB mode algorithm rather than the block cipher directly. Therefore it is now redundant for crypto modules to wrap their ECB code with generic XTS code themselves via xts_crypt(). Remove the xts-twofish-3way algorithm which did this. Users who request xts(twofish) and previously would have gotten xts-twofish-3way will now get xts(ecb-twofish-3way) instead, which is just as fast. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:24 +08:00
Eric Biggers	68bfc4924b	crypto: x86/twofish-3way - remove LRW algorithm The LRW template now wraps an ECB mode algorithm rather than the block cipher directly. Therefore it is now redundant for crypto modules to wrap their ECB code with generic LRW code themselves via lrw_crypt(). Remove the lrw-twofish-3way algorithm which did this. Users who request lrw(twofish) and previously would have gotten lrw-twofish-3way will now get lrw(ecb-twofish-3way) instead, which is just as fast. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:23 +08:00
Eric Biggers	e16bf974b3	crypto: x86/serpent-avx,avx2 - convert to skcipher interface Convert the AVX and AVX2 implementations of Serpent from the (deprecated) ablkcipher and blkcipher interfaces over to the skcipher interface. Note that this includes replacing the use of ablk_helper with crypto_simd. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:22 +08:00
Eric Biggers	340b830326	crypto: x86/serpent-avx - remove LRW algorithm The LRW template now wraps an ECB mode algorithm rather than the block cipher directly. Therefore it is now redundant for crypto modules to wrap their ECB code with generic LRW code themselves via lrw_crypt(). Remove the lrw-serpent-avx algorithm which did this. Users who request lrw(serpent) and previously would have gotten lrw-serpent-avx will now get lrw(ecb-serpent-avx) instead, which is just as fast. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:21 +08:00
Eric Biggers	e5f382e643	crypto: x86/serpent-avx2 - remove LRW algorithm The LRW template now wraps an ECB mode algorithm rather than the block cipher directly. Therefore it is now redundant for crypto modules to wrap their ECB code with generic LRW code themselves via lrw_crypt(). Remove the lrw-serpent-avx2 algorithm which did this. Users who request lrw(serpent) and previously would have gotten lrw-serpent-avx2 will now get lrw(ecb-serpent-avx2) instead, which is just as fast. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:20 +08:00
Eric Biggers	e0f409dcb8	crypto: x86/serpent-sse2 - convert to skcipher interface Convert the SSE2 implementation of Serpent from the (deprecated) ablkcipher and blkcipher interfaces over to the skcipher interface. Note that this includes replacing the use of ablk_helper with crypto_simd. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:20 +08:00
Eric Biggers	8bab4e3cd5	crypto: x86/serpent-sse2 - remove XTS algorithm The XTS template now wraps an ECB mode algorithm rather than the block cipher directly. Therefore it is now redundant for crypto modules to wrap their ECB code with generic XTS code themselves via xts_crypt(). Remove the xts-serpent-sse2 algorithm which did this. Users who request xts(serpent) and previously would have gotten xts-serpent-sse2 will now get xts(ecb-serpent-sse2) instead, which is just as fast. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:19 +08:00
Eric Biggers	2a05cfc35f	crypto: x86/serpent-sse2 - remove LRW algorithm The LRW template now wraps an ECB mode algorithm rather than the block cipher directly. Therefore it is now redundant for crypto modules to wrap their ECB code with generic LRW code themselves via lrw_crypt(). Remove the lrw-serpent-sse2 algorithm which did this. Users who request lrw(serpent) and previously would have gotten lrw-serpent-sse2 will now get lrw(ecb-serpent-sse2) instead, which is just as fast. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:19 +08:00
Eric Biggers	f15f2a2542	crypto: x86/glue_helper - add skcipher_walk functions Add ECB, CBC, and CTR functions to glue_helper which use skcipher_walk rather than blkcipher_walk. This will allow converting the remaining x86 algorithms from the blkcipher interface over to the skcipher interface, after which we'll be able to remove the blkcipher_walk versions of these functions. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-03-03 00:03:18 +08:00
Dave Watson	e845520707	crypto: aesni - Update aesni-intel_glue to use scatter/gather Add gcmaes_crypt_by_sg routine, that will do scatter/gather by sg. Either src or dst may contain multiple buffers, so iterate over both at the same time if they are different. If the input is the same as the output, iterate only over one. Currently both the AAD and TAG must be linear, so copy them out with scatterlist_map_and_copy. If first buffer contains the entire AAD, we can optimize and not copy. Since the AAD can be any size, if copied it must be on the heap. TAG can be on the stack since it is always < 16 bytes. Only the SSE routines are updated so far, so leave the previous gcmaes_en/decrypt routines, and branch to the sg ones if the keysize is inappropriate for avx, or we are SSE only. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-02-22 22:16:52 +08:00
Dave Watson	fb8986e643	crypto: aesni - Introduce scatter/gather asm function stubs The asm macros are all set up now, introduce entry points. GCM_INIT and GCM_COMPLETE have arguments supplied, so that the new scatter/gather entry points don't have to take all the arguments, and only the ones they need. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-02-22 22:16:51 +08:00
Dave Watson	933d6aefd5	crypto: aesni - Add fast path for > 16 byte update We can fast-path any < 16 byte read if the full message is > 16 bytes, and shift over by the appropriate amount. Usually we are reading > 16 bytes, so this should be faster than the READ_PARTIAL macro introduced in `b20209c91e` for the average case. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-02-22 22:16:50 +08:00
Dave Watson	ae952c5ec6	crypto: aesni - Introduce partial block macro Before this diff, multiple calls to GCM_ENC_DEC will succeed, but only if all calls are a multiple of 16 bytes. Handle partial blocks at the start of GCM_ENC_DEC, and update aadhash as appropriate. The data offset %r11 is also updated after the partial block. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-02-22 22:16:49 +08:00
Dave Watson	1476db2d12	crypto: aesni - Move HashKey computation from stack to gcm_context HashKey computation only needs to happen once per scatter/gather operation, save it between calls in gcm_context struct instead of on the stack. Since the asm no longer stores anything on the stack, we can use %rsp directly, and clean up the frame save/restore macros a bit. Hashkeys actually only need to be calculated once per key and could be moved to when set_key is called, however, the current glue code falls back to generic aes code if fpu is disabled. Signed-off-by: Dave Watson <davejwatson@fb.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-02-22 22:16:49 +08:00

1 2 3 4 5 ...

487 Commits