linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Eric Biggers	a1b22a5f45	crypto: arm/chacha20 - faster 8-bit rotations and other optimizations Optimize ChaCha20 NEON performance by: - Implementing the 8-bit rotations using the 'vtbl.8' instruction. - Streamlining the part that adds the original state and XORs the data. - Making some other small tweaks. On ARM Cortex-A7, these optimizations improve ChaCha20 performance from about 12.08 cycles per byte to about 11.37 -- a 5.9% improvement. There is a tradeoff involved with the 'vtbl.8' rotation method since there is at least one CPU (Cortex-A53) where it's not fastest. But it seems to be a better default; see the added comment. Overall, this patch reduces Cortex-A53 performance by less than 0.5%. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-09-04 11:37:05 +08:00
Eric Biggers	4e34e51f48	crypto: arm/chacha20 - always use vrev for 16-bit rotates The 4-way ChaCha20 NEON code implements 16-bit rotates with vrev32.16, but the one-way code (used on remainder blocks) implements it with vshl + vsri, which is slower. Switch the one-way code to vrev32.16 too. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-08-03 18:06:05 +08:00
Ard Biesheuvel	afaf712e99	crypto: arm/chacha20 - implement NEON version based on SSE3 code This is a straight port to ARM/NEON of the x86 SSE3 implementation of the ChaCha20 stream cipher. It uses the new skcipher walksize attribute to process the input in strides of 4x the block size. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-01-13 00:26:48 +08:00
Herbert Xu	5386e5d1f8	Revert "crypto: arm64/ARM: NEON accelerated ChaCha20" This patch reverts the following commits: `8621caa0d4` `8096667273` I should not have applied them because they had already been obsoleted by a subsequent patch series. They also cause a build failure because of the subsequent commit `9ae433bc79`. Fixes: `9ae433bc79` ("crypto: chacha20 - convert generic and...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-12-28 17:39:26 +08:00
Ard Biesheuvel	8096667273	crypto: arm/chacha20 - implement NEON version based on SSE3 code This is a straight port to ARM/NEON of the x86 SSE3 implementation of the ChaCha20 stream cipher. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-12-27 17:47:29 +08:00

5 Commits