linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-03 03:46:51 +07:00

Author	SHA1	Message	Date
Russell King	75c912a367	ARM: add .gitignore entry for aesbs-core.S This avoids this file being incorrectly added to git. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-10-07 15:43:53 +01:00
Ard Biesheuvel	e4e7f10bfc	ARM: add support for bit sliced AES using NEON instructions Bit sliced AES gives around 45% speedup on Cortex-A15 for encryption and around 25% for decryption. This implementation of the AES algorithm does not rely on any lookup tables so it is believed to be invulnerable to cache timing attacks. This algorithm processes up to 8 blocks in parallel in constant time. This means that it is not usable by chaining modes that are strictly sequential in nature, such as CBC encryption. CBC decryption, however, can benefit from this implementation and runs about 25% faster. The other chaining modes implemented in this module, XTS and CTR, can execute fully in parallel in both directions. The core code has been adopted from the OpenSSL project (in collaboration with the original author, on cc). For ease of maintenance, this version is identical to the upstream OpenSSL code, i.e., all modifications that were required to make it suitable for inclusion into the kernel have been made upstream. The original can be found here: http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=6f6a6130 Note to integrators: While this implementation is significantly faster than the existing table based ones (generic or ARM asm), especially in CTR mode, the effects on power efficiency are unclear as of yet. This code does fundamentally more work, by calculating values that the table based code obtains by a simple lookup; only by doing all of that work in a SIMD fashion, it manages to perform better. Cc: Andy Polyakov <appro@openssl.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>	2013-10-04 20:48:38 +02:00
Ard Biesheuvel	5ce26f3b5a	ARM: move AES typedefs and function prototypes to separate header Put the struct definitions for AES keys and the asm function prototypes in a separate header and export the asm functions from the module. This allows other drivers to use them directly. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>	2013-10-04 09:26:54 +02:00
Ard Biesheuvel	40190c85f4	ARM: 7837/3: fix Thumb-2 bug in AES assembler code Patch `638591c` enabled building the AES assembler code in Thumb2 mode. However, this code used arithmetic involving PC rather than adr{l} instructions to generate PC-relative references to the lookup tables, and this needs to take into account the different PC offset when running in Thumb mode. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Nicolas Pitre <nico@linaro.org> Cc: stable@vger.kernel.org Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-09-22 11:43:38 +01:00
Ard Biesheuvel	934fc24df1	ARM: 7723/1: crypto: sha1-armv4-large.S: fix SP handling Make the SHA1 asm code ABI conformant by making sure all stack accesses occur above the stack pointer. Origin: http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=1a9d60d2 Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Nicolas Pitre <nico@linaro.org> Cc: stable@vger.kernel.org Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-05-22 22:01:35 +01:00
Dave Martin	638591cd7b	ARM: 7626/1: arm/crypto: Make asm SHA-1 and AES code Thumb-2 compatible This patch fixes aes-armv4.S and sha1-armv4-large.S to work natively in Thumb. This allows ARM/Thumb interworking workarounds to be removed. I also take the opportunity to convert some explicit assembler directives for exported functions to the standard ENTRY()/ENDPROC(). For the code itself: * In sha1_block_data_order, use of TEQ with sp is deprecated in ARMv7 and not supported in Thumb. For the branches back to .L_00_15 and .L_40_59, the TEQ is converted to a CMP, under the assumption that clobbering the C flag here will not cause incorrect behaviour. For the first branch back to .L_20_39_or_60_79 the C flag is important, so sp is moved temporarily into another register so that TEQ can be used for the comparison. * In the AES code, most forms of register-indexed addressing with shifts and rotates are not permitted for loads and stores in Thumb, so the address calculation is done using a separate instruction for the Thumb case. The resulting code is unlikely to be optimally scheduled, but it should not have a large impact given the overall size of the code. I haven't run any benchmarks. Signed-off-by: Dave Martin <dave.martin@linaro.org> Tested-by: David McCullough <ucdevel@gmail.com> (ARM only) Acked-by: David McCullough <ucdevel@gmail.com> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-01-13 12:41:22 +00:00
David McCullough	f0be44f4fb	arm/crypto: Add optimized AES and SHA1 routines Add assembler versions of AES and SHA1 for ARM platforms. This has provided up to a 50% improvement in IPsec/TCP throughout for tunnels using AES128/SHA1. Platform CPU SPeed Endian Before (bps) After (bps) Improvement IXP425 533 MHz big 11217042 15566294 ~38% KS8695 166 MHz little 3828549 5795373 ~51% Signed-off-by: David McCullough <ucdevel@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2012-09-07 04:17:02 +08:00

7 Commits