linux_dsm_epyc7002/arch/x86/crypto
Ard Biesheuvel bf93113d46 crypto: x86/aes-ni-xts - use direct calls to and 4-way stride
[ Upstream commit 86ad60a65f29dd862a11c22bb4b5be28d6c5cef1 ]

The XTS asm helper arrangement is a bit odd: the 8-way stride helper
consists of back-to-back calls to the 4-way core transforms, which
are called indirectly, based on a boolean that indicates whether we
are performing encryption or decryption.

Given how costly indirect calls are on x86, let's switch to direct
calls, and given how the 8-way stride doesn't really add anything
substantial, use a 4-way stride instead, and make the asm core
routine deal with any multiple of 4 blocks. Since 512 byte sectors
or 4 KB blocks are the typical quantities XTS operates on, increase
the stride exported to the glue helper to 512 bytes as well.

As a result, the number of indirect calls is reduced from 3 per 64 bytes
of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup
when operating on 1 KB blocks (measured on a Intel(R) Core(TM) i7-8650U CPU)

Fixes: 9697fa39ef ("x86/retpoline/crypto: Convert crypto assembler indirect jumps")
Tested-by: Eric Biggers <ebiggers@google.com> # x86_64
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-20 10:43:43 +01:00
..
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
aegis128-aesni-asm.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
aegis128-aesni-glue.c crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN 2020-01-09 11:30:53 +08:00
aes_ctrby8_avx-x86_64.S crypto: x86 - Remove include/asm/inst.h 2020-07-16 21:49:07 +10:00
aes_glue.c crypto: x86/aes - drop scalar assembler implementations 2019-07-26 14:56:02 +10:00
aesni-intel_asm.S crypto: x86/aes-ni-xts - use direct calls to and 4-way stride 2021-03-20 10:43:43 +01:00
aesni-intel_avx-x86_64.S crypto: aesni - Use TEST %reg,%reg instead of CMP $0,%reg 2021-03-20 10:43:43 +01:00
aesni-intel_glue.c crypto: x86/aes-ni-xts - use direct calls to and 4-way stride 2021-03-20 10:43:43 +01:00
blake2s-core.S x86: remove always-defined CONFIG_AS_SSSE3 2020-04-09 00:01:59 +09:00
blake2s-glue.c crypto: algapi - Remove skbuff.h inclusion 2020-08-20 14:04:28 +10:00
blowfish_glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
blowfish-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
camellia_aesni_avx2_glue.c crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN 2020-01-09 11:30:53 +08:00
camellia_aesni_avx_glue.c crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN 2020-01-09 11:30:53 +08:00
camellia_glue.c crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN 2020-01-09 11:30:53 +08:00
camellia-aesni-avx2-asm_64.S x86: Change {JMP,CALL}_NOSPEC argument 2020-04-30 20:14:34 +02:00
camellia-aesni-avx-asm_64.S x86: Change {JMP,CALL}_NOSPEC argument 2020-04-30 20:14:34 +02:00
camellia-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
cast5_avx_glue.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
cast5-avx-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
cast6_avx_glue.c crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN 2020-01-09 11:30:53 +08:00
cast6-avx-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
chacha_glue.c crypto: algapi - Remove skbuff.h inclusion 2020-08-20 14:04:28 +10:00
chacha-avx2-x86_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
chacha-avx512vl-x86_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
chacha-ssse3-x86_64.S crypto: x86/chacha-sse3 - use unaligned loads for state array 2020-07-16 21:49:04 +10:00
crc32-pclmul_asm.S crypto: x86 - Remove include/asm/inst.h 2020-07-16 21:49:07 +10:00
crc32-pclmul_glue.c crypto: Convert to new CPU match macros 2020-03-24 21:36:06 +01:00
crc32c-intel_glue.c crypto: x86/crc32c-intel - Use CRC32 mnemonic 2020-08-21 14:45:28 +10:00
crc32c-pcl-intel-asm_64.S crypto: x86/crc32c - fix building with clang ias 2020-07-23 17:34:16 +10:00
crct10dif-pcl-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
crct10dif-pclmul_glue.c crypto: Convert to new CPU match macros 2020-03-24 21:36:06 +01:00
curve25519-x86_64.c crypto: curve25519-x86_64 - Use XORL r32,32 2020-09-11 14:39:13 +10:00
des3_ede_glue.c crypto: x86/des - switch to library interface 2019-08-22 14:57:33 +10:00
des3_ede-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
ghash-clmulni-intel_asm.S crypto: x86 - Remove include/asm/inst.h 2020-07-16 21:49:07 +10:00
ghash-clmulni-intel_glue.c crypto: Convert to new CPU match macros 2020-03-24 21:36:06 +01:00
glue_helper-asm-avx2.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
glue_helper-asm-avx.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
glue_helper.c crypto: x86 - Regularize glue function prototypes 2019-12-11 16:36:54 +08:00
Makefile x86: update AS_* macros to binutils >=2.23, supporting ADX and AVX2 2020-04-09 00:12:48 +09:00
nh-avx2-x86_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
nh-sse2-x86_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
nhpoly1305-avx2-glue.c crypto: algapi - Remove skbuff.h inclusion 2020-08-20 14:04:28 +10:00
nhpoly1305-sse2-glue.c crypto: algapi - Remove skbuff.h inclusion 2020-08-20 14:04:28 +10:00
poly1305_glue.c crypto: x86/poly1305 - add back a needed assignment 2020-10-24 09:38:32 +11:00
poly1305-x86_64-cryptogams.pl crypto: poly1305-x86_64 - Use XORL r32,32 2020-09-11 14:39:13 +10:00
serpent_avx2_glue.c crypto: x86 - Regularize glue function prototypes 2019-12-11 16:36:54 +08:00
serpent_avx_glue.c crypto: x86 - Regularize glue function prototypes 2019-12-11 16:36:54 +08:00
serpent_sse2_glue.c crypto: x86 - Regularize glue function prototypes 2019-12-11 16:36:54 +08:00
serpent-avx2-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
serpent-avx-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
serpent-sse2-i586-asm_32.S x86/asm/32: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 12:03:43 +02:00
serpent-sse2-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
sha1_avx2_x86_64_asm.S crypto: x86/sha - Eliminate casts on asm implementations 2020-01-22 16:21:08 +08:00
sha1_ni_asm.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
sha1_ssse3_asm.S x86: remove always-defined CONFIG_AS_AVX 2020-04-09 00:01:59 +09:00
sha1_ssse3_glue.c crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h 2020-05-08 15:32:17 +10:00
sha256_ni_asm.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
sha256_ssse3_glue.c crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h 2020-05-08 15:32:17 +10:00
sha256-avx2-asm.S x86: update AS_* macros to binutils >=2.23, supporting ADX and AVX2 2020-04-09 00:12:48 +09:00
sha256-avx-asm.S x86: remove always-defined CONFIG_AS_AVX 2020-04-09 00:01:59 +09:00
sha256-ssse3-asm.S crypto: x86/sha - Eliminate casts on asm implementations 2020-01-22 16:21:08 +08:00
sha512_ssse3_glue.c crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h 2020-05-08 15:32:17 +10:00
sha512-avx2-asm.S x86: update AS_* macros to binutils >=2.23, supporting ADX and AVX2 2020-04-09 00:12:48 +09:00
sha512-avx-asm.S x86: remove always-defined CONFIG_AS_AVX 2020-04-09 00:01:59 +09:00
sha512-ssse3-asm.S crypto: x86/sha - Eliminate casts on asm implementations 2020-01-22 16:21:08 +08:00
twofish_avx_glue.c crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN 2020-01-09 11:30:53 +08:00
twofish_glue_3way.c crypto: x86 - Regularize glue function prototypes 2019-12-11 16:36:54 +08:00
twofish_glue.c crypto: prefix module autoloading with "crypto-" 2014-11-24 22:43:57 +08:00
twofish-avx-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
twofish-i586-asm_32.S x86/asm/32: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 12:03:43 +02:00
twofish-x86_64-asm_64-3way.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
twofish-x86_64-asm_64.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00