Mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git (synced 2024-12-28 11:18:45 +07:00), bf93113d46
[ Upstream commit 86ad60a65f29dd862a11c22bb4b5be28d6c5cef1 ]
The XTS asm helper arrangement is a bit odd: the 8-way stride helper
consists of back-to-back calls to the 4-way core transforms, which
are called indirectly, based on a boolean that indicates whether we
are performing encryption or decryption.
Given how costly indirect calls are on x86, let's switch to direct
calls, and given how the 8-way stride doesn't really add anything
substantial, use a 4-way stride instead, and make the asm core
routine deal with any multiple of 4 blocks. Since 512 byte sectors
or 4 KB blocks are the typical quantities XTS operates on, increase
the stride exported to the glue helper to 512 bytes as well.
As a result, the number of indirect calls is reduced from 3 per 64 bytes
of input/output to 1 per 512 bytes of input/output, which produces a 65%
speedup when operating on 1 KB blocks (measured on an Intel(R) Core(TM)
i7-8650U CPU).
Fixes:
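The call-count argument in the commit message can be illustrated with a small, stand-alone C sketch. It is not the kernel code: every `toy_*` name is hypothetical, the "cipher" is a placeholder XOR, and the old arrangement is modelled only by the figure quoted above (3 indirect calls per 64 bytes). The new arrangement is modelled as a glue-style loop that makes one call through a function pointer per 512-byte stride, while the stride routine calls a 4-block core directly for any multiple of 4 blocks.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BLOCK 16u     /* cipher block size in bytes */
#define TOY_MAX   4096u   /* largest buffer this sketch handles */

/* Hypothetical signature for a stride routine reached through a pointer. */
typedef void (*toy_stride_fn)(uint8_t *dst, const uint8_t *src, size_t nblocks);

/* Placeholder 4-block core: XOR with a constant, NOT a real cipher. */
static void toy_core_4way(uint8_t *dst, const uint8_t *src)
{
    for (size_t i = 0; i < 4 * TOY_BLOCK; i++)
        dst[i] = (uint8_t)(src[i] ^ 0x5a);
}

/* Stride routine: accepts any multiple of 4 blocks, calls the core directly. */
static void toy_xts_stride(uint8_t *dst, const uint8_t *src, size_t nblocks)
{
    for (size_t done = 0; done + 4 <= nblocks; done += 4)
        toy_core_4way(dst + done * TOY_BLOCK, src + done * TOY_BLOCK);
}

/* Glue-style loop: one call through a function pointer per full stride.
 * Returns the number of indirect calls it made. */
size_t toy_glue(toy_stride_fn fn, uint8_t *dst, const uint8_t *src,
                size_t nbytes, size_t stride_bytes)
{
    size_t calls = 0;

    for (size_t off = 0; off + stride_bytes <= nbytes; off += stride_bytes) {
        fn(dst + off, src + off, stride_bytes / TOY_BLOCK);
        calls++;
    }
    return calls;
}

/* New arrangement: 1 indirect call per 512-byte stride. */
size_t toy_calls_new(size_t nbytes)
{
    static uint8_t in[TOY_MAX], out[TOY_MAX];

    if (nbytes > TOY_MAX)
        return 0;
    return toy_glue(toy_xts_stride, out, in, nbytes, 512);
}

/* Old arrangement, using only the commit's figure of 3 calls per 64 bytes. */
size_t toy_calls_old(size_t nbytes)
{
    return nbytes / 64 * 3;
}
```

For a 1 KB request, `toy_calls_old(1024)` gives 48 and `toy_calls_new(1024)` gives 2. The sketch only counts calls; it says nothing about how much time each call costs on a given CPU.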
File | ||
---|---|---|
.gitignore | ||
aegis128-aesni-asm.S | ||
aegis128-aesni-glue.c | ||
aes_ctrby8_avx-x86_64.S | ||
aes_glue.c | ||
aesni-intel_asm.S | ||
aesni-intel_avx-x86_64.S | ||
aesni-intel_glue.c | ||
blake2s-core.S | ||
blake2s-glue.c | ||
blowfish_glue.c | ||
blowfish-x86_64-asm_64.S | ||
camellia_aesni_avx2_glue.c | ||
camellia_aesni_avx_glue.c | ||
camellia_glue.c | ||
camellia-aesni-avx2-asm_64.S | ||
camellia-aesni-avx-asm_64.S | ||
camellia-x86_64-asm_64.S | ||
cast5_avx_glue.c | ||
cast5-avx-x86_64-asm_64.S | ||
cast6_avx_glue.c | ||
cast6-avx-x86_64-asm_64.S | ||
chacha_glue.c | ||
chacha-avx2-x86_64.S | ||
chacha-avx512vl-x86_64.S | ||
chacha-ssse3-x86_64.S | ||
crc32-pclmul_asm.S | ||
crc32-pclmul_glue.c | ||
crc32c-intel_glue.c | ||
crc32c-pcl-intel-asm_64.S | ||
crct10dif-pcl-asm_64.S | ||
crct10dif-pclmul_glue.c | ||
curve25519-x86_64.c | ||
des3_ede_glue.c | ||
des3_ede-asm_64.S | ||
ghash-clmulni-intel_asm.S | ||
ghash-clmulni-intel_glue.c | ||
glue_helper-asm-avx2.S | ||
glue_helper-asm-avx.S | ||
glue_helper.c | ||
Makefile | ||
nh-avx2-x86_64.S | ||
nh-sse2-x86_64.S | ||
nhpoly1305-avx2-glue.c | ||
nhpoly1305-sse2-glue.c | ||
poly1305_glue.c | ||
poly1305-x86_64-cryptogams.pl | ||
serpent_avx2_glue.c | ||
serpent_avx_glue.c | ||
serpent_sse2_glue.c | ||
serpent-avx2-asm_64.S | ||
serpent-avx-x86_64-asm_64.S | ||
serpent-sse2-i586-asm_32.S | ||
serpent-sse2-x86_64-asm_64.S | ||
sha1_avx2_x86_64_asm.S | ||
sha1_ni_asm.S | ||
sha1_ssse3_asm.S | ||
sha1_ssse3_glue.c | ||
sha256_ni_asm.S | ||
sha256_ssse3_glue.c | ||
sha256-avx2-asm.S | ||
sha256-avx-asm.S | ||
sha256-ssse3-asm.S | ||
sha512_ssse3_glue.c | ||
sha512-avx2-asm.S | ||
sha512-avx-asm.S | ||
sha512-ssse3-asm.S | ||
twofish_avx_glue.c | ||
twofish_glue_3way.c | ||
twofish_glue.c | ||
twofish-avx-x86_64-asm_64.S | ||
twofish-i586-asm_32.S | ||
twofish-x86_64-asm_64-3way.S | ||
twofish-x86_64-asm_64.S | ||