linux_dsm_epyc7002/arch
Ard Biesheuvel bf93113d46 crypto: x86/aes-ni-xts - use direct calls to and 4-way stride
[ Upstream commit 86ad60a65f29dd862a11c22bb4b5be28d6c5cef1 ]

The XTS asm helper arrangement is a bit odd: the 8-way stride helper
consists of back-to-back calls to the 4-way core transforms, which
are called indirectly, based on a boolean that indicates whether we
are performing encryption or decryption.

Given how costly indirect calls are on x86, let's switch to direct
calls, and given how the 8-way stride doesn't really add anything
substantial, use a 4-way stride instead, and make the asm core
routine deal with any multiple of 4 blocks. Since 512 byte sectors
or 4 KB blocks are the typical quantities XTS operates on, increase
the stride exported to the glue helper to 512 bytes as well.

As a result, the number of indirect calls is reduced from 3 per 64 bytes
of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup
when operating on 1 KB blocks (measured on a Intel(R) Core(TM) i7-8650U CPU)

Fixes: 9697fa39ef ("x86/retpoline/crypto: Convert crypto assembler indirect jumps")
Tested-by: Eric Biggers <ebiggers@google.com> # x86_64
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-20 10:43:43 +01:00
..
alpha local64.h: make <asm/local64.h> mandatory 2021-01-12 20:18:16 +01:00
arc arch/arc: add copy_user_page() to <asm/page.h> to fix build error on ARC 2021-01-19 18:27:26 +01:00
arm ARM: efistub: replace adrl pseudo-op with adr_l macro invocation 2021-03-17 17:06:26 +01:00
arm64 KVM: arm64: Fix nVHE hyp panic host context restore 2021-03-17 17:06:37 +01:00
c6x arch-cleanup-2020-10-22 2020-10-23 10:06:38 -07:00
csky csky: Fix a size determination in gpr_get() 2021-03-04 11:38:21 +01:00
h8300 h8300: fix PREEMPTION build, TI_PRE_COUNT undefined 2021-02-17 11:02:28 +01:00
hexagon local64.h: make <asm/local64.h> mandatory 2021-01-12 20:18:16 +01:00
ia64 local64.h: make <asm/local64.h> mandatory 2021-01-12 20:18:16 +01:00
m68k local64.h: make <asm/local64.h> mandatory 2021-01-12 20:18:16 +01:00
microblaze local64.h: make <asm/local64.h> mandatory 2021-01-12 20:18:16 +01:00
mips crypto: mips/poly1305 - enable for all MIPS processors 2021-03-17 17:06:10 +01:00
nds32 local64.h: make <asm/local64.h> mandatory 2021-01-12 20:18:16 +01:00
nios2 nios2: fixed broken sys_clone syscall 2021-03-04 11:38:16 +01:00
openrisc sched/idle: Fix arch_cpu_idle() vs tracing 2020-11-24 16:47:35 +01:00
parisc parisc: Enable -mlong-calls gcc option with CONFIG_COMPILE_TEST 2021-03-11 14:17:21 +01:00
powerpc powerpc: Fix missing declaration of [en/dis]able_kernel_vsx() 2021-03-17 17:06:35 +01:00
riscv riscv: Get rid of MAX_EARLY_MAPPING_SIZE 2021-03-07 12:34:05 +01:00
s390 s390/smp: __smp_rescan_cpus() - move cpumask away from stack 2021-03-17 17:06:25 +01:00
sh sh: Remove unused HAVE_COPY_THREAD_TLS macro 2021-01-27 11:55:20 +01:00
sparc sparc64: Use arch_validate_flags() to validate ADI flag 2021-03-17 17:06:24 +01:00
um um: defer killing userspace on page table update failures 2021-03-04 11:38:42 +01:00
x86 crypto: x86/aes-ni-xts - use direct calls to and 4-way stride 2021-03-20 10:43:43 +01:00
xtensa local64.h: make <asm/local64.h> mandatory 2021-01-12 20:18:16 +01:00
.gitignore
Kconfig fanotify: Fix sys_fanotify_mark() on native x86-32 2021-01-17 14:16:59 +01:00