linux_dsm_epyc7002/arch/x86
Jussi Kivilinna 0b95ec56ae crypto: camellia - add assembler implementation for x86_64
Patch adds x86_64 assembler implementation of Camellia block cipher. Two set of
functions are provided. First set is regular 'one-block at time' encrypt/decrypt
functions. Second is 'two-block at time' functions that gain performance increase
on out-of-order CPUs. Performance of 2-way functions should be equal to 1-way
functions with in-order CPUs.

Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results:

AMD Phenom II 1055T (fam:16, model:10):

camellia-asm vs camellia_generic:
128bit key:                                             (lrw:256bit)    (xts:256bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     1.27x   1.22x   1.30x   1.42x   1.30x   1.34x   1.19x   1.05x   1.23x   1.24x
64B     1.74x   1.79x   1.43x   1.87x   1.81x   1.87x   1.48x   1.38x   1.55x   1.62x
256B    1.90x   1.87x   1.43x   1.94x   1.94x   1.95x   1.63x   1.62x   1.67x   1.70x
1024B   1.96x   1.93x   1.43x   1.95x   1.98x   2.01x   1.67x   1.69x   1.74x   1.80x
8192B   1.96x   1.96x   1.39x   1.93x   2.01x   2.03x   1.72x   1.64x   1.71x   1.76x

256bit key:                                             (lrw:384bit)    (xts:512bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     1.23x   1.23x   1.33x   1.39x   1.34x   1.38x   1.04x   1.18x   1.21x   1.29x
64B     1.72x   1.69x   1.42x   1.78x   1.81x   1.89x   1.57x   1.52x   1.56x   1.65x
256B    1.85x   1.88x   1.42x   1.86x   1.93x   1.96x   1.69x   1.65x   1.70x   1.75x
1024B   1.88x   1.86x   1.45x   1.95x   1.96x   1.95x   1.77x   1.71x   1.77x   1.78x
8192B   1.91x   1.86x   1.42x   1.91x   2.03x   1.98x   1.73x   1.71x   1.78x   1.76x

camellia-asm vs aes-asm (8kB block):
         128bit  256bit
ecb-enc  1.15x   1.22x
ecb-dec  1.16x   1.16x
cbc-enc  0.85x   0.90x
cbc-dec  1.20x   1.23x
ctr-enc  1.28x   1.30x
ctr-dec  1.27x   1.28x
lrw-enc  1.12x   1.16x
lrw-dec  1.08x   1.10x
xts-enc  1.11x   1.15x
xts-dec  1.14x   1.15x

Intel Core2 T8100 (fam:6, model:23, step:6):

camellia-asm vs camellia_generic:
128bit key:                                             (lrw:256bit)    (xts:256bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     1.10x   1.12x   1.14x   1.16x   1.16x   1.15x   1.02x   1.02x   1.08x   1.08x
64B     1.61x   1.60x   1.17x   1.68x   1.67x   1.66x   1.43x   1.42x   1.44x   1.42x
256B    1.65x   1.73x   1.17x   1.77x   1.81x   1.80x   1.54x   1.53x   1.58x   1.54x
1024B   1.76x   1.74x   1.18x   1.80x   1.85x   1.85x   1.60x   1.59x   1.65x   1.60x
8192B   1.77x   1.75x   1.19x   1.81x   1.85x   1.86x   1.63x   1.61x   1.66x   1.62x

256bit key:                                             (lrw:384bit)    (xts:512bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     1.10x   1.07x   1.13x   1.16x   1.11x   1.16x   1.03x   1.02x   1.08x   1.07x
64B     1.61x   1.62x   1.15x   1.66x   1.63x   1.68x   1.47x   1.46x   1.47x   1.44x
256B    1.71x   1.70x   1.16x   1.75x   1.69x   1.79x   1.58x   1.57x   1.59x   1.55x
1024B   1.78x   1.72x   1.17x   1.75x   1.80x   1.80x   1.63x   1.62x   1.65x   1.62x
8192B   1.76x   1.73x   1.17x   1.78x   1.80x   1.81x   1.64x   1.62x   1.68x   1.64x

camellia-asm vs aes-asm (8kB block):
         128bit  256bit
ecb-enc  1.17x   1.21x
ecb-dec  1.17x   1.20x
cbc-enc  0.80x   0.82x
cbc-dec  1.22x   1.24x
ctr-enc  1.25x   1.26x
ctr-dec  1.25x   1.26x
lrw-enc  1.14x   1.18x
lrw-dec  1.13x   1.17x
xts-enc  1.14x   1.18x
xts-dec  1.14x   1.17x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-03-14 17:25:56 +08:00
..
boot doc: fix broken references 2011-09-27 18:08:04 +02:00
configs iommu: Rename the DMAR and INTR_REMAP config options 2011-09-21 10:22:03 +02:00
crypto crypto: camellia - add assembler implementation for x86_64 2012-03-14 17:25:56 +08:00
ia32 x86-64: Cleanup some assembly entry points 2011-12-05 17:24:43 +01:00
include/asm Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2012-01-10 22:01:27 -08:00
kernel Merge branch 'akpm' (aka "Andrew's patch-bomb") 2012-01-10 16:42:48 -08:00
kvm KVM: x86 emulator: implement RDPMC (0F 33) 2011-12-27 11:24:43 +02:00
lguest lguest: add export.h to lguest files for THIS_MODULE/EXPORT_SYMBOL 2011-10-31 19:32:13 -04:00
lib Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-01-06 13:59:14 -08:00
math-emu
mm Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-01-06 14:00:12 -08:00
net net: bpf_jit: fix an off-one bug in x86_64 cond jump target 2011-12-19 15:47:29 -05:00
oprofile Merge branch 'core' of git://amd64.org/linux/rric into perf/core 2011-11-15 11:05:18 +01:00
pci typo fixes: aera -> area, exntension -> extension 2011-12-09 15:22:07 +01:00
platform Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty 2012-01-09 12:09:24 -08:00
power x86: Fix files explicitly requiring export.h for EXPORT_SYMBOL/THIS_MODULE 2011-10-31 19:30:35 -04:00
tools x86/tools: Add decoded instruction dump mode 2011-12-05 14:53:23 +01:00
um fix braino in um patchset (mea culpa) 2011-11-21 12:10:21 -08:00
vdso Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2011-10-28 05:03:12 -07:00
video x86: fix up files really needing to include module.h 2011-10-31 19:30:36 -04:00
xen Merge branch 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen 2012-01-10 10:09:59 -08:00
.gitignore
Kbuild net: filter: Just In Time compiler for x86-64 2011-04-27 23:05:08 -07:00
Kconfig lib: use generic pci_iomap on all architectures 2012-01-10 18:04:27 -08:00
Kconfig.cpu x86: Add support for cmpxchg_double 2011-06-25 12:17:32 -07:00
Kconfig.debug doc: fix broken references 2011-09-27 18:08:04 +02:00
Makefile
Makefile_32.cpu x86, cpu: Move AMD Elan Kconfig under "Processor family" 2011-04-08 13:01:25 -07:00
Makefile.um um: take arch/um/sys-x86 to arch/x86/um 2011-11-02 14:15:05 +01:00