Convert the AESNI AVX and AESNI AVX2 implementations of Camellia from
the (deprecated) ablkcipher and blkcipher interfaces over to the
skcipher interface. Note that this includes replacing the use of
ablk_helper with crypto_simd.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Convert the x86 asm implementation of Camellia from the (deprecated)
blkcipher interface over to the skcipher interface.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The XTS template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic XTS code themselves via xts_crypt().
Remove the xts-camellia-asm algorithm which did this. Users who request
xts(camellia) and previously would have gotten xts-camellia-asm will now
get xts(ecb-camellia-asm) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The LRW template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic LRW code themselves via lrw_crypt().
Remove the lrw-camellia-asm algorithm which did this. Users who request
lrw(camellia) and previously would have gotten lrw-camellia-asm will now
get lrw(ecb-camellia-asm) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The LRW template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic LRW code themselves via lrw_crypt().
Remove the lrw-camellia-aesni-avx2 algorithm which did this. Users who
request lrw(camellia) and previously would have gotten
lrw-camellia-aesni-avx2 will now get lrw(ecb-camellia-aesni-avx2)
instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The LRW template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic LRW code themselves via lrw_crypt().
Remove the lrw-camellia-aesni algorithm which did this. Users who
request lrw(camellia) and previously would have gotten
lrw-camellia-aesni will now get lrw(ecb-camellia-aesni) instead, which
is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Convert the x86 asm implementation of Triple DES from the (deprecated)
blkcipher interface over to the skcipher interface.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Convert the x86 asm implementation of Blowfish from the (deprecated)
blkcipher interface over to the skcipher interface.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Convert the AVX implementation of CAST6 from the (deprecated) ablkcipher
and blkcipher interfaces over to the skcipher interface. Note that this
includes replacing the use of ablk_helper with crypto_simd.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The LRW template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic LRW code themselves via lrw_crypt().
Remove the lrw-cast6-avx algorithm which did this. Users who request
lrw(cast6) and previously would have gotten lrw-cast6-avx will now get
lrw(ecb-cast6-avx) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Convert the AVX implementation of CAST5 from the (deprecated) ablkcipher
and blkcipher interfaces over to the skcipher interface. Note that this
includes replacing the use of ablk_helper with crypto_simd.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
With ecb-cast5-avx, if a 128+ byte scatterlist element followed a
shorter one, then the algorithm accidentally encrypted/decrypted only 8
bytes instead of the expected 128 bytes. Fix it by setting the
encryption/decryption 'fn' correctly.
Fixes: c12ab20b16 ("crypto: cast5/avx - avoid using temporary stack buffers")
Cc: <stable@vger.kernel.org> # v3.8+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Convert the AVX implementation of Twofish from the (deprecated)
ablkcipher and blkcipher interfaces over to the skcipher interface.
Note that this includes replacing the use of ablk_helper with
crypto_simd.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The LRW template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic LRW code themselves via lrw_crypt().
Remove the lrw-twofish-avx algorithm which did this. Users who request
lrw(twofish) and previously would have gotten lrw-twofish-avx will now
get lrw(ecb-twofish-avx) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Convert the 3-way implementation of Twofish from the (deprecated)
blkcipher interface over to the skcipher interface.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The XTS template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic XTS code themselves via xts_crypt().
Remove the xts-twofish-3way algorithm which did this. Users who request
xts(twofish) and previously would have gotten xts-twofish-3way will now
get xts(ecb-twofish-3way) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The LRW template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic LRW code themselves via lrw_crypt().
Remove the lrw-twofish-3way algorithm which did this. Users who request
lrw(twofish) and previously would have gotten lrw-twofish-3way will now
get lrw(ecb-twofish-3way) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Convert the AVX and AVX2 implementations of Serpent from the
(deprecated) ablkcipher and blkcipher interfaces over to the skcipher
interface. Note that this includes replacing the use of ablk_helper
with crypto_simd.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The LRW template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic LRW code themselves via lrw_crypt().
Remove the lrw-serpent-avx algorithm which did this. Users who request
lrw(serpent) and previously would have gotten lrw-serpent-avx will now
get lrw(ecb-serpent-avx) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The LRW template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic LRW code themselves via lrw_crypt().
Remove the lrw-serpent-avx2 algorithm which did this. Users who request
lrw(serpent) and previously would have gotten lrw-serpent-avx2 will now
get lrw(ecb-serpent-avx2) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Convert the SSE2 implementation of Serpent from the (deprecated)
ablkcipher and blkcipher interfaces over to the skcipher interface.
Note that this includes replacing the use of ablk_helper with
crypto_simd.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The XTS template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic XTS code themselves via xts_crypt().
Remove the xts-serpent-sse2 algorithm which did this. Users who request
xts(serpent) and previously would have gotten xts-serpent-sse2 will now
get xts(ecb-serpent-sse2) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The LRW template now wraps an ECB mode algorithm rather than the block
cipher directly. Therefore it is now redundant for crypto modules to
wrap their ECB code with generic LRW code themselves via lrw_crypt().
Remove the lrw-serpent-sse2 algorithm which did this. Users who request
lrw(serpent) and previously would have gotten lrw-serpent-sse2 will now
get lrw(ecb-serpent-sse2) instead, which is just as fast.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add ECB, CBC, and CTR functions to glue_helper which use skcipher_walk
rather than blkcipher_walk. This will allow converting the remaining
x86 algorithms from the blkcipher interface over to the skcipher
interface, after which we'll be able to remove the blkcipher_walk
versions of these functions.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add an ARM NEON-accelerated implementation of Speck-XTS. It operates on
128-byte chunks at a time, i.e. 8 blocks for Speck128 or 16 blocks for
Speck64. Each 128-byte chunk goes through XTS preprocessing, then is
encrypted/decrypted (doing one cipher round for all the blocks, then the
next round, etc.), then goes through XTS postprocessing.
The performance depends on the processor but can be about 3 times faster
than the generic code. For example, on an ARMv7 processor we observe
the following performance with Speck128/256-XTS:
xts-speck128-neon: Encryption 107.9 MB/s, Decryption 108.1 MB/s
xts(speck128-generic): Encryption 32.1 MB/s, Decryption 36.6 MB/s
In comparison to AES-256-XTS without the Cryptography Extensions:
xts-aes-neonbs: Encryption 41.2 MB/s, Decryption 36.7 MB/s
xts(aes-asm): Encryption 31.7 MB/s, Decryption 30.8 MB/s
xts(aes-generic): Encryption 21.2 MB/s, Decryption 20.9 MB/s
Speck64/128-XTS is even faster:
xts-speck64-neon: Encryption 138.6 MB/s, Decryption 139.1 MB/s
Note that as with the generic code, only the Speck128 and Speck64
variants are supported. Also, for now only the XTS mode of operation is
supported, to target the disk and file encryption use cases. The NEON
code also only handles the portion of the data that is evenly divisible
into 128-byte chunks, with any remainder handled by a C fallback. Of
course, other modes of operation could be added later if needed, and/or
the NEON code could be updated to handle other buffer sizes.
The XTS specification is only defined for AES which has a 128-bit block
size, so for the GF(2^64) math needed for Speck64-XTS we use the
reducing polynomial 'x^64 + x^4 + x^3 + x + 1' given by the original XEX
paper. Of course, when possible users should use Speck128-XTS, but even
that may be too slow on some processors; Speck64-XTS can be faster.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add gcmaes_crypt_by_sg routine, that will do scatter/gather
by sg. Either src or dst may contain multiple buffers, so
iterate over both at the same time if they are different.
If the input is the same as the output, iterate only over one.
Currently both the AAD and TAG must be linear, so copy them out
with scatterlist_map_and_copy. If first buffer contains the
entire AAD, we can optimize and not copy. Since the AAD
can be any size, if copied it must be on the heap. TAG can
be on the stack since it is always < 16 bytes.
Only the SSE routines are updated so far, so leave the previous
gcmaes_en/decrypt routines, and branch to the sg ones if the
keysize is inappropriate for avx, or we are SSE only.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The asm macros are all set up now, introduce entry points.
GCM_INIT and GCM_COMPLETE have arguments supplied, so that
the new scatter/gather entry points don't have to take all the
arguments, and only the ones they need.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We can fast-path any < 16 byte read if the full message is > 16 bytes,
and shift over by the appropriate amount. Usually we are
reading > 16 bytes, so this should be faster than the READ_PARTIAL
macro introduced in b20209c91e for the average case.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Before this diff, multiple calls to GCM_ENC_DEC will
succeed, but only if all calls are a multiple of 16 bytes.
Handle partial blocks at the start of GCM_ENC_DEC, and update
aadhash as appropriate.
The data offset %r11 is also updated after the partial block.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
HashKey computation only needs to happen once per scatter/gather operation,
save it between calls in gcm_context struct instead of on the stack.
Since the asm no longer stores anything on the stack, we can use
%rsp directly, and clean up the frame save/restore macros a bit.
Hashkeys actually only need to be calculated once per key and could
be moved to when set_key is called, however, the current glue code
falls back to generic aes code if fpu is disabled.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Prepare to handle partial blocks between scatter/gather calls.
For the last partial block, we only want to calculate the aadhash
in GCM_COMPLETE, and a new partial block macro will handle both
aadhash update and encrypting partial blocks between calls.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fill in aadhash, aadlen, pblocklen, curcount with appropriate values.
pblocklen, aadhash, and pblockenckey are also updated at the end
of each scatter/gather operation, to be carried over to the next
operation.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
AAD hash only needs to be calculated once for each scatter/gather operation.
Move it to its own macro, and call it from GCM_INIT instead of
INITIAL_BLOCKS.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Introduce a gcm_context_data struct that will be used to pass
context data between scatter/gather update calls. It is passed
as the second argument (after crypto keys), other args are
renumbered.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Make a macro for the main encode/decode routine. Only a small handful
of lines differ for enc and dec. This will also become the main
scatter/gather update routine.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Merge encode and decode tag calculations in GCM_COMPLETE macro.
Scatter/gather routines will call this once at the end of encryption
or decryption.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reduce code duplication by introducting GCM_INIT macro. This macro
will also be exposed as a function for implementing scatter/gather
support, since INIT only needs to be called once for the full
operation.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Macro-ify function save and restore. These will be used in new functions
added for scatter/gather update operations.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use macro operations to merge implemetations of INITIAL_BLOCKS,
since they differ by only a small handful of lines.
Use macro counter \@ to simplify implementation.
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move the AES inverse S-box to the .rodata section
where it is safe from abuse by speculation.
Signed-off-by: Jinbum Park <jinb.park7@gmail.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- oversize stack frames on mn10300 in sha3-generic
- warning on old compilers in sha3-generic
- API error in sun4i_ss_prng
- potential dead-lock in sun4i_ss_prng
- null-pointer dereference in sha512-mb
- endless loop when DECO acquire fails in caam
- kernel oops when hashing empty message in talitos"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: sun4i_ss_prng - convert lock to _bh in sun4i_ss_prng_generate
crypto: sun4i_ss_prng - fix return value of sun4i_ss_prng_generate
crypto: caam - fix endless loop when DECO acquire fails
crypto: sha3-generic - Use __optimize to support old compilers
compiler-gcc.h: __nostackprotector needs gcc-4.4 and up
compiler-gcc.h: Introduce __optimize function attribute
crypto: sha3-generic - deal with oversize stack frames
crypto: talitos - fix Kernel Oops on hashing an empty file
crypto: sha512-mb - initialize pending lengths correctly
except, again, POLLFREE and POLL_BUSY_LOOP.
With this, we finally get to the promised end result:
- POLL{IN,OUT,...} are plain integers and *not* in __poll_t, so any
stray instances of ->poll() still using those will be caught by
sparse.
- eventpoll.c and select.c warning-free wrt __poll_t
- no more kernel-side definitions of POLL... - userland ones are
visible through the entire kernel (and used pretty much only for
mangle/demangle)
- same behavior as after the first series (i.e. sparc et.al. epoll(2)
working correctly).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:
for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
done
with de-mangling cleanups yet to come.
NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do. But they keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.
The next patch from Al will sort out the final differences, and we
should be all done.
Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nios2: defconfig: Cleanup from old Kconfig options
nios2: dts: Remove leading 0x and 0s from bindings notation
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJagGD7AAoJEFWoEK+e3syCyIgQALRklRmZIDK+ucnnwGhDZIWQ
Nip13tRFaFDmdwYymvZfv22b9dCAMINcoNY7yj6qCV2r1KBktQI/EmCL1QIj4PqS
uILFiQwGSj/YFapY5I32vDoHPKE5xP1UohJDsJh1mZbKWqzN5ydWh7sfOLscsWmT
4ybAMIKLt6xketsLeC/ygznjXeGUeUFKtQzwp8bcrCNsZkf9/BqwtnKLlrH2hurf
AdARsGEiH3aD/Tz2vTDvdSyMuqGORRGgDZVB+pevmdkUg5pyuAP+sfm/DB6lP8gB
beLsuikcDccdqnsMg57WrIAxcblc+S2fUfWzwNQUB9GyELO49vFvuVSD5Vp1xMtq
DWWiE4jhnCgpY13uKxRW01Ddo0u78PvdbojrxZ4iyBWAvlyhdaPXwXP0TwmhVl4A
tnIpRntPeP/0X5Htprd3SnXJFE5qRifbIDXTYbPG2QOFVIysvWjQuhN2rqi0PVJ5
duNL6uJ+Dt+NNwVamryyYuUsyrqU8CZtjLgI3P3m7xW9rscWgPZM4brJzYetq5mn
JwO2HwQBbNIcUSfCrFIaq1LEYKUOPY+glDXodxD519R0IWmJHBDTmbch+P92Xc5G
A2UINPNDj89qxcTEWJ+1qzheT3oRQAH1UaN9cIfwa1hPmjTPlhK2ltnOQ5atQJhU
fbf5dUlW16qqIf+DxK2K
=gek7
-----END PGP SIGNATURE-----
Merge tag 'nios2-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2
Pull nios2 update from Ley Foon Tan:
- clean up old Kconfig options from defconfig
- remove leading 0x and 0s from bindings notation in dts files
* tag 'nios2-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
nios2: defconfig: Cleanup from old Kconfig options
nios2: dts: Remove leading 0x and 0s from bindings notation
The commit 917538e212 ("kasan: clean up KASAN_SHADOW_SCALE_SHIFT
usage") removed KASAN_SHADOW_SCALE_SHIFT definition from
include/linux/kasan.h and added it to architecture-specific headers,
except for xtensa. This broke the xtensa build with KASAN enabled.
Define KASAN_SHADOW_SCALE_SHIFT in arch/xtensa/include/asm/kasan.h
Reported by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 917538e212 ("kasan: clean up KASAN_SHADOW_SCALE_SHIFT usage")
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Remove old, dead Kconfig option INET_LRO. It is gone since
commit 7bbf3cae65 ("ipv4: Remove inet_lro library").
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Ley Foon Tan <ley.foon.tan@intel.com>