The aesni_gcm_enc/dec functions can access memory before the start of
the data buffer if the length of the data buffer is less than 16 bytes.
This is because they perform the read via a single 16-byte load. This
can potentially result in accessing a page that is not mapped and thus
causing the machine to crash. This patch fixes that by reading the
partial block byte-by-byte and optionally an via 8-byte load if the block
was at least 8 bytes.
Fixes: 0487ccac ("crypto: aesni - make non-AVX AES-GCM work with any aadlen")
Cc: <stable@vger.kernel.org>
Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All hardware crypto devices have their CONFIG names using the following
convention:
CRYPTO_DEV_name_algo
This patch apply this conventions on STM32 CONFIG names.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Reviewed-by: Fabien Dessenne <fabien.dessenne@st.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There are no init and exit callbacks, so delete its comments.
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Offload split key generation in CAAM engine, using DKP.
DKP is supported starting with Era 6.
Note that the way assoclen is transmitted from the job descriptor
to the shared descriptor changes - DPOVRD register is used instead
of MATH3 (where available), since DKP protocol thrashes the MATH
registers.
The replacement of MDHA split key generation with DKP has the side
effect of the crypto engine writing the authentication key, and thus
the DMA mapping direction for the buffer holding the key has to change
from DMA_TO_DEVICE to DMA_BIDIRECTIONAL.
There are two cases:
-key is inlined in descriptor - descriptor buffer mapping changes
-key is referenced - key buffer mapping changes
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Save Era in driver's private data for further usage,
like deciding whether an erratum applies or a feature is available
based on its value.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
ablkcipher shared descriptors are relatively small, thus there is enough
space for the key to be inlined.
Accordingly, there is no need to copy the key in ctx->key.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Key data is not modified, it is copied in the shared descriptor.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Using %rbp as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.
In twofish-3way, we can't simply replace %rbp with another register
because there are none available. Instead, we use the stack to hold the
values that %rbp, %r11, and %r12 were holding previously. Each of these
values represents the half of the output from the previous Feistel round
that is being passed on unchanged to the following round. They are only
used once per round, when they are exchanged with %rax, %rbx, and %rcx.
As a result, we free up 3 registers (one per block) and can reassign
them so that %rbp is not used, and additionally %r14 and %r15 are not
used so they do not need to be saved/restored.
There may be a small overhead caused by replacing 'xchg REG, REG' with
the needed sequence 'mov MEM, REG; mov REG, MEM; mov REG, REG' once per
round. But, counterintuitively, when I tested "ctr-twofish-3way" on a
Haswell processor, the new version was actually about 2% faster.
(Perhaps 'xchg' is not as well optimized as plain moves.)
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The performance of some aead tfm providers is affected by
the amount of parallelism possible with the processing.
Introduce an async aead concurrent multiple buffer
processing speed test to be able to test performance of such
tfm providers.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The performance of some skcipher tfm providers is affected by
the amount of parallelism possible with the processing.
Introduce an async skcipher concurrent multiple buffer
processing speed test to be able to test performance of such
tfm providers.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The multi buffer concurrent requests ahash speed test only
supported the cycles mode. Add support for the so called
jiffies mode that test performance of bytes/sec.
We only add support for digest mode at the moment.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For multiple buffers speed tests, the number of buffers, or
requests, used actually sets the level of parallelism a tfm
provider may utilize to hide latency. The existing number
(of 8) is good for some software based providers but not
enough for many HW providers with deep FIFOs.
Add a module parameter that allows setting the number of
multiple buffers/requests used, leaving the default at 8.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The AEAD speed test pretended to support decryption, however that support
was broken as decryption requires a valid auth field which the test did
not provide.
Fix this by running the encryption path once with inout/output sgls
switched to calculate the auth field prior to performing decryption
speed tests.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The multi buffer ahash speed test was allocating multiple
buffers for use with the multiple outstanding requests
it was starting but never actually using them (except
to free them), instead using a different single statically
allocated buffer for all requests.
Fix this by actually using the allocated buffers for the test.
It is noted that it may seem tempting to instead remove the
allocation and free of the multiple buffers and leave things as
they are since this is a hash test where the input is read
only. However, after consideration I believe that multiple
buffers better reflect real life scenario with regard
to data cache and TLB behaviours etc.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commit 142a27f0a7 added support for a "best" RNG, and in doing so
introduced a hang from rmmod/modprobe -r when the last RNG on the list
was unloaded.
When the hwrng list is depleted, return the global variables to their
original state and decrement all references to the object.
Fixes: 142a27f0a7 ("hwrng: core - Reset user selected rng by writing "" to rng_current")
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Reviewed-by: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The Inside Secure SafeXcel driver was firstly designed to support the
EIP197 cryptographic engine which is an evolution (with much more
feature, better performances) of the EIP97 cryptographic engine. This
patch convert the Inside Secure SafeXcel driver to support both engines
(EIP97 + EIP197).
The main differences are the register offsets and the context
invalidation process which is EIP197 specific. This patch adds an
indirection on the register offsets and adds checks not to send any
invalidation request when driving the EIP97. A new compatible is added
as well to bind the driver from device trees.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The dequeueing function was putting back a request in the crypto queue
on failure (when not enough resources are available) which is not
perfect as the request will be handled much later. This patch updates
this logic by keeping a reference on the failed request to try
proceeding it later when enough resources are available.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch modifies the result handling logic to continue handling
results when the completed requests counter is full and not showing the
actual number of requests to handle.
Suggested-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patches moves the result request acknowledgment from a per request
process to acknowledging all the result requests handled at once.
Suggested-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Increase the ring size to handle more requests in parallel, while
keeping the batch size (for interrupt coalescing) to its previous value.
The ring size and batch size are now unlinked.
Suggested-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch updates the dequeueing logic to dequeue all requests at once.
Since we can have many requests in the queue, the interrupt coalescing
is kept so that the ring interrupt fires every EIP197_MAX_BATCH_SZ at
most.
To allow dequeueing all requests at once while still using reasonable
settings for the interrupt coalescing, the result handling function was
updated to setup the threshold interrupt when needed (i.e. when more
requests than EIP197_MAX_BATCH_SZ are in the queue). When using this
capability the ring is marked as busy so that the dequeue function
enqueue new requests without setting the threshold interrupt.
Suggested-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch moves the result handling from an IRQ handler to a threaded
IRQ handler, to improve the number of complete requests being handled at
once.
Suggested-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch moves the request dequeueing into a workqueue to improve the
coalescing of interrupts when sending requests to the engine; as the
engine is capable of having one single interrupt for n requests sent.
Using a workqueue allows to send more request at once.
Suggested-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The SafeXcel context isn't used in the cache invalidation function. This
cosmetic patch removes it (as well as from the function prototype in the
header file and when the function is called).
Signed-off-by: Ofer Heifetz <oferh@marvell.com>
[Antoine: commit message]
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The cipher direction can be different for requests within the same
transformation context. This patch moves the direction flag from the
context to the request scope.
Signed-off-by: Ofer Heifetz <oferh@marvell.com>
[Antoine: commit message]
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When initializing the IVs crypto_ahash_update() is called, which at some
point will call crypto_enqueue_request(). This function can return
-EBUSY when no resource is available and the request is queued. Since
this is a valid case, -EBUSY shouldn't be treated as an error.
Signed-off-by: Ofer Heifetz <oferh@marvell.com>
[Antoine: commit message]
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The check to know if an invalidation is needed (i.e. when the context
changes) is done even if the context does not exist yet. This happens
when first setting a key for ciphers and/or hmac operations.
This commits adds a check in the _setkey functions to only check if an
invalidation is needed when a context exists, as there is no need to
perform this check otherwise.
Signed-off-by: Ofer Heifetz <oferh@marvell.com>
[Antoine: commit message and added a comment and reworked one of the
checks]
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cosmetic patch adding a few comments to the ahash caching function to
understand easily what calculations are made in the functions; and how
the function is working.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch removes an useless memset in the ahash_export function, as
the zeroed buffer will be entirely overridden the next line.
Suggested-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cosmetic patch fixing one typo in one of the driver's comments.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cosmetic patch removing an extra empty line between header inclusions.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When I added generic-gcm-aes I didn't add a wrapper like the one
provided for rfc4106(gcm(aes)). We need to add a cryptd wrapper to fall
back on in case the FPU is not available, otherwise we might corrupt the
FPU state.
Fixes: cce2ea8d90 ("crypto: aesni - add generic gcm(aes)")
Cc: <stable@vger.kernel.org>
Reported-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
generic_gcmaes_decrypt needs to use generic_gcmaes_ctx, not
aesni_rfc4106_gcm_ctx. This is actually harmless because the fields in
struct generic_gcmaes_ctx share the layout of the same fields in
aesni_rfc4106_gcm_ctx.
Fixes: cce2ea8d90 ("crypto: aesni - add generic gcm(aes)")
Cc: <stable@vger.kernel.org>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch remove two unused variable and some dead "code" using it.
Fixes: 92932d03c2 ("crypto: seqiv - Remove AEAD compatibility code")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch remove two unused variable and some dead "code" using it.
Fixes: 66008d4230 ("crypto: echainiv - Remove AEAD compatibility code")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Hardware operations like reading random numbers and setting a seed need
to be conducted in a single thread. Therefore a mutex is required to
prevent multiple threads (processes) from accessing the hardware at the
same time.
The sequence of mutex_lock() and mutex_unlock() in the exynos_rng_reseed()
function enables switching between different threads waiting for the
driver to generate random numbers for them.
Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reseed PRNG after reading 65 kB of randomness. Although this may reduce
performance, in most cases the loss is not noticeable. Also the time
based threshold for reseeding is changed to one second. Reseeding is
performed whenever either limit is exceeded.
Reseeding of a PRNG does not increase entropy, but it helps preventing
backtracking the internal state of the device from its output sequence,
and hence, prevents potential attacker from predicting numbers to be
generated.
Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Reviewed-by: Stephan Mueller <smueller@chronox.de>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use memcpy_fromio() instead of custom exynos_rng_copy_random() function
to retrieve generated numbers from the registers of PRNG.
Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support for PRNG in Exynos5250+ SoCs.
Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The IV size should not include the 32 bit counter. Because we had the
IV size set as 16 the transform only worked when the IV input was zero
padded.
Fixes: a21eb94fc4 ("crypto: axis - add ARTPEC-6/7 crypto accelerator driver")
Signed-off-by: Lars Persson <larper@axis.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The comment in gf128mul_x8_ble() was copy-and-pasted from gf128mul.h and
makes no sense in the new context. Remove it.
Cc: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Passing the register value by reference here leads a large amount of stack being
used when CONFIG_KASAN is enabled:
drivers/crypto/qat/qat_common/qat_hal.c: In function 'qat_hal_exec_micro_inst.constprop':
drivers/crypto/qat/qat_common/qat_hal.c:963:1: error: the frame size of 1792 bytes is larger than 1536 bytes [-Werror=frame-larger-than=]
Changing the register-read function to return the value instead reduces the stack
size to around 800 bytes, most of which is for the 'savuwords' array. The function
now no longer returns an error code, but nothing ever evaluated that anyway.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patches update the SafeXcel driver to stop using the crypto
ahash_request result field for partial results (i.e. on updates).
Instead the driver local safexcel_ahash_req state field is used, and
only on final operations the ahash_request result buffer is updated.
Fixes: 1b44c5a60c ("crypto: inside-secure - add SafeXcel EIP197 crypto engine driver")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch makes use of the SKCIPHER_REQUEST_ON_STACK and
AHASH_REQUEST_ON_STACK helpers to allocate enough memory to contain both
the crypto request structures and their embedded context (__ctx).
Fixes: 1b44c5a60c ("crypto: inside-secure - add SafeXcel EIP197 crypto engine driver")
Suggested-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch frees the request private data even if its handling failed,
as it would never be freed otherwise.
Fixes: 1b44c5a60c ("crypto: inside-secure - add SafeXcel EIP197 crypto engine driver")
Suggested-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When an invalidation request is needed we currently override the context
.send and .handle_result helpers. This is wrong as under high load other
requests can already be queued and overriding the context helpers will
make them execute the wrong .send and .handle_result functions.
This commit fixes this by adding a needs_inv flag in the request to
choose the action to perform when sending requests or handling their
results. This flag will be set when needed (i.e. when the context flag
will be set).
Fixes: 1b44c5a60c ("crypto: inside-secure - add SafeXcel EIP197 crypto engine driver")
Signed-off-by: Ofer Heifetz <oferh@marvell.com>
[Antoine: commit message, and removed non related changes from the
original commit]
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Current MIPS64r6 toolchains aren't able to generate efficient
DMULU/DMUHU based code for the C implementation of umul_ppmm(), which
performs an unsigned 64 x 64 bit multiply and returns the upper and
lower 64-bit halves of the 128-bit result. Instead it widens the 64-bit
inputs to 128-bits and emits a __multi3 intrinsic call to perform a 128
x 128 multiply. This is both inefficient, and it results in a link error
since we don't include __multi3 in MIPS linux.
For example commit 90a53e4432 ("cfg80211: implement regdb signature
checking") merged in v4.15-rc1 recently broke the 64r6_defconfig and
64r6el_defconfig builds by indirectly selecting MPILIB. The same build
errors can be reproduced on older kernels by enabling e.g. CRYPTO_RSA:
lib/mpi/generic_mpih-mul1.o: In function `mpihelp_mul_1':
lib/mpi/generic_mpih-mul1.c:50: undefined reference to `__multi3'
lib/mpi/generic_mpih-mul2.o: In function `mpihelp_addmul_1':
lib/mpi/generic_mpih-mul2.c:49: undefined reference to `__multi3'
lib/mpi/generic_mpih-mul3.o: In function `mpihelp_submul_1':
lib/mpi/generic_mpih-mul3.c:49: undefined reference to `__multi3'
lib/mpi/mpih-div.o In function `mpihelp_divrem':
lib/mpi/mpih-div.c:205: undefined reference to `__multi3'
lib/mpi/mpih-div.c:142: undefined reference to `__multi3'
Therefore add an efficient MIPS64r6 implementation of umul_ppmm() using
inline assembly and the DMULU/DMUHU instructions, to prevent __multi3
calls being emitted.
Fixes: 7fd08ca58a ("MIPS: Add build support for the MIPS R6 ISA")
Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-mips@linux-mips.org
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since commit 499a66e6b6 ("crypto: null - Remove default null
blkcipher"), crypto_get_default_null_skcipher2() and
crypto_put_default_null_skcipher2() are the same as their non-2
equivalents. So switch callers of the "2" versions over to the original
versions and remove the "2" versions.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
crypto_larval_lookup() is not used outside of crypto/api.c, so unexport
it and mark it 'static'.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>