linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-27 07:25:14 +07:00

Author	SHA1	Message	Date
Ard Biesheuvel	4a70b52620	crypto: arm/chacha20 - remove cra_alignmask Remove the unnecessary alignmask: it is much more efficient to deal with the misalignment in the core algorithm than relying on the crypto API to copy the data to a suitably aligned buffer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-02-03 18:16:19 +08:00
Ard Biesheuvel	1465fb13d3	crypto: arm/aes-ce - remove cra_alignmask Remove the unnecessary alignmask: it is much more efficient to deal with the misalignment in the core algorithm than relying on the crypto API to copy the data to a suitably aligned buffer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-02-03 18:16:16 +08:00
Ard Biesheuvel	13954e788d	crypto: arm/aes-neonbs - fix issue with v2.22 and older assembler The GNU assembler for ARM version 2.22 or older fails to infer the element size from the vmov instructions, and aborts the build in the following way; .../aes-neonbs-core.S: Assembler messages: .../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1h[1],r10' .../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1h[0],r9' .../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1l[1],r8' .../aes-neonbs-core.S:817: Error: bad type for scalar -- `vmov q1l[0],r7' .../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2h[1],r10' .../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2h[0],r9' .../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2l[1],r8' .../aes-neonbs-core.S:818: Error: bad type for scalar -- `vmov q2l[0],r7' Fix this by setting the element size explicitly, by replacing vmov with vmov.32. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-01-23 22:50:25 +08:00
Ard Biesheuvel	658fa754cd	crypto: arm/aes - avoid reserved 'tt' mnemonic in asm code The ARMv8-M architecture introduces 'tt' and 'ttt' instructions, which means we can no longer use 'tt' as a register alias on recent versions of binutils for ARM. So replace the alias with 'ttab'. Fixes: `81edb42629` ("crypto: arm/aes - replace scalar AES cipher") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-01-13 18:47:21 +08:00
Ard Biesheuvel	cc477bf645	crypto: arm/aes - replace bit-sliced OpenSSL NEON code This replaces the unwieldy generated implementation of bit-sliced AES in CBC/CTR/XTS modes that originated in the OpenSSL project with a new version that is heavily based on the OpenSSL implementation, but has a number of advantages over the old version: - it does not rely on the scalar AES cipher that also originated in the OpenSSL project and contains redundant lookup tables and key schedule generation routines (which we already have in crypto/aes_generic.) - it uses the same expanded key schedule for encryption and decryption, reducing the size of the per-key data structure by 1696 bytes - it adds an implementation of AES in ECB mode, which can be wrapped by other generic chaining mode implementations - it moves the handling of corner cases that are non critical to performance to the glue layer written in C - it was written directly in assembler rather than generated from a Perl script Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-01-13 18:27:31 +08:00
Ard Biesheuvel	81edb42629	crypto: arm/aes - replace scalar AES cipher This replaces the scalar AES cipher that originates in the OpenSSL project with a new implementation that is ~15% () faster (on modern cores), and reuses the lookup tables and the key schedule generation routines from the generic C implementation (which is usually compiled in anyway due to networking and other subsystems depending on it). Note that the bit sliced NEON code for AES still depends on the scalar cipher that this patch replaces, so it is not removed entirely yet. On Cortex-A57, the performance increases from 17.0 to 14.9 cycles per byte for 128-bit keys. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-01-13 00:26:50 +08:00
Ard Biesheuvel	afaf712e99	crypto: arm/chacha20 - implement NEON version based on SSE3 code This is a straight port to ARM/NEON of the x86 SSE3 implementation of the ChaCha20 stream cipher. It uses the new skcipher walksize attribute to process the input in strides of 4x the block size. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-01-13 00:26:48 +08:00
Herbert Xu	5386e5d1f8	Revert "crypto: arm64/ARM: NEON accelerated ChaCha20" This patch reverts the following commits: `8621caa0d4` `8096667273` I should not have applied them because they had already been obsoleted by a subsequent patch series. They also cause a build failure because of the subsequent commit `9ae433bc79`. Fixes: `9ae433bc79` ("crypto: chacha20 - convert generic and...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-12-28 17:39:26 +08:00
Ard Biesheuvel	8096667273	crypto: arm/chacha20 - implement NEON version based on SSE3 code This is a straight port to ARM/NEON of the x86 SSE3 implementation of the ChaCha20 stream cipher. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-12-27 17:47:29 +08:00
Ard Biesheuvel	d0a3431a7b	crypto: arm/crc32 - accelerated support based on x86 SSE implementation This is a combination of the the Intel algorithm implemented using SSE and PCLMULQDQ instructions from arch/x86/crypto/crc32-pclmul_asm.S, and the new CRC32 extensions introduced for both 32-bit and 64-bit ARM in version 8 of the architecture. Two versions of the above combo are provided, one for CRC32 and one for CRC32C. The PMULL/NEON algorithm is faster, but operates on blocks of at least 64 bytes, and on multiples of 16 bytes only. For the remaining input, or for all input on systems that lack the PMULL 64x64->128 instructions, the CRC32 instructions will be used. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-12-07 20:01:24 +08:00
Ard Biesheuvel	1d481f1cd8	crypto: arm/crct10dif - port x86 SSE implementation to ARM This is a transliteration of the Intel algorithm implemented using SSE and PCLMULQDQ instructions that resides in the file arch/x86/crypto/crct10dif-pcl-asm_64.S, but simplified to only operate on buffers that are 16 byte aligned (but of any size) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-12-07 20:01:21 +08:00
Herbert Xu	efad2b61ae	crypto: aes-ce - Make aes_simd_algs static The variable aes_simd_algs should be static. In fact if it isn't it causes build errors when multiple copies of aes-ce-glue.c are built into the kernel. Fixes: `da40e7a4ba` ("crypto: aes-ce - Convert to skcipher") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-12-01 21:06:44 +08:00
Ard Biesheuvel	81126d1a8b	crypto: arm/aesbs - fix brokenness after skcipher conversion The CBC encryption routine should use the encryption round keys, not the decryption round keys. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-11-30 20:01:51 +08:00
Herbert Xu	6fdf436fd8	crypto: arm/aes - Add missing SIMD select for aesbs This patch adds one more missing SIMD select for AES_ARM_BS. It also changes selects on ALGAPI to BLKCIPHER. Fixes: `211f41af53` ("crypto: aesbs - Convert to skcipher") Reported-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-11-30 20:01:40 +08:00
Herbert Xu	585b5fa63d	crypto: arm/aes - Select SIMD in Kconfig The skcipher conversion for ARM missed the select on CRYPTO_SIMD, causing build failures if SIMD was not otherwise enabled. Fixes: `da40e7a4ba` ("crypto: aes-ce - Convert to skcipher") Fixes: `211f41af53` ("crypto: aesbs - Convert to skcipher") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-11-29 16:11:14 +08:00
Herbert Xu	211f41af53	crypto: aesbs - Convert to skcipher This patch converts aesbs over to the skcipher interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-11-28 21:23:21 +08:00
Herbert Xu	da40e7a4ba	crypto: aes-ce - Convert to skcipher This patch converts aes-ce over to the skcipher interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-11-28 21:23:21 +08:00
Ard Biesheuvel	58010fa6f7	crypto: arm/aes-ce - fix for big endian The AES key schedule generation is mostly endian agnostic, with the exception of the rotation and the incorporation of the round constant at the start of each round. So implement a big endian specific version of that part to make the whole routine big endian compatible. Fixes: `86464859cc` ("crypto: arm - AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-10-21 11:03:46 +08:00
Herbert Xu	c3afafa478	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Merge the crypto tree to pull in vmx ghash fix.	2016-10-10 11:19:47 +08:00
Ard Biesheuvel	f82e90b286	crypto: arm/aes-ctr - fix NULL dereference in tail processing The AES-CTR glue code avoids calling into the blkcipher API for the tail portion of the walk, by comparing the remainder of walk.nbytes modulo AES_BLOCK_SIZE with the residual nbytes, and jumping straight into the tail processing block if they are equal. This tail processing block checks whether nbytes != 0, and does nothing otherwise. However, in case of an allocation failure in the blkcipher layer, we may enter this code with walk.nbytes == 0, while nbytes > 0. In this case, we should not dereference the source and destination pointers, since they may be NULL. So instead of checking for nbytes != 0, check for (walk.nbytes % AES_BLOCK_SIZE) != 0, which implies the former in non-error conditions. Fixes: `86464859cc` ("crypto: arm - AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions") Cc: stable@vger.kernel.org Reported-by: xiakaixu <xiakaixu@huawei.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-09-13 18:44:59 +08:00
Ard Biesheuvel	71f89917ff	crypto: arm/ghash - change internal cra_name to "__ghash" The fact that the internal synchrous hash implementation is called "ghash" like the publicly visible one is causing the testmgr code to misidentify it as an algorithm that requires testing at boottime. So rename it to "__ghash" to prevent this. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-09-07 21:10:19 +08:00
Ard Biesheuvel	ed4767d612	crypto: arm/ghash-ce - add missing async import/export Since commit `8996eafdcb` ("crypto: ahash - ensure statesize is non-zero"), all ahash drivers are required to implement import()/export(), and must have a non-zero statesize. Fix this for the ARM Crypto Extensions GHASH implementation. Fixes: `8996eafdcb` ("crypto: ahash - ensure statesize is non-zero") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-09-07 21:08:30 +08:00
Ard Biesheuvel	2117eaa62a	crypto: arm/sha1-neon - add support for building in Thumb2 mode The ARMv7 NEON module is explicitly built in ARM mode, which is not supported by the Thumb2 kernel. So remove the explicit override, and leave it up to the build environment to decide whether the core SHA1 routines are assembled as ARM or as Thumb2 code. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-09-07 21:08:29 +08:00
Herbert Xu	820573ebd6	crypto: ghash-ce - Fix cryptd reordering This patch fixes an old bug where requests can be reordered because some are processed by cryptd while others are processed directly in softirq context. The fix is to always postpone to cryptd if there are currently requests outstanding from the same tfm. This patch also removes the redundant use of cryptd in the async init function as init never touches the FPU. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-06-23 18:29:54 +08:00
Linus Torvalds	70477371dc	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "Here is the crypto update for 4.6: API: - Convert remaining crypto_hash users to shash or ahash, also convert blkcipher/ablkcipher users to skcipher. - Remove crypto_hash interface. - Remove crypto_pcomp interface. - Add crypto engine for async cipher drivers. - Add akcipher documentation. - Add skcipher documentation. Algorithms: - Rename crypto/crc32 to avoid name clash with lib/crc32. - Fix bug in keywrap where we zero the wrong pointer. Drivers: - Support T5/M5, T7/M7 SPARC CPUs in n2 hwrng driver. - Add PIC32 hwrng driver. - Support BCM6368 in bcm63xx hwrng driver. - Pack structs for 32-bit compat users in qat. - Use crypto engine in omap-aes. - Add support for sama5d2x SoCs in atmel-sha. - Make atmel-sha available again. - Make sahara hashing available again. - Make ccp hashing available again. - Make sha1-mb available again. - Add support for multiple devices in ccp. - Improve DMA performance in caam. - Add hashing support to rockchip" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits) crypto: qat - remove redundant arbiter configuration crypto: ux500 - fix checks of error code returned by devm_ioremap_resource() crypto: atmel - fix checks of error code returned by devm_ioremap_resource() crypto: qat - Change the definition of icp_qat_uof_regtype hwrng: exynos - use __maybe_unused to hide pm functions crypto: ccp - Add abstraction for device-specific calls crypto: ccp - CCP versioning support crypto: ccp - Support for multiple CCPs crypto: ccp - Remove check for x86 family and model crypto: ccp - memset request context to zero during import lib/mpi: use "static inline" instead of "extern inline" lib/mpi: avoid assembler warning hwrng: bcm63xx - fix non device tree compatibility crypto: testmgr - allow rfc3686 aes-ctr variants in fips mode. crypto: qat - The AE id should be less than the maximal AE number lib/mpi: Endianness fix crypto: rockchip - add hash support for crypto engine in rk3288 crypto: xts - fix compile errors crypto: doc - add skcipher API documentation crypto: doc - update AEAD AD handling ...	2016-03-17 11:22:54 -07:00
Stephan Mueller	49abc0d2e1	crypto: xts - fix compile errors Commit `28856a9e52` missed the addition of the crypto/xts.h include file for different architecture-specific AES implementations. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-02-17 15:53:02 +08:00
Stephan Mueller	28856a9e52	crypto: xts - consolidate sanity check for keys The patch centralizes the XTS key check logic into the service function xts_check_key which is invoked from the different XTS implementations. With this, the XTS implementations in ARM, ARM64, PPC and S390 have now a sanity check for the XTS keys similar to the other arches. In addition, this service function received a check to ensure that the key != the tweak key which is mandated by FIPS 140-2 IG A.9. As the check is not present in the standards defining XTS, it is only enforced in FIPS mode of the kernel. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-02-17 04:07:51 +08:00
Jeremy Linton	bee038a4bd	arm/arm64: crypto: assure that ECB modes don't require an IV ECB modes don't use an initialization vector. The kernel /proc/crypto interface doesn't reflect this properly. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2016-02-15 15:48:29 +00:00
Baruch Siach	4d666dbefc	crypto: arm - ignore generated SHA2 assembly files These files are generated since commits `f2f770d74a` (crypto: arm/sha256 - Add optimized SHA-256/224, 2015-04-03) and `c80ae7ca37` (crypto: arm/sha512 - accelerated SHA-512 using ARM generic ASM and NEON, 2015-05-08). Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-07-06 16:32:03 +08:00
Ard Biesheuvel	6499e8cfaa	crypto: arm/aes - streamline AES-192 code path This trims off a couple of instructions of the total size of the core AES transform by reordering the final branch in the AES-192 code path with the rounds that are performed regardless of whether the branch is taken or not. Other than the slight size reduction, this has no performance benefit. Fix up a comment regarding the prototype of this function while we're at it. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-05-11 15:08:01 +08:00
Ard Biesheuvel	c80ae7ca37	crypto: arm/sha512 - accelerated SHA-512 using ARM generic ASM and NEON This replaces the SHA-512 NEON module with the faster and more versatile implementation from the OpenSSL project. It consists of both a NEON and a generic ASM version of the core SHA-512 transform, where the NEON version reverts to the ASM version when invoked in non-process context. This patch is based on the OpenSSL upstream version b1a5d1c65208 of sha512-armv4.pl, which can be found here: https://git.openssl.org/gitweb/?p=openssl.git;h=b1a5d1c65208 Performance relative to the generic implementation (measured using tcrypt.ko mode=306 sec=1 running on a Cortex-A57 under KVM): input size block size asm neon old neon 16 16 1.39 2.54 2.21 64 16 1.32 2.33 2.09 64 64 1.38 2.53 2.19 256 16 1.31 2.28 2.06 256 64 1.38 2.54 2.25 256 256 1.40 2.77 2.39 1024 16 1.29 2.22 2.01 1024 256 1.40 2.82 2.45 1024 1024 1.41 2.93 2.53 2048 16 1.33 2.21 2.00 2048 256 1.40 2.84 2.46 2048 1024 1.41 2.96 2.55 2048 2048 1.41 2.98 2.56 4096 16 1.34 2.20 1.99 4096 256 1.40 2.84 2.46 4096 1024 1.41 2.97 2.56 4096 4096 1.41 3.01 2.58 8192 16 1.34 2.19 1.99 8192 256 1.40 2.85 2.47 8192 1024 1.41 2.98 2.56 8192 4096 1.41 2.71 2.59 8192 8192 1.51 3.51 2.69 Acked-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-05-11 15:08:01 +08:00
Linus Torvalds	cb906953d2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "Here is the crypto update for 4.1: New interfaces: - user-space interface for AEAD - user-space interface for RNG (i.e., pseudo RNG) New hashes: - ARMv8 SHA1/256 - ARMv8 AES - ARMv8 GHASH - ARM assembler and NEON SHA256 - MIPS OCTEON SHA1/256/512 - MIPS img-hash SHA1/256 and MD5 - Power 8 VMX AES/CBC/CTR/GHASH - PPC assembler AES, SHA1/256 and MD5 - Broadcom IPROC RNG driver Cleanups/fixes: - prevent internal helper algos from being exposed to user-space - merge common code from assembly/C SHA implementations - misc fixes" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (169 commits) crypto: arm - workaround for building with old binutils crypto: arm/sha256 - avoid sha256 code on ARMv7-M crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer crypto: x86/sha256_ssse3 - move SHA-224/256 SSSE3 implementation to base layer crypto: x86/sha1_ssse3 - move SHA-1 SSSE3 implementation to base layer crypto: arm64/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer crypto: arm64/sha1-ce - move SHA-1 ARMv8 implementation to base layer crypto: arm/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer crypto: arm/sha256 - move SHA-224/256 ASM/NEON implementation to base layer crypto: arm/sha1-ce - move SHA-1 ARMv8 implementation to base layer crypto: arm/sha1_neon - move SHA-1 NEON implementation to base layer crypto: arm/sha1 - move SHA-1 ARM asm implementation to base layer crypto: sha512-generic - move to generic glue implementation crypto: sha256-generic - move to generic glue implementation crypto: sha1-generic - move to generic glue implementation crypto: sha512 - implement base layer for SHA-512 crypto: sha256 - implement base layer for SHA-256 crypto: sha1 - implement base layer for SHA-1 crypto: api - remove instance when test failed crypto: api - Move alg ref count init to crypto_check_alg ...	2015-04-15 10:42:15 -07:00
Ard Biesheuvel	3abafaf219	crypto: arm - workaround for building with old binutils Old versions of binutils (before 2.23) do not yet understand the crypto-neon-fp-armv8 fpu instructions, and an attempt to build these files results in a build failure: arch/arm/crypto/aes-ce-core.S:133: Error: selected processor does not support ARM mode `vld1.8 {q10-q11},[ip]!' arch/arm/crypto/aes-ce-core.S:133: Error: bad instruction `aese.8 q0,q8' arch/arm/crypto/aes-ce-core.S:133: Error: bad instruction `aesmc.8 q0,q0' arch/arm/crypto/aes-ce-core.S:133: Error: bad instruction `aese.8 q0,q9' arch/arm/crypto/aes-ce-core.S:133: Error: bad instruction `aesmc.8 q0,q0' Since the affected versions are still in widespread use, and this breaks 'allmodconfig' builds, we should try to at least get a successful kernel build. Unfortunately, I could not come up with a way to make the Kconfig symbol depend on the binutils version, which would be the nicest solution. Instead, this patch uses the 'as-instr' Kbuild macro to find out whether the support is present in the assembler, and otherwise emits a non-fatal warning indicating which selected modules could not be built. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: http://storage.kernelci.org/next/next-20150410/arm-allmodconfig/build.log Fixes: `864cbeed4a` ("crypto: arm - add support for SHA1 using ARMv8 Crypto Instructions") [ard.biesheuvel: - omit modules entirely instead of building empty ones if binutils is too old - update commit log accordingly] Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-04-13 12:09:11 +08:00
Arnd Bergmann	b48321def4	crypto: arm/sha256 - avoid sha256 code on ARMv7-M The sha256 assembly implementation can deal with all architecture levels from ARMv4 to ARMv7-A, but not with ARMv7-M. Enabling it in an ARMv7-M kernel results in this build failure: arm-linux-gnueabi-ld: error: arch/arm/crypto/sha256_glue.o: Conflicting architecture profiles M/A arm-linux-gnueabi-ld: failed to merge target specific data of file arch/arm/crypto/sha256_glue.o This adds a Kconfig dependency to prevent the code from being disabled for ARMv7-M. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-04-13 12:07:13 +08:00
Ard Biesheuvel	9205b94923	crypto: arm/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer This removes all the boilerplate from the existing implementation, and replaces it with calls into the base layer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-04-10 21:39:45 +08:00
Ard Biesheuvel	b59e2ae369	crypto: arm/sha256 - move SHA-224/256 ASM/NEON implementation to base layer This removes all the boilerplate from the existing implementation, and replaces it with calls into the base layer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-04-10 21:39:44 +08:00
Ard Biesheuvel	dde00981e6	crypto: arm/sha1-ce - move SHA-1 ARMv8 implementation to base layer This removes all the boilerplate from the existing implementation, and replaces it with calls into the base layer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-04-10 21:39:44 +08:00
Ard Biesheuvel	51e515faa8	crypto: arm/sha1_neon - move SHA-1 NEON implementation to base layer This removes all the boilerplate from the existing implementation, and replaces it with calls into the base layer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-04-10 21:39:43 +08:00
Ard Biesheuvel	90451d6bdb	crypto: arm/sha1 - move SHA-1 ARM asm implementation to base layer This removes all the boilerplate from the existing implementation, and replaces it with calls into the base layer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-04-10 21:39:42 +08:00
Sami Tolvanen	f2f770d74a	crypto: arm/sha256 - Add optimized SHA-256/224 Add Andy Polyakov's optimized assembly and NEON implementations for SHA-256/224. The sha256-armv4.pl script for generating the assembly code is from OpenSSL commit 51f8d095562f36cdaa6893597b5c609e943b0565. Compared to sha256-generic these implementations have the following tcrypt speed improvements on Motorola Nexus 6 (Snapdragon 805): bs b/u sha256-neon sha256-asm 16 16 x1.32 x1.19 64 16 x1.27 x1.15 64 64 x1.36 x1.20 256 16 x1.22 x1.11 256 64 x1.36 x1.19 256 256 x1.59 x1.23 1024 16 x1.21 x1.10 1024 256 x1.65 x1.23 1024 1024 x1.76 x1.25 2048 16 x1.21 x1.10 2048 256 x1.66 x1.23 2048 1024 x1.78 x1.25 2048 2048 x1.79 x1.25 4096 16 x1.20 x1.09 4096 256 x1.66 x1.23 4096 1024 x1.79 x1.26 4096 4096 x1.82 x1.26 8192 16 x1.20 x1.09 8192 256 x1.67 x1.23 8192 1024 x1.80 x1.26 8192 4096 x1.85 x1.28 8192 8192 x1.85 x1.27 Where bs refers to block size and b/u to bytes per update. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Cc: Andy Polyakov <appro@openssl.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-04-03 18:03:40 +08:00
Stephan Mueller	94a7e5e8d8	crypto: aes-ce - mark ARMv8 AES helper ciphers Flag all ARMv8 AES helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-03-31 21:21:12 +08:00
Stephan Mueller	76aa9d5f2c	crypto: aesbs - mark NEON bit sliced AES helper ciphers Flag all NEON bit sliced AES helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-03-31 21:21:11 +08:00
Stephan Mueller	4b3f4e37ec	crypto: ghash-ce - mark GHASH ARMv8 vmull.p64 helper ciphers Flag all GHASH ARMv8 vmull.p64 helper ciphers as internal ciphers to prevent them from being called by normal users. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-03-31 21:21:07 +08:00
Ard Biesheuvel	fa50d7ee45	crypto: arm/ghash - fix big-endian bug in ghash This fixes a bug in the new v8 Crypto Extensions GHASH code that only manifests itself in big-endian mode. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-03-24 22:24:56 +11:00
Ard Biesheuvel	f1e866b106	crypto: arm - add support for GHASH using ARMv8 Crypto Extensions This implements the GHASH hash algorithm (as used by the GCM AEAD chaining mode) using the AArch32 version of the 64x64 to 128 bit polynomial multiplication instruction (vmull.p64) that is part of the ARMv8 Crypto Extensions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-03-12 21:13:36 +11:00
Ard Biesheuvel	86464859cc	crypto: arm - AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions This implements the ECB, CBC, CTR and XTS asynchronous block ciphers using the AArch32 versions of the ARMv8 Crypto Extensions for AES. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-03-12 21:13:36 +11:00
Ard Biesheuvel	006d0624fa	crypto: arm - add support for SHA-224/256 using ARMv8 Crypto Extensions This implements the SHA-224/256 secure hash algorithm using the AArch32 versions of the ARMv8 Crypto Extensions for SHA2. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-03-12 21:13:35 +11:00
Ard Biesheuvel	864cbeed4a	crypto: arm - add support for SHA1 using ARMv8 Crypto Instructions This implements the SHA1 secure hash algorithm using the AArch32 versions of the ARMv8 Crypto Extensions for SHA1. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-03-12 21:13:35 +11:00
Ard Biesheuvel	652ccae5cc	crypto: arm - move ARM specific Kconfig definitions to a dedicated file This moves all Kconfig symbols defined in crypto/Kconfig that depend on CONFIG_ARM to a dedicated Kconfig file in arch/arm/crypto, which is where the code that implements those features resides as well. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-03-12 21:13:35 +11:00
Ard Biesheuvel	001eabfd54	crypto: arm/aes update NEON AES module to latest OpenSSL version This updates the bit sliced AES module to the latest version in the upstream OpenSSL repository (e620e5ae37bc). This is needed to fix a bug in the XTS decryption path, where data chunked in a certain way could trigger the ciphertext stealing code, which is not supposed to be active in the kernel build (The kernel implementation of XTS only supports round multiples of the AES block size of 16 bytes, whereas the conformant OpenSSL implementation of XTS supports inputs of arbitrary size by applying ciphertext stealing). This is fixed in the upstream version by adding the missing #ifndef XTS_CHAIN_TWEAK around the offending instructions. The upstream code also contains the change applied by Russell to build the code unconditionally, i.e., even if __LINUX_ARM_ARCH__ < 7, but implemented slightly differently. Cc: stable@vger.kernel.org Fixes: `e4e7f10bfc` ("ARM: add support for bit sliced AES using NEON instructions") Reported-by: Adrian Kotelba <adrian.kotelba@gmail.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2015-03-02 23:18:26 +13:00
Julia Lawall	f43c239407	crypto: arm - replace memset by memzero_explicit Memset on a local variable may be removed when it is called just before the variable goes out of scope. Using memzero_explicit defeats this optimization. A simplified version of the semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier x; type T; @@ { ... when any T x[...]; ... when any when exists - memset + memzero_explicit (x, -0, ...) ... when != x when strict } // </smpl> This change was suggested by Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-12-02 22:55:51 +08:00
Kees Cook	5d26a105b5	crypto: prefix module autoloading with "crypto-" This prefixes all crypto module loading with "crypto-" so we never run the risk of exposing module auto-loading to userspace via a crypto API, as demonstrated by Mathias Krause: https://lkml.org/lkml/2013/3/4/70 Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-11-24 22:43:57 +08:00
Ard Biesheuvel	0777e3e172	ARM: 8125/1: crypto: enable NEON SHA-1 for big endian This tweaks the SHA-1 NEON code slightly so it works correctly under big endian, and removes the Kconfig condition preventing it from being selected if CONFIG_CPU_BIG_ENDIAN is set. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-08-27 15:44:11 +01:00
Linus Torvalds	c489d98c8c	Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm Pull ARM updates from Russell King: "Included in this update: - perf updates from Will Deacon: The main changes are callchain stability fixes from Jean Pihet and event mapping and PMU name rework from Mark Rutland The latter is preparatory work for enabling some code re-use with arm64 in the future. - updates for nommu from Uwe Kleine-König: Two different fixes for the same problem making some ARM nommu configurations not boot since 3.6-rc1. The problem is that user_addr_max returned the biggest available RAM address which makes some copy_from_user variants fail to read from XIP memory. - deprecate legacy OMAP DMA API, in preparation for it's removal. The popular drivers have been converted over, leaving a very small number of rarely used drivers, which hopefully can be converted during the next cycle with a bit more visibility (and hopefully people popping out of the woodwork to help test) - more tweaks for BE systems, particularly with the kernel image format. In connection with this, I've cleaned up the way we generate the linker script for the decompressor. - removal of hard-coded assumptions of the kernel stack size, making everywhere depend on the value of THREAD_SIZE_ORDER. - MCPM updates from Nicolas Pitre. - Make it easier for proper CPU part number checks (which should always include the vendor field). - Assembly code optimisation - use the "bx" instruction when returning from a function on ARMv6+ rather than "mov pc, reg". - Save the last kernel misaligned fault location and report it via the procfs alignment file. - Clean up the way we create the initial stack frame, which is a repeated pattern in several different locations. - Support for 8-byte get_user(), needed for some DRM implementations. - mcs locking from Will Deacon. - Save and restore a few more Cortex-A9 registers (for errata workarounds) - Fix various aspects of the SWP emulation, and the ELF hwcap for the SWP instruction. - Update LPAE logic for pte_write and pmd_write to make it more correct. - Support for Broadcom Brahma15 CPU cores. - ARM assembly crypto updates from Ard Biesheuvel" * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (53 commits) ARM: add comments to the early page table remap code ARM: 8122/1: smp_scu: enable SCU standby support ARM: 8121/1: smp_scu: use macro for SCU enable bit ARM: 8120/1: crypto: sha512: add ARM NEON implementation ARM: 8119/1: crypto: sha1: add ARM NEON implementation ARM: 8118/1: crypto: sha1/make use of common SHA-1 structures ARM: 8113/1: remove remaining definitions of PLAT_PHYS_OFFSET from <mach/memory.h> ARM: 8111/1: Enable erratum 798181 for Broadcom Brahma-B15 ARM: 8110/1: do CPU-specific init for Broadcom Brahma15 cores ARM: 8109/1: mm: Modify pte_write and pmd_write logic for LPAE ARM: 8108/1: mm: Introduce {pte,pmd}_isset and {pte,pmd}_isclear ARM: hwcap: disable HWCAP_SWP if the CPU advertises it has exclusives ARM: SWP emulation: only initialise on ARMv7 CPUs ARM: SWP emulation: always enable when SMP is enabled ARM: 8103/1: save/restore Cortex-A9 CP15 registers on suspend/resume ARM: 8098/1: mcs lock: implement wfe-based polling for MCS locking ARM: 8091/2: add get_user() support for 8 byte types ARM: 8097/1: unistd.h: relocate comments back to place ARM: 8096/1: Describe required sort order for textofs-y (TEXT_OFFSET) ARM: 8090/1: add revision info for PL310 errata 588369 and 727915 ...	2014-08-05 10:05:29 -07:00
Jussi Kivilinna	c8611d712a	ARM: 8120/1: crypto: sha512: add ARM NEON implementation This patch adds ARM NEON assembly implementation of SHA-512 and SHA-384 algorithms. tcrypt benchmark results on Cortex-A8, sha512-generic vs sha512-neon-asm: block-size bytes/update old-vs-new 16 16 2.99x 64 16 2.67x 64 64 3.00x 256 16 2.64x 256 64 3.06x 256 256 3.33x 1024 16 2.53x 1024 256 3.39x 1024 1024 3.52x 2048 16 2.50x 2048 256 3.41x 2048 1024 3.54x 2048 2048 3.57x 4096 16 2.49x 4096 256 3.42x 4096 1024 3.56x 4096 4096 3.59x 8192 16 2.48x 8192 256 3.42x 8192 1024 3.56x 8192 4096 3.60x 8192 8192 3.60x Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-08-02 08:51:50 +01:00
Jussi Kivilinna	604682551a	ARM: 8119/1: crypto: sha1: add ARM NEON implementation This patch adds ARM NEON assembly implementation of SHA-1 algorithm. tcrypt benchmark results on Cortex-A8, sha1-arm-asm vs sha1-neon-asm: block-size bytes/update old-vs-new 16 16 1.04x 64 16 1.02x 64 64 1.05x 256 16 1.03x 256 64 1.04x 256 256 1.30x 1024 16 1.03x 1024 256 1.36x 1024 1024 1.52x 2048 16 1.03x 2048 256 1.39x 2048 1024 1.55x 2048 2048 1.59x 4096 16 1.03x 4096 256 1.40x 4096 1024 1.57x 4096 4096 1.62x 8192 16 1.03x 8192 256 1.40x 8192 1024 1.58x 8192 4096 1.63x 8192 8192 1.63x Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-08-02 08:51:47 +01:00
Jussi Kivilinna	1f8673d31a	ARM: 8118/1: crypto: sha1/make use of common SHA-1 structures Common SHA-1 structures are defined in <crypto/sha.h> for code sharing. This patch changes SHA-1/ARM glue code to use these structures. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-08-02 08:51:46 +01:00
Mikulas Patocka	f3c400ef47	crypto: arm-aes - fix encryption of unaligned data Fix the same alignment bug as in arm64 - we need to pass residue unprocessed bytes as the last argument to blkcipher_walk_done. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # 3.13+ Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-07-28 22:01:03 +08:00
Russell King	6ebbf2ce43	ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+ ARMv6 and greater introduced a new instruction ("bx") which can be used to return from function calls. Recent CPUs perform better when the "bx lr" instruction is used rather than the "mov pc, lr" instruction, and this sequence is strongly recommended to be used by the ARM architecture manual (section A.4.1.1). We provide a new macro "ret" with all its variants for the condition code which will resolve to the appropriate instruction. Rather than doing this piecemeal, and miss some instances, change all the "mov pc" instances to use the new macro, with the exception of the "movs" instruction and the kprobes code. This allows us to detect the "mov pc, lr" case and fix it up - and also gives us the possibility of deploying this for other registers depending on the CPU selection. Reported-by: Will Deacon <will.deacon@arm.com> Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1 Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood Tested-by: Shawn Guo <shawn.guo@freescale.com> Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385 Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-07-18 12:29:04 +01:00
Russell King	d2eca20d77	CRYPTO: Fix more AES build errors Building a multi-arch kernel results in: arch/arm/crypto/built-in.o: In function `aesbs_xts_decrypt': sha1_glue.c:(.text+0x15c8): undefined reference to `bsaes_xts_decrypt' arch/arm/crypto/built-in.o: In function `aesbs_xts_encrypt': sha1_glue.c:(.text+0x1664): undefined reference to `bsaes_xts_encrypt' arch/arm/crypto/built-in.o: In function `aesbs_ctr_encrypt': sha1_glue.c:(.text+0x184c): undefined reference to `bsaes_ctr32_encrypt_blocks' arch/arm/crypto/built-in.o: In function `aesbs_cbc_decrypt': sha1_glue.c:(.text+0x19b4): undefined reference to `bsaes_cbc_encrypt' This code is already runtime-conditional on NEON being supported, so there's no point compiling it out depending on the minimum build architecture. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-01-05 13:59:56 +00:00
Russell King	75c912a367	ARM: add .gitignore entry for aesbs-core.S This avoids this file being incorrectly added to git. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-10-07 15:43:53 +01:00
Ard Biesheuvel	e4e7f10bfc	ARM: add support for bit sliced AES using NEON instructions Bit sliced AES gives around 45% speedup on Cortex-A15 for encryption and around 25% for decryption. This implementation of the AES algorithm does not rely on any lookup tables so it is believed to be invulnerable to cache timing attacks. This algorithm processes up to 8 blocks in parallel in constant time. This means that it is not usable by chaining modes that are strictly sequential in nature, such as CBC encryption. CBC decryption, however, can benefit from this implementation and runs about 25% faster. The other chaining modes implemented in this module, XTS and CTR, can execute fully in parallel in both directions. The core code has been adopted from the OpenSSL project (in collaboration with the original author, on cc). For ease of maintenance, this version is identical to the upstream OpenSSL code, i.e., all modifications that were required to make it suitable for inclusion into the kernel have been made upstream. The original can be found here: http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=6f6a6130 Note to integrators: While this implementation is significantly faster than the existing table based ones (generic or ARM asm), especially in CTR mode, the effects on power efficiency are unclear as of yet. This code does fundamentally more work, by calculating values that the table based code obtains by a simple lookup; only by doing all of that work in a SIMD fashion, it manages to perform better. Cc: Andy Polyakov <appro@openssl.org> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>	2013-10-04 20:48:38 +02:00
Ard Biesheuvel	5ce26f3b5a	ARM: move AES typedefs and function prototypes to separate header Put the struct definitions for AES keys and the asm function prototypes in a separate header and export the asm functions from the module. This allows other drivers to use them directly. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>	2013-10-04 09:26:54 +02:00
Ard Biesheuvel	40190c85f4	ARM: 7837/3: fix Thumb-2 bug in AES assembler code Patch `638591c` enabled building the AES assembler code in Thumb2 mode. However, this code used arithmetic involving PC rather than adr{l} instructions to generate PC-relative references to the lookup tables, and this needs to take into account the different PC offset when running in Thumb mode. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Nicolas Pitre <nico@linaro.org> Cc: stable@vger.kernel.org Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-09-22 11:43:38 +01:00
Ard Biesheuvel	934fc24df1	ARM: 7723/1: crypto: sha1-armv4-large.S: fix SP handling Make the SHA1 asm code ABI conformant by making sure all stack accesses occur above the stack pointer. Origin: http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=1a9d60d2 Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Nicolas Pitre <nico@linaro.org> Cc: stable@vger.kernel.org Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-05-22 22:01:35 +01:00
Dave Martin	638591cd7b	ARM: 7626/1: arm/crypto: Make asm SHA-1 and AES code Thumb-2 compatible This patch fixes aes-armv4.S and sha1-armv4-large.S to work natively in Thumb. This allows ARM/Thumb interworking workarounds to be removed. I also take the opportunity to convert some explicit assembler directives for exported functions to the standard ENTRY()/ENDPROC(). For the code itself: * In sha1_block_data_order, use of TEQ with sp is deprecated in ARMv7 and not supported in Thumb. For the branches back to .L_00_15 and .L_40_59, the TEQ is converted to a CMP, under the assumption that clobbering the C flag here will not cause incorrect behaviour. For the first branch back to .L_20_39_or_60_79 the C flag is important, so sp is moved temporarily into another register so that TEQ can be used for the comparison. * In the AES code, most forms of register-indexed addressing with shifts and rotates are not permitted for loads and stores in Thumb, so the address calculation is done using a separate instruction for the Thumb case. The resulting code is unlikely to be optimally scheduled, but it should not have a large impact given the overall size of the code. I haven't run any benchmarks. Signed-off-by: Dave Martin <dave.martin@linaro.org> Tested-by: David McCullough <ucdevel@gmail.com> (ARM only) Acked-by: David McCullough <ucdevel@gmail.com> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-01-13 12:41:22 +00:00
David McCullough	f0be44f4fb	arm/crypto: Add optimized AES and SHA1 routines Add assembler versions of AES and SHA1 for ARM platforms. This has provided up to a 50% improvement in IPsec/TCP throughout for tunnels using AES128/SHA1. Platform CPU SPeed Endian Before (bps) After (bps) Improvement IXP425 533 MHz big 11217042 15566294 ~38% KS8695 166 MHz little 3828549 5795373 ~51% Signed-off-by: David McCullough <ucdevel@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2012-09-07 04:17:02 +08:00

1 2 3 4

167 Commits