Fix a memory leak in an error path in uc loader.
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The crypto engine must be initialized before registering algorithms,
otherwise the test manager will crash as it attempts to execute
tests for the algos while they are being registered.
Fixes: f1b77aaca8 ("crypto: omap-des - Integrate with the crypto engine framework")
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The crypto engine must be initialized before registering algorithms,
otherwise the test manager will crash as it attempts to execute
tests for the algos while they are being registered.
Fixes: 0529900a01 ("crypto: omap-aes - Support crypto engine framework")
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As setting up the DMA operations is quite costly, add software fallback
support for requests smaller than 200 bytes. This change gives some 10%
extra performance in ipsec use case.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
[t-kristo@ti.com: udpated against latest upstream, to use skcipher mainly]
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some SoCs like omap4/omap5/dra7 contain multiple AES crypto accelerator
cores. Adapt the driver to support this. The driver picks the last used
device from a list of AES devices.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
[t-kristo@ti.com: forward ported to 4.7 kernel]
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Calling runtime PM API at the cra_init/exit is bad for power management
purposes, as the lifetime for a CRA can be very long. Instead, use
pm_runtime autosuspend approach for handling the device clocks. Clocks
are enabled when they are actually required, and autosuspend disables
these if they have not been used for a sufficiently long time period.
By default, the timeout value is 1 second.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If software fallback is used on older hardware accelerator setup (OMAP2/
OMAP3), the first block of data must be purged from the buffer. The
first block contains the pre-generated ipad value required by the HW,
but the software fallback algorithm generates its own, causing wrong
results.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If we have processed any data with the hardware accelerator (digcnt > 0),
we must complete the entire hash by using it. This is because the current
hash value can't be imported to the software fallback algorithm. Otherwise
we end up with wrong hash results.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Some of the call paths of OMAP SHA driver can avoid executing the next
step of the crypto queue under tasklet; instead, execute the next step
directly via function call. This avoids a costly round-trip via the
scheduler giving a slight performance boost.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Conflicts:
drivers/net/ethernet/mediatek/mtk_eth_soc.c
drivers/net/ethernet/qlogic/qed/qed_dcbx.c
drivers/net/phy/Kconfig
All conflicts were cases of overlapping commits.
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers should not use NO_IRQ, as we are trying to get rid of that.
In this case, the call to irq_of_parse_and_map() is both wrong
(as it returns '0' on failure, not NO_IRQ) and unnecessary
(as platform_get_irq() does the same thing)
This removes the call to irq_of_parse_and_map() and checks for
the error code correctly.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
ccp_dmaengine_register used to return with an error code before
releasing all resource. This patch adds a jump to the appropriate label
ensuring that the resources are properly released before returning.
This issue was found with Hector.
Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-nonce is being loaded using append_load_imm_u32() instead of
append_load_as_imm() (nonce is a byte array / stream, not a 4-byte
variable)
-counter is not being added in big endian format, as mandatated by
RFC3686 and expected by the crypto engine
Signed-off-by: Catalin Vasile <cata.vasile@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The current crypto engine allow only ablkcipher_request to be enqueued.
Thus denying any use of it for hardware that also handle hash algo.
This patch modify the API for allowing to enqueue ciphers and hash.
Since omap-aes/omap-des are the only users, this patch also convert them
to the new cryptoengine API.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch move the whole crypto engine API to its own header
crypto/engine.h.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix incorrect value of ADF_C3XXX_ACCELERATORS_MASK.
Signed-off-by: Maksim Lukoshkov <maksim.lukoshkov@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Copy const_tab array into DMA-able memory (accesible by qat hw).
Signed-off-by: Maksim Lukoshkov <maksim.lukoshkov@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We get 1 warning when biuld kernel with W=1:
drivers/crypto/caam/ctrl.c:398:5: warning: no previous prototype for 'caam_get_era' [-Wmissing-prototypes]
In fact, this function is declared in drivers/crypto/caam/ctrl.h,
so this patch add missing header dependencies.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For algorithms that implement IV generators before the crypto ops,
the IV needed for decryption is initially located in req->src
scatterlist, not in req->iv.
Avoid copying the IV into req->iv by modifying the (givdecrypt)
descriptors to load it directly from req->src.
aead_givdecrypt() is no longer needed and goes away.
Cc: <stable@vger.kernel.org> # 4.3+
Fixes: 479bcc7c5b ("crypto: caam - Convert authenc to new AEAD interface")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fixes the following sparse warning:
drivers/crypto/chelsio/chcr_algo.c:593:5: warning:
symbol 'cxgb4_is_crypto_q_full' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
If devm_add_action() fails we are explicitly calling the cleanup to free
the resources allocated. Lets use the helper devm_add_action_or_reset()
and return directly in case of error, as we know that the cleanup function
has been already called by the helper if there was any error.
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
clk_prepare_enable() may fail, so we should better check its return
value and propagate it in the case of failure.
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add the missing unlock before return from function sun4i_hash()
in the error handling case.
Fixes: 477d9b2e59 ("crypto: sun4i-ss - unify update/final function")
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Acked-by: Corentin LABBE <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
walk.iv is not assigned a value in blkcipher_walk_init. It makes iv uninitialized.
It is possibly a null value(as shown below), which is then used by aes_p8_encrypt.
This patch moves iv = walk.iv after blkcipher_walk_virt, in which walk.iv is set.
[17856.268050] Unable to handle kernel paging request for data at address 0x00000000
[17856.268212] Faulting instruction address: 0xd000000002ff04bc
7:mon> t
[link register ] d000000002ff47b8 p8_aes_xts_crypt+0x168/0x2a0 [vmx_crypto] (938)
[c000000013b77960] d000000002ff4794 p8_aes_xts_crypt+0x144/0x2a0 [vmx_crypto] (unreliable)
[c000000013b77a70] c000000000544d64 skcipher_decrypt_blkcipher+0x64/0x80
[c000000013b77ac0] d000000003c0175c crypt_convert+0x53c/0x620 [dm_crypt]
[c000000013b77ba0] d000000003c043fc kcryptd_crypt+0x3cc/0x440 [dm_crypt]
[c000000013b77c50] c0000000000f3070 process_one_work+0x1e0/0x590
[c000000013b77ce0] c0000000000f34c8 worker_thread+0xa8/0x660
[c000000013b77d80] c0000000000fc0b0 kthread+0x110/0x130
[c000000013b77e30] c0000000000098f0 ret_from_kernel_thread+0x5c/0x6c
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Increase value of supported key sizes for qat_aes_xts.
aes-xts keys consists of keys of equal size concatenated.
Fixes: def14bfaf3 ("crypto: qat - add support for ctr(aes) and xts(aes)")
Cc: stable@vger.kernel.org
Reported-by: Wenqian Yu <wenqian.yu@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Adds the config entry for the Chelsio Crypto Driver, Makefile changes
for the same.
Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Chelsio's Crypto Hardware can perform the following operations:
SHA1, SHA224, SHA256, SHA384 and SHA512, HMAC(SHA1), HMAC(SHA224),
HMAC(SHA256), HMAC(SHA384), HAMC(SHA512), AES-128-CBC, AES-192-CBC,
AES-256-CBC, AES-128-XTS, AES-256-XTS
This patch implements the driver for above mentioned features. This
driver is an Upper Layer Driver which is attached to Chelsio's LLD
(cxgb4) and uses the queue allocated by the LLD for sending the crypto
requests to the Hardware and receiving the responses from it.
The crypto operations can be performed by Chelsio's hardware from the
userspace applications and/or from within the kernel space using the
kernel's crypto API.
The above mentioned crypto features have been tested using kernel's
tests mentioned in testmgr.h. They also have been tested from user
space using libkcapi and Openssl.
Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warning:
drivers/crypto/ccp/ccp-dev.c:62:14: warning:
symbol 'ccp_increment_unit_ordinal' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Two crypto alg are badly indented, this patch fix this style issue.
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The dev *ss is stored both in sun4i_tfm_ctx and sun4i_req_ctx.
Since this pointer will never be changed during tfm life, it is better
to remove it from sun4i_req_ctx.
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Two words are badly spelled, this patch respell them.
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The ss variable is never used, remove it.
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The update and final functions have lots of common action.
This patch mix them in one function.
This will give some improvements:
- This will permit asynchronous support more easily
- This will permit to use finup/digest functions with some performance
improvements
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The variable i is always checked against unsigned value and cannot be
negative.
This patch set it as unsigned.
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Don't use 64 'as is', as max block size in mv_cesa_ahash_cache_req. Use
CESA_MAX_HASH_BLOCK_SIZE instead, this is better for readability.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, in mv_cesa_{md5,sha1,sha256}_init creq->state is initialized
before the call to mv_cesa_ahash_init. This is wrong because this
function fills creq with zero by using memset, so its 'state' that
contains the default DIGEST is overwritten. This commit fixes the issue
by initializing creq->state just after the call to mv_cesa_ahash_init.
Fixes: commit b0ef51067c ("crypto: marvell/cesa - initialize hash...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
So far, sub part of mv_cesa_int was responsible of dequeuing complete
requests, then call the 'cleanup' operation on these reqs and call the
crypto api callback 'complete'. The problem is that the transformation
context 'ctx' is retrieved only once before the while loop. Which means
that the wrong 'cleanup' operation might be called on the wrong type of
cesa requests, it can lead to memory corruptions with this message:
marvell-cesa f1090000.crypto: dma_pool_free cesa_padding, 5a5a5a5a/5a5a5a5a (bad dma)
This commit fixes the issue, by updating the transformation context for
each dequeued cesa request.
Fixes: commit 85030c5168 ("crypto: marvell - Add support for chai...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The mv_cesa_ahash_cache_req() function always returns 0, which makes
its return value pretty much useless. However, in addition to
returning a useless value, it also returns a boolean in a variable
passed by reference to indicate if the request was already cached.
So, this commit changes mv_cesa_ahash_cache_req() to return this
boolean. It consequently simplifies the only call site of
mv_cesa_ahash_cache_req(), where the "ret" variable is no longer
needed.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The mv_cesa_ahash_init() function always returns 0, and the return
value is anyway never checked. Turn it into a function returning void.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The dma_iter parameter of mv_cesa_ahash_dma_add_cache() is never used,
so get rid of it.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The mv_cesa_dma_add_op() function builds a mv_cesa_tdma_desc structure
to copy the operation description to the SRAM, but doesn't explicitly
initialize the destination of the copy. It works fine because the
operatin description must be copied at the beginning of the SRAM, and
the mv_cesa_tdma_desc structure is initialized to zero when
allocated. However, it is somewhat confusing to not have a destination
defined.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Threaded interrupts can perform the function of the tasklet, and much
more safely too - without races when trying to take the tasklet and
interrupt down on device removal.
With the old code, there is a window where we call tasklet_kill(). If
the interrupt handler happens to be running on a different CPU, and
subsequently calls tasklet_schedule(), the tasklet will be re-scheduled
for execution.
Switching to a hardirq/threadirq combination implementation avoids this,
and it also means generic code deals with the teardown sequencing of the
threaded and non-threaded parts.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a helper to map the source scatterlist into the descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a helper function to perform the descriptor allocation.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Strictly, dma_map_sg() may coalesce SG entries, but in practise on iMX
hardware, this will never happen. However, dma_map_sg() can fail, and
we completely fail to check its return value. So, fix this properly.
Arrange the code to map the scatterlist early, so we know how many
scatter table entries to allocate, and then fill them in. This allows
us to keep relatively simple error cleanup paths.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ensure that we clean up allocations and DMA mappings after encountering
an error rather than just giving up and leaking memory and resources.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since the extended descriptor includes the hardware descriptor, and the
sec4 scatterlist immediately follows this, we can declare it as a array
at the very end of the extended descriptor. This allows us to get rid
of an initialiser for every site where we allocate an extended
descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Mark the hardware descriptor as being cache line aligned; on DMA
incoherent architectures, the hardware descriptor should sit in a
separate cache line from the CPU accessed data to avoid polluting
the caches.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Rather than giving the descriptor as hw_desc[0], give it's real size.
All places where we allocate an ahash_edesc incorporate DESC_JOB_IO_LEN
bytes of job descriptor.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
caamhash contains this weird code:
src_nents = sg_count(req->src, req->nbytes);
dma_map_sg(jrdev, req->src, src_nents ? : 1, DMA_TO_DEVICE);
...
edesc->src_nents = src_nents;
sg_count() returns zero when sg_nents_for_len() returns zero or one.
This means we don't need to use a hardware scatterlist. However,
setting src_nents to zero causes problems when we unmap:
if (edesc->src_nents)
dma_unmap_sg_chained(dev, req->src, edesc->src_nents,
DMA_TO_DEVICE, edesc->chained);
as zero here means that we have no entries to unmap. This causes us
to leak DMA mappings, where we map one scatterlist entry and then
fail to unmap it.
This can be fixed in two ways: either by writing the number of entries
that were requested of dma_map_sg(), or by reworking the "no SG
required" case.
We adopt the re-work solution here - we replace sg_count() with
sg_nents_for_len(), so src_nents now contains the real number of
scatterlist entries, and we then change the test for using the
hardware scatterlist to src_nents > 1 rather than just non-zero.
This change passes my sshd, openssl tests hashing /bin and tcrypt
tests.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Properly allocate enough memory to respect the fallback.
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the probe function only emits an output on success
when debug is specifically enabled. It would be more useful
if this happens by default.
Signed-off-by: James Hartley <james.hartley@imgtec.com>
Reviewed-by: Will Thomas <will.thomas@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the img-hash accelerator does not probe
successfully due to a change in the checks made during
registration with the crypto framework. This is due to
import and export functions not being defined. Correct
this.
Signed-off-by: James Hartley <james.hartley@imgtec.com>
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Current img hash claims sys and periph gate clocks
and this can be gated in system suspend scenarios.
Add support for Device pm ops for img hash to gate
the clocks claimed by img hash.
Signed-off-by: Govindraj Raja <Govindraj.Raja@imgtec.com>
Reviewed-by: Will Thomas <will.thomas@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Burst length of 16 drives the hash accelerator out of spec
and causes stability issues in some cases. Reduce this to
stop data being lost.
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Reviewed-by: James Hartley <james.hartley@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move 0 length buffer to end of structure to stop overwriting
fallback request data. This doesn't cause a bug itself as the
buffer is never used alongside the fallback but should be
changed.
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Reviewed-by: James Hartley <james.hartley@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Sporadic null pointer exceptions came from here. Fix them.
Signed-off-by: Will Thomas <will.thomas@imgtec.com>
Reviewed-by: James Hartley <james.hartley@imgtec.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
A second CCP is available, identical to the first, with
its ownn PCI ID. Make it available for use by the crypto
subsystem, as well as for DMA activity and random
number generation.
This device is not pre-configured at at boot time. The
driver must configure it (during the probe) for use.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Every CCP is capable of providing general DMA services.
Register the device as a provider.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Enable equivalent function on a v5 CCP. Add support for a
version 5 CCP which enables AES/XTS/SHA services. Also,
more work on the data structures to virtualize
functionality.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Available queue space is used to decide (by counting free slots)
if we have to put a command on hold or if it can be sent
to the engine immediately.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Make the RNG support code common (where possible) in
preparation for adding a v5 device.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move the KSB access/management functions to the v3
device file, and add function pointers to the actions
structure. At the operations layer all of the references
to the storage block will be generic (virtual). This is
in preparation for a version 5 device, in which the
private storage block is managed differently.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Form and use of the local storage block in the CCP is
particular to the device version. Much of the code that
accesses the storage block can treat it as a virtual
resource, and will under go some renaming. Device-specific
access to the memory will be moved into device file.
Service functions will be added to the actions
structure.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use more concise field names; "perform_" is too verbose.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Device-specific values for the BAR and offset should be found
in the version data structure.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Most error branches following the call to npe_request contain a call to
npe_request. This patch add a call to npe_release to error branches
following the call to npe_request that do not have it.
This issue was found with Hector.
Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Since 6de62f15b5 ("crypto: algif_hash - Require setkey before
accept(2)"), the AF_ALG interface requires userspace to provide a key
to any algorithm that has a setkey method. However, the non-HMAC
algorithms are not keyed, so setting a key is unnecessary.
Fix this by removing the setkey method from the non-keyed hash
algorithms.
Fixes: 6de62f15b5 ("crypto: algif_hash - Require setkey before accept(2)")
Cc: <stable@vger.kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
To be able to generate shared descriptors for AEAD, the authentication size
needs to be known. However, there is no imposed order of calling .setkey,
.setauthsize callbacks.
Thus, in case authentication size is not known at .setkey time, defer it
until .setauthsize is called.
The authsize != 0 check was incorrectly removed when converting the driver
to the new AEAD interface.
Cc: <stable@vger.kernel.org> # 4.3+
Fixes: 479bcc7c5b ("crypto: caam - Convert authenc to new AEAD interface")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There are a few things missed by the conversion to the
new AEAD interface:
1 - echainiv(authenc) encrypt shared descriptor
The shared descriptor is incorrect: due to the order of operations,
at some point in time MATH3 register is being overwritten.
2 - buffer used for echainiv(authenc) encrypt shared descriptor
Encrypt and givencrypt shared descriptors (for AEAD ops) are mutually
exclusive and thus use the same buffer in context state: sh_desc_enc.
However, there's one place missed by s/sh_desc_givenc/sh_desc_enc,
leading to errors when echainiv(authenc(...)) algorithms are used:
DECO: desc idx 14: Header Error. Invalid length or parity, or
certain other problems.
While here, also fix a typo: dma_mapping_error() is checking
for validity of sh_desc_givenc_dma instead of sh_desc_enc_dma.
Cc: <stable@vger.kernel.org> # 4.3+
Fixes: 479bcc7c5b ("crypto: caam - Convert authenc to new AEAD interface")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto fixes from Herbert Xu:
"This fixes a number of regressions in the marvell cesa driver caused
by the chaining work, and a regression in lib/mpi that leads to a
GFP_KERNEL allocation with preemption disabled"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: marvell - Don't copy IV vectors from the _process op for ciphers
lib/mpi: Fix SG miter leak
crypto: marvell - Update cache with input sg only when it is unmapped
crypto: marvell - Don't chain at DMA level when backlog is disabled
crypto: marvell - Fix memory leaks in TDMA chain for cipher requests
Highlights:
- PowerNV PCI hotplug support.
- Lots more Power9 support.
- eBPF JIT support on ppc64le.
- Lots of cxl updates.
- Boot code consolidation.
Bug fixes:
- Fix spin_unlock_wait() from Boqun Feng
- Fix stack pointer corruption in __tm_recheckpoint() from Michael Neuling
- Fix multiple bugs in memory_hotplug_max() from Bharata B Rao
- mm: Ensure "special" zones are empty from Oliver O'Halloran
- ftrace: Separate the heuristics for checking call sites from Michael Ellerman
- modules: Never restore r2 for a mprofile-kernel style mcount() call from Michael Ellerman
- Fix endianness when reading TCEs from Alexey Kardashevskiy
- start rtasd before PCI probing from Greg Kurz
- PCI: rpaphp: Fix slot registration for multiple slots under a PHB from Tyrel Datwyler
- powerpc/mm: Add memory barrier in __hugepte_alloc() from Sukadev Bhattiprolu
Cleanups & fixes:
- Drop support for MPIC in pseries from Rashmica Gupta
- Define and use PPC64_ELF_ABI_v2/v1 from Michael Ellerman
- Remove unused symbols in asm-offsets.c from Rashmica Gupta
- Fix SRIOV not building without EEH enabled from Russell Currey
- Remove kretprobe_trampoline_holder. from Thiago Jung Bauermann
- Reduce log level of PCI I/O space warning from Benjamin Herrenschmidt
- Add array bounds checking to crash_shutdown_handlers from Suraj Jitindar Singh
- Avoid -maltivec when using clang integrated assembler from Anton Blanchard
- Fix array overrun in ppc_rtas() syscall from Andrew Donnellan
- Fix error return value in cmm_mem_going_offline() from Rasmus Villemoes
- export cpu_to_core_id() from Mauricio Faria de Oliveira
- Remove old symbols from defconfigs from Andrew Donnellan
- Update obsolete comments in setup_32.c about entry conditions from Benjamin Herrenschmidt
- Add comment explaining the purpose of setup_kdump_trampoline() from Benjamin Herrenschmidt
- Merge the RELOCATABLE config entries for ppc32 and ppc64 from Kevin Hao
- Remove RELOCATABLE_PPC32 from Kevin Hao
- Fix .long's in tlb-radix.c to more meaningful from Balbir Singh
Minor cleanups & fixes:
- Andrew Donnellan, Anna-Maria Gleixner, Anton Blanchard, Benjamin
Herrenschmidt, Bharata B Rao, Christophe Leroy, Colin Ian King, Geliang
Tang, Greg Kurz, Madhavan Srinivasan, Michael Ellerman, Michael Ellerman,
Stephen Rothwell, Stewart Smith.
Freescale updates from Scott:
- "Highlights include more 8xx optimizations, device tree updates,
and MVME7100 support."
PowerNV PCI hotplug from Gavin Shan:
- PCI: Add pcibios_setup_bridge()
- Override pcibios_setup_bridge()
- Remove PCI_RESET_DELAY_US
- Move pnv_pci_ioda_setup_opal_tce_kill() around
- Increase PE# capacity
- Allocate PE# in reverse order
- Create PEs in pcibios_setup_bridge()
- Setup PE for root bus
- Extend PCI bridge resources
- Make pnv_ioda_deconfigure_pe() visible
- Dynamically release PE
- Update bridge windows on PCI plug
- Delay populating pdn
- Support PCI slot ID
- Use PCI slot reset infrastructure
- Introduce pnv_pci_get_slot_id()
- Functions to get/set PCI slot state
- PCI/hotplug: PowerPC PowerNV PCI hotplug driver
- Print correct PHB type names
Power9 idle support from Shreyas B. Prabhu:
- set power_save func after the idle states are initialized
- Use PNV_THREAD_WINKLE macro while requesting for winkle
- make hypervisor state restore a function
- Rename idle_power7.S to idle_book3s.S
- Rename reusable idle functions to hardware agnostic names
- Make pnv_powersave_common more generic
- abstraction for saving SPRs before entering deep idle states
- Add platform support for stop instruction
- cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
- cpuidle/powernv: cleanup cpuidle-powernv.c
- cpuidle/powernv: Add support for POWER ISA v3 idle states
- Use deepest stop state when cpu is offlined
Power9 PMU from Madhavan Srinivasan:
- factor out power8 pmu macros and defines
- factor out power8 pmu functions
- factor out power8 __init_pmu code
- Add power9 event list macros for generic and cache events
- Power9 PMU support
- Export Power9 generic and cache events to sysfs
Power9 preliminary interrupt & PCI support from Benjamin Herrenschmidt:
- Add XICS emulation APIs
- Move a few exception common handlers to make room
- Add support for HV virtualization interrupts
- Add mechanism to force a replay of interrupts
- Add ICP OPAL backend
- Discover IODA3 PHBs
- pci: Remove obsolete SW invalidate
- opal: Add real mode call wrappers
- Rename TCE invalidation calls
- Remove SWINV constants and obsolete TCE code
- Rework accessing the TCE invalidate register
- Fallback to OPAL for TCE invalidations
- Use the device-tree to get available range of M64's
- Check status of a PHB before using it
- pci: Don't try to allocate resources that will be reassigned
Other Power9:
- Send SIGBUS on unaligned copy and paste from Chris Smart
- Large Decrementer support from Oliver O'Halloran
- Load Monitor Register Support from Jack Miller
Performance improvements from Anton Blanchard:
- Avoid load hit store in __giveup_fpu() and __giveup_altivec()
- Avoid load hit store in setup_sigcontext()
- Remove assembly versions of strcpy, strcat, strlen and strcmp
- Align hot loops of some string functions
eBPF JIT from Naveen N. Rao:
- Fix/enhance 32-bit Load Immediate implementation
- Optimize 64-bit Immediate loads
- Introduce rotate immediate instructions
- A few cleanups
- Isolate classic BPF JIT specifics into a separate header
- Implement JIT compiler for extended BPF
Operator Panel driver from Suraj Jitindar Singh:
- devicetree/bindings: Add binding for operator panel on FSP machines
- Add inline function to get rc from an ASYNC_COMP opal_msg
- Add driver for operator panel on FSP machines
Sparse fixes from Daniel Axtens:
- make some things static
- Introduce asm-prototypes.h
- Include headers containing prototypes
- Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
- kvm: Clarify __user annotations
- Pass endianness to sparse
- Make ppc_md.{halt, restart} __noreturn
MM fixes & cleanups from Aneesh Kumar K.V:
- radix: Update LPCR HR bit as per ISA
- use _raw variant of page table accessors
- Compile out radix related functions if RADIX_MMU is disabled
- Clear top 16 bits of va only on older cpus
- Print formation regarding the the MMU mode
- hash: Update SDR1 size encoding as documented in ISA 3.0
- radix: Update PID switch sequence
- radix: Update machine call back to support new HCALL.
- radix: Add LPID based tlb flush helpers
- radix: Add a kernel command line to disable radix
- Cleanup LPCR defines
Boot code consolidation from Benjamin Herrenschmidt:
- Move epapr_paravirt_early_init() to early_init_devtree()
- cell: Don't use flat device-tree after boot
- ge_imp3a: Don't use the flat device-tree after boot
- mpc85xx_ds: Don't use the flat device-tree after boot
- mpc85xx_rdb: Don't use the flat device-tree after boot
- Don't test for machine type in rtas_initialize()
- Don't test for machine type in smp_setup_cpu_maps()
- dt: Add of_device_compatible_match()
- Factor do_feature_fixup calls
- Move 64-bit feature fixup earlier
- Move 64-bit memory reserves to setup_arch()
- Use a cachable DART
- Move FW feature probing out of pseries probe()
- Put exception configuration in a common place
- Remove early allocation of the SMU command buffer
- Move MMU backend selection out of platform code
- pasemi: Remove IOBMAP allocation from platform probe()
- mm/hash: Don't use machine_is() early during boot
- Don't test for machine type to detect HEA special case
- pmac: Remove spurrious machine type test
- Move hash table ops to a separate structure
- Ensure that ppc_md is empty before probing for machine type
- Move 64-bit probe_machine() to later in the boot process
- Move 32-bit probe() machine to later in the boot process
- Get rid of ppc_md.init_early()
- Move the boot time info banner to a separate function
- Move setting of {i,d}cache_bsize to initialize_cache_info()
- Move the content of setup_system() to setup_arch()
- Move cache info inits to a separate function
- Re-order the call to smp_setup_cpu_maps()
- Re-order setup_panic()
- Make a few boot functions __init
- Merge 32-bit and 64-bit setup_arch()
Other new features:
- tty/hvc: Use IRQF_SHARED for OPAL hvc consoles from Sam Mendoza-Jonas
- tty/hvc: Use opal irqchip interface if available from Sam Mendoza-Jonas
- powerpc: Add module autoloading based on CPU features from Alastair D'Silva
- crypto: vmx - Convert to CPU feature based module autoloading from Alastair D'Silva
- Wake up kopald polling thread before waiting for events from Benjamin Herrenschmidt
- xmon: Dump ISA 2.06 SPRs from Michael Ellerman
- xmon: Dump ISA 2.07 SPRs from Michael Ellerman
- Add a parameter to disable 1TB segs from Oliver O'Halloran
- powerpc/boot: Add OPAL console to epapr wrappers from Oliver O'Halloran
- Assign fixed PHB number based on device-tree properties from Guilherme G. Piccoli
- pseries: Add pseries hotplug workqueue from John Allen
- pseries: Add support for hotplug interrupt source from John Allen
- pseries: Use kernel hotplug queue for PowerVM hotplug events from John Allen
- pseries: Move property cloning into its own routine from Nathan Fontenot
- pseries: Dynamic add entires to associativity lookup array from Nathan Fontenot
- pseries: Auto-online hotplugged memory from Nathan Fontenot
- pseries: Remove call to memblock_add() from Nathan Fontenot
cxl:
- Add set and get private data to context struct from Michael Neuling
- make base more explicitly non-modular from Paul Gortmaker
- Use for_each_compatible_node() macro from Wei Yongjun
- Frederic Barrat
- Abstract the differences between the PSL and XSL
- Make vPHB device node match adapter's
- Philippe Bergheaud
- Add mechanism for delivering AFU driver specific events
- Ignore CAPI adapters misplaced in switched slots
- Refine slice error debug messages
- Andrew Donnellan
- static-ify variables to fix sparse warnings
- PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
- PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
- Add cxl_check_and_switch_mode() API to switch bi-modal cards
- remove dead Kconfig options
- fix potential NULL dereference in free_adapter()
- Ian Munsie
- Update process element after allocating interrupts
- Add support for CAPP DMA mode
- Fix allowing bogus AFU descriptors with 0 maximum processes
- Fix allocating a minimum of 2 pages for the SPA
- Fix bug where AFU disable operation had no effect
- Workaround XSL bug that does not clear the RA bit after a reset
- Fix NULL pointer dereference on kernel contexts with no AFU interrupts
- powerpc/powernv: Split cxl code out into a separate file
- Add cxl_slot_is_supported API
- Enable bus mastering for devices using CAPP DMA mode
- Move cxl_afu_get / cxl_afu_put to base
- Allow a default context to be associated with an external pci_dev
- Do not create vPHB if there are no AFU configuration records
- powerpc/powernv: Add support for the cxl kernel api on the real phb
- Add support for using the kernel API with a real PHB
- Add kernel APIs to get & set the max irqs per context
- Add preliminary workaround for CX4 interrupt limitation
- Add support for interrupts on the Mellanox CX4
- Workaround PE=0 hardware limitation in Mellanox CX4
- powerpc/powernv: Fix pci-cxl.c build when CONFIG_MODULES=n
selftests:
- Test unaligned copy and paste from Chris Smart
- Load Monitor Register Tests from Jack Miller
- Cyril Bur
- exec() with suspended transaction
- Use signed long to read perf_event_paranoid
- Fix usage message in context_switch
- Fix generation of vector instructions/types in context_switch
- Michael Ellerman
- Use "Delta" rather than "Error" in normal output
- Import Anton's mmap & futex micro benchmarks
- Add a test for PROT_SAO
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXnWchAAoJEFHr6jzI4aWAe64P/36Vd9yJLptjkoyZp8/IQtu1
Cv8buQwGdKuSMzdkcUAOXcC3fe2u70ZWXMKKLfY3koIV1IAiqdWk5/XWRKMP2XmE
dG0LhSf0uu7uh+mE0WvQnRu46ImeKtQ+mPp4Hbs/s9SxMSeYjruv3vdWWmgUq0cl
Gac2qJSRtAMmgLuHWMjf7N5mxOTOnKejU4o2i9cJ+YHmWKOdCigv2Ge1UadOQFlC
E7tRPiUR3asfDfj+e+LVTTdToH6p8pk+mOUzIoZ8jIkQ+IXzi62UDl5+Rw9mqiuX
1CtqEMUXxo2qwX+d4TcV/QUOp0YKPuIcUZ9NMMS+S3lOyJ4NFt+j2Izk7QJp5kNP
gKVqB68TjDQsBuDr3P9ynlHbduxTIhZAqopbTrLe0FIg48nUe4n1yHJBVzqaVajX
rFBJSsSUffBLAARNPSXJJhIgc2C1/qOC8dgMeDMcR2kPirDHaQZ/lY1yEpq1yiqR
q6e3v5hvIAm4IjbYk0mF7TUxBrPGVE/ExyBINyASRoYxAJ1PyeD/iljZ9vI3asRA
s+hhxT8H3f7lnqTrmJqMjHgAdGkmag07EdmvFNX4xK4aADSy7Y6g4dw25ffRopo9
p9Jf9HX+dZv65Y3UjbV/6HuXcaSEBJJLSVWvii65PebqSN0LuHEFvNeIJ6Iblx0B
AWh/hd0Iin2gdkcG39Mr
=Z5kM
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights:
- PowerNV PCI hotplug support.
- Lots more Power9 support.
- eBPF JIT support on ppc64le.
- Lots of cxl updates.
- Boot code consolidation.
Bug fixes:
- Fix spin_unlock_wait() from Boqun Feng
- Fix stack pointer corruption in __tm_recheckpoint() from Michael
Neuling
- Fix multiple bugs in memory_hotplug_max() from Bharata B Rao
- mm: Ensure "special" zones are empty from Oliver O'Halloran
- ftrace: Separate the heuristics for checking call sites from
Michael Ellerman
- modules: Never restore r2 for a mprofile-kernel style mcount() call
from Michael Ellerman
- Fix endianness when reading TCEs from Alexey Kardashevskiy
- start rtasd before PCI probing from Greg Kurz
- PCI: rpaphp: Fix slot registration for multiple slots under a PHB
from Tyrel Datwyler
- powerpc/mm: Add memory barrier in __hugepte_alloc() from Sukadev
Bhattiprolu
Cleanups & fixes:
- Drop support for MPIC in pseries from Rashmica Gupta
- Define and use PPC64_ELF_ABI_v2/v1 from Michael Ellerman
- Remove unused symbols in asm-offsets.c from Rashmica Gupta
- Fix SRIOV not building without EEH enabled from Russell Currey
- Remove kretprobe_trampoline_holder from Thiago Jung Bauermann
- Reduce log level of PCI I/O space warning from Benjamin
Herrenschmidt
- Add array bounds checking to crash_shutdown_handlers from Suraj
Jitindar Singh
- Avoid -maltivec when using clang integrated assembler from Anton
Blanchard
- Fix array overrun in ppc_rtas() syscall from Andrew Donnellan
- Fix error return value in cmm_mem_going_offline() from Rasmus
Villemoes
- export cpu_to_core_id() from Mauricio Faria de Oliveira
- Remove old symbols from defconfigs from Andrew Donnellan
- Update obsolete comments in setup_32.c about entry conditions from
Benjamin Herrenschmidt
- Add comment explaining the purpose of setup_kdump_trampoline() from
Benjamin Herrenschmidt
- Merge the RELOCATABLE config entries for ppc32 and ppc64 from Kevin
Hao
- Remove RELOCATABLE_PPC32 from Kevin Hao
- Fix .long's in tlb-radix.c to more meaningful from Balbir Singh
Minor cleanups & fixes:
- Andrew Donnellan, Anna-Maria Gleixner, Anton Blanchard, Benjamin
Herrenschmidt, Bharata B Rao, Christophe Leroy, Colin Ian King,
Geliang Tang, Greg Kurz, Madhavan Srinivasan, Michael Ellerman,
Michael Ellerman, Stephen Rothwell, Stewart Smith.
Freescale updates from Scott:
- "Highlights include more 8xx optimizations, device tree updates,
and MVME7100 support."
PowerNV PCI hotplug from Gavin Shan:
- PCI: Add pcibios_setup_bridge()
- Override pcibios_setup_bridge()
- Remove PCI_RESET_DELAY_US
- Move pnv_pci_ioda_setup_opal_tce_kill() around
- Increase PE# capacity
- Allocate PE# in reverse order
- Create PEs in pcibios_setup_bridge()
- Setup PE for root bus
- Extend PCI bridge resources
- Make pnv_ioda_deconfigure_pe() visible
- Dynamically release PE
- Update bridge windows on PCI plug
- Delay populating pdn
- Support PCI slot ID
- Use PCI slot reset infrastructure
- Introduce pnv_pci_get_slot_id()
- Functions to get/set PCI slot state
- PCI/hotplug: PowerPC PowerNV PCI hotplug driver
- Print correct PHB type names
Power9 idle support from Shreyas B. Prabhu:
- set power_save func after the idle states are initialized
- Use PNV_THREAD_WINKLE macro while requesting for winkle
- make hypervisor state restore a function
- Rename idle_power7.S to idle_book3s.S
- Rename reusable idle functions to hardware agnostic names
- Make pnv_powersave_common more generic
- abstraction for saving SPRs before entering deep idle states
- Add platform support for stop instruction
- cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
- cpuidle/powernv: cleanup cpuidle-powernv.c
- cpuidle/powernv: Add support for POWER ISA v3 idle states
- Use deepest stop state when cpu is offlined
Power9 PMU from Madhavan Srinivasan:
- factor out power8 pmu macros and defines
- factor out power8 pmu functions
- factor out power8 __init_pmu code
- Add power9 event list macros for generic and cache events
- Power9 PMU support
- Export Power9 generic and cache events to sysfs
Power9 preliminary interrupt & PCI support from Benjamin Herrenschmidt:
- Add XICS emulation APIs
- Move a few exception common handlers to make room
- Add support for HV virtualization interrupts
- Add mechanism to force a replay of interrupts
- Add ICP OPAL backend
- Discover IODA3 PHBs
- pci: Remove obsolete SW invalidate
- opal: Add real mode call wrappers
- Rename TCE invalidation calls
- Remove SWINV constants and obsolete TCE code
- Rework accessing the TCE invalidate register
- Fallback to OPAL for TCE invalidations
- Use the device-tree to get available range of M64's
- Check status of a PHB before using it
- pci: Don't try to allocate resources that will be reassigned
Other Power9:
- Send SIGBUS on unaligned copy and paste from Chris Smart
- Large Decrementer support from Oliver O'Halloran
- Load Monitor Register Support from Jack Miller
Performance improvements from Anton Blanchard:
- Avoid load hit store in __giveup_fpu() and __giveup_altivec()
- Avoid load hit store in setup_sigcontext()
- Remove assembly versions of strcpy, strcat, strlen and strcmp
- Align hot loops of some string functions
eBPF JIT from Naveen N. Rao:
- Fix/enhance 32-bit Load Immediate implementation
- Optimize 64-bit Immediate loads
- Introduce rotate immediate instructions
- A few cleanups
- Isolate classic BPF JIT specifics into a separate header
- Implement JIT compiler for extended BPF
Operator Panel driver from Suraj Jitindar Singh:
- devicetree/bindings: Add binding for operator panel on FSP machines
- Add inline function to get rc from an ASYNC_COMP opal_msg
- Add driver for operator panel on FSP machines
Sparse fixes from Daniel Axtens:
- make some things static
- Introduce asm-prototypes.h
- Include headers containing prototypes
- Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
- kvm: Clarify __user annotations
- Pass endianness to sparse
- Make ppc_md.{halt, restart} __noreturn
MM fixes & cleanups from Aneesh Kumar K.V:
- radix: Update LPCR HR bit as per ISA
- use _raw variant of page table accessors
- Compile out radix related functions if RADIX_MMU is disabled
- Clear top 16 bits of va only on older cpus
- Print formation regarding the the MMU mode
- hash: Update SDR1 size encoding as documented in ISA 3.0
- radix: Update PID switch sequence
- radix: Update machine call back to support new HCALL.
- radix: Add LPID based tlb flush helpers
- radix: Add a kernel command line to disable radix
- Cleanup LPCR defines
Boot code consolidation from Benjamin Herrenschmidt:
- Move epapr_paravirt_early_init() to early_init_devtree()
- cell: Don't use flat device-tree after boot
- ge_imp3a: Don't use the flat device-tree after boot
- mpc85xx_ds: Don't use the flat device-tree after boot
- mpc85xx_rdb: Don't use the flat device-tree after boot
- Don't test for machine type in rtas_initialize()
- Don't test for machine type in smp_setup_cpu_maps()
- dt: Add of_device_compatible_match()
- Factor do_feature_fixup calls
- Move 64-bit feature fixup earlier
- Move 64-bit memory reserves to setup_arch()
- Use a cachable DART
- Move FW feature probing out of pseries probe()
- Put exception configuration in a common place
- Remove early allocation of the SMU command buffer
- Move MMU backend selection out of platform code
- pasemi: Remove IOBMAP allocation from platform probe()
- mm/hash: Don't use machine_is() early during boot
- Don't test for machine type to detect HEA special case
- pmac: Remove spurrious machine type test
- Move hash table ops to a separate structure
- Ensure that ppc_md is empty before probing for machine type
- Move 64-bit probe_machine() to later in the boot process
- Move 32-bit probe() machine to later in the boot process
- Get rid of ppc_md.init_early()
- Move the boot time info banner to a separate function
- Move setting of {i,d}cache_bsize to initialize_cache_info()
- Move the content of setup_system() to setup_arch()
- Move cache info inits to a separate function
- Re-order the call to smp_setup_cpu_maps()
- Re-order setup_panic()
- Make a few boot functions __init
- Merge 32-bit and 64-bit setup_arch()
Other new features:
- tty/hvc: Use IRQF_SHARED for OPAL hvc consoles from Sam Mendoza-Jonas
- tty/hvc: Use opal irqchip interface if available from Sam Mendoza-Jonas
- powerpc: Add module autoloading based on CPU features from Alastair D'Silva
- crypto: vmx - Convert to CPU feature based module autoloading from Alastair D'Silva
- Wake up kopald polling thread before waiting for events from Benjamin Herrenschmidt
- xmon: Dump ISA 2.06 SPRs from Michael Ellerman
- xmon: Dump ISA 2.07 SPRs from Michael Ellerman
- Add a parameter to disable 1TB segs from Oliver O'Halloran
- powerpc/boot: Add OPAL console to epapr wrappers from Oliver O'Halloran
- Assign fixed PHB number based on device-tree properties from Guilherme G. Piccoli
- pseries: Add pseries hotplug workqueue from John Allen
- pseries: Add support for hotplug interrupt source from John Allen
- pseries: Use kernel hotplug queue for PowerVM hotplug events from John Allen
- pseries: Move property cloning into its own routine from Nathan Fontenot
- pseries: Dynamic add entires to associativity lookup array from Nathan Fontenot
- pseries: Auto-online hotplugged memory from Nathan Fontenot
- pseries: Remove call to memblock_add() from Nathan Fontenot
cxl:
- Add set and get private data to context struct from Michael Neuling
- make base more explicitly non-modular from Paul Gortmaker
- Use for_each_compatible_node() macro from Wei Yongjun
- Frederic Barrat
- Abstract the differences between the PSL and XSL
- Make vPHB device node match adapter's
- Philippe Bergheaud
- Add mechanism for delivering AFU driver specific events
- Ignore CAPI adapters misplaced in switched slots
- Refine slice error debug messages
- Andrew Donnellan
- static-ify variables to fix sparse warnings
- PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
- PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
- Add cxl_check_and_switch_mode() API to switch bi-modal cards
- remove dead Kconfig options
- fix potential NULL dereference in free_adapter()
- Ian Munsie
- Update process element after allocating interrupts
- Add support for CAPP DMA mode
- Fix allowing bogus AFU descriptors with 0 maximum processes
- Fix allocating a minimum of 2 pages for the SPA
- Fix bug where AFU disable operation had no effect
- Workaround XSL bug that does not clear the RA bit after a reset
- Fix NULL pointer dereference on kernel contexts with no AFU interrupts
- powerpc/powernv: Split cxl code out into a separate file
- Add cxl_slot_is_supported API
- Enable bus mastering for devices using CAPP DMA mode
- Move cxl_afu_get / cxl_afu_put to base
- Allow a default context to be associated with an external pci_dev
- Do not create vPHB if there are no AFU configuration records
- powerpc/powernv: Add support for the cxl kernel api on the real phb
- Add support for using the kernel API with a real PHB
- Add kernel APIs to get & set the max irqs per context
- Add preliminary workaround for CX4 interrupt limitation
- Add support for interrupts on the Mellanox CX4
- Workaround PE=0 hardware limitation in Mellanox CX4
- powerpc/powernv: Fix pci-cxl.c build when CONFIG_MODULES=n
selftests:
- Test unaligned copy and paste from Chris Smart
- Load Monitor Register Tests from Jack Miller
- Cyril Bur
- exec() with suspended transaction
- Use signed long to read perf_event_paranoid
- Fix usage message in context_switch
- Fix generation of vector instructions/types in context_switch
- Michael Ellerman
- Use "Delta" rather than "Error" in normal output
- Import Anton's mmap & futex micro benchmarks
- Add a test for PROT_SAO"
* tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (263 commits)
powerpc/mm: Parenthesise IS_ENABLED() in if condition
tty/hvc: Use opal irqchip interface if available
tty/hvc: Use IRQF_SHARED for OPAL hvc consoles
selftests/powerpc: exec() with suspended transaction
powerpc: Improve comment explaining why we modify VRSAVE
powerpc/mm: Drop unused externs for hpte_init_beat[_v3]()
powerpc/mm: Rename hpte_init_lpar() and move the fallback to a header
powerpc/mm: Fix build break when PPC_NATIVE=n
crypto: vmx - Convert to CPU feature based module autoloading
powerpc: Add module autoloading based on CPU features
powerpc/powernv/ioda: Fix endianness when reading TCEs
powerpc/mm: Add memory barrier in __hugepte_alloc()
powerpc/modules: Never restore r2 for a mprofile-kernel style mcount() call
powerpc/ftrace: Separate the heuristics for checking call sites
powerpc: Merge 32-bit and 64-bit setup_arch()
powerpc/64: Make a few boot functions __init
powerpc: Re-order setup_panic()
powerpc: Re-order the call to smp_setup_cpu_maps()
powerpc/32: Move cache info inits to a separate function
powerpc/64: Move the content of setup_system() to setup_arch()
...
The IV output vectors should only be copied from the _complete operation
and not from the _process operation, i.e only from the operation that is
designed to copy the result of the request to the right location. This
copy is already done in the _complete operation, so this commit removes
the duplicated code in the _process op.
Fixes: 3610d6cd5231 ("crypto: marvell - Add a complete...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
So far, the cache of the ahash requests was updated from the 'complete'
operation. This complete operation is called from mv_cesa_tdma_process
before the cleanup operation, which means that the content of req->src
can be read and copied when it is still mapped. This commit fixes the
issue by moving this cache update from mv_cesa_ahash_complete to
mv_cesa_ahash_req_cleanup, so the copy is done once the sglist is
unmapped.
Fixes: 1bf6682cb3 ("crypto: marvell - Add a complete operation for..")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The flag CRYPTO_TFM_REQ_MAY_BACKLOG is optional and can be set from the
user to put requests into the backlog queue when the main cryptographic
queue is full. Before calling mv_cesa_tdma_chain we must check the value
of the return status to be sure that the current request has been
correctly queued or added to the backlog.
Fixes: 85030c5168 ("crypto: marvell - Add support for chaining...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
So far in mv_cesa_ablkcipher_dma_req_init, if an error is thrown while
the tdma chain is built there is a memory leak. This issue exists
because the chain is assigned later at the end of the function, so the
cleanup function is called with the wrong version of the chain.
Fixes: db509a4533 ("crypto: marvell/cesa - add TDMA support")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto updates from Herbert Xu:
"Here is the crypto update for 4.8:
API:
- first part of skcipher low-level conversions
- add KPP (Key-agreement Protocol Primitives) interface.
Algorithms:
- fix IPsec/cryptd reordering issues that affects aesni
- RSA no longer does explicit leading zero removal
- add SHA3
- add DH
- add ECDH
- improve DRBG performance by not doing CTR by hand
Drivers:
- add x86 AVX2 multibuffer SHA256/512
- add POWER8 optimised crc32c
- add xts support to vmx
- add DH support to qat
- add RSA support to caam
- add Layerscape support to caam
- add SEC1 AEAD support to talitos
- improve performance by chaining requests in marvell/cesa
- add support for Araneus Alea I USB RNG
- add support for Broadcom BCM5301 RNG
- add support for Amlogic Meson RNG
- add support Broadcom NSP SoC RNG"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (180 commits)
crypto: vmx - Fix aes_p8_xts_decrypt build failure
crypto: vmx - Ignore generated files
crypto: vmx - Adding support for XTS
crypto: vmx - Adding asm subroutines for XTS
crypto: skcipher - add comment for skcipher_alg->base
crypto: testmgr - Print akcipher algorithm name
crypto: marvell - Fix wrong flag used for GFP in mv_cesa_dma_add_iv_op
crypto: nx - off by one bug in nx_of_update_msc()
crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct
crypto: scatterwalk - Inline start/map/done
crypto: scatterwalk - Remove unnecessary BUG in scatterwalk_start
crypto: scatterwalk - Remove unnecessary advance in scatterwalk_pagedone
crypto: scatterwalk - Fix test in scatterwalk_done
crypto: api - Optimise away crypto_yield when hard preemption is on
crypto: scatterwalk - add no-copy support to copychunks
crypto: scatterwalk - Remove scatterwalk_bytes_sglen
crypto: omap - Stop using crypto scatterwalk_bytes_sglen
crypto: skcipher - Remove top-level givcipher interface
crypto: user - Remove crypto_lookup_skcipher call
crypto: cts - Convert to skcipher
...
Pull s390 updates from Martin Schwidefsky:
"There are a couple of new things for s390 with this merge request:
- a new scheduling domain "drawer" is added to reflect the unusual
topology found on z13 machines. Performance tests showed up to 8
percent gain with the additional domain.
- the new crc-32 checksum crypto module uses the vector-galois-field
multiply and sum SIMD instruction to speed up crc-32 and crc-32c.
- proper __ro_after_init support, this requires RO_AFTER_INIT_DATA in
the generic vmlinux.lds linker script definitions.
- kcov instrumentation support. A prerequisite for that is the
inline assembly basic block cleanup, which is the reason for the
net/iucv/iucv.c change.
- support for 2GB pages is added to the hugetlbfs backend.
Then there are two removals:
- the oprofile hardware sampling support is dead code and is removed.
The oprofile user space uses the perf interface nowadays.
- the ETR clock synchronization is removed, this has been superseeded
be the STP clock synchronization. And it always has been
"interesting" code..
And the usual bug fixes and cleanups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (82 commits)
s390/pci: Delete an unnecessary check before the function call "pci_dev_put"
s390/smp: clean up a condition
s390/cio/chp : Remove deprecated create_singlethread_workqueue
s390/chsc: improve channel path descriptor determination
s390/chsc: sanitize fmt check for chp_desc determination
s390/cio: make fmt1 channel path descriptor optional
s390/chsc: fix ioctl CHSC_INFO_CU command
s390/cio/device_ops: fix kernel doc
s390/cio: allow to reset channel measurement block
s390/console: Make preferred console handling more consistent
s390/mm: fix gmap tlb flush issues
s390/mm: add support for 2GB hugepages
s390: have unique symbol for __switch_to address
s390/cpuinfo: show maximum thread id
s390/ptrace: clarify bits in the per_struct
s390: stack address vs thread_info
s390: remove pointless load within __switch_to
s390: enable kcov support
s390/cpumf: use basic block for ecctr inline assembly
s390/hypfs: use basic block for diag inline assembly
...
This patch utilises the GENERIC_CPU_AUTOPROBE infrastructure
to automatically load the vmx_crypto module if the CPU supports
it.
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Parallel build can sporadically fail because asn1 headers may
not be built yet by the time qat_asym_algs.o is compiled:
drivers/crypto/qat/qat_common/qat_asym_algs.c:55:32: fatal error: qat_rsapubkey-asn1.h: No such file or directory
#include "qat_rsapubkey-asn1.h"
Cc: stable@vger.kernel.org
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We use _GLOBAL so there is no need to do the manual alignment,
in fact it causes a build failure.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ignore assembly files generated by the perl script.
Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch add XTS support using VMX-crypto driver.
Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch add XTS subroutines using VMX-crypto driver.
It gives a boost of 20 times using XTS.
These code has been adopted from OpenSSL project in collaboration
with the original author (Andy Polyakov <appro@openssl.org>).
Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use the parameter 'gfp_flags' instead of 'flag' as second argument of
dma_pool_alloc(). The parameter 'flag' is for the TDMA descriptor, its
content has no sense for the allocator.
Fixes: bac8e805a3 ("crypto: marvell - Copy IV vectors by DMA...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The props->ap[] array is defined like this:
struct alg_props ap[NX_MAX_FC][NX_MAX_MODE][3];
So we can see that if msc->fc and msc->mode are == to NX_MAX_FC or
NX_MAX_MODE then we're off by one.
Fixes: ae0222b728 ('powerpc/crypto: nx driver code supporting nx encryption')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We already have a generic function sg_nents_for_len which does
the same thing. This patch switches omap over to it and also
adds error handling in case the SG list is short.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There is not need to drop leading zeros from the RSA output
operations results.
Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add DH support under kpp api. Drop struct qat_rsa_request and
introduce a more generic struct qat_asym_request and share it
between RSA and DH requests.
Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Extend qat driver to use RSA CRT mode when all CRT related components are
present in the private key. Simplify code in qat_rsa_setkey by adding
qat_rsa_clear_ctx.
Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Different product families will use FLR or SBR.
Virtual Function devices have no reset method.
Signed-off-by: Conor McLoughlin <conor.mcloughlin@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove unneeded error handling on the result of a call to
platform_get_resource when the value is passed to
devm_ioremap_resource.
The Coccinelle semantic patch that makes this change is as follows:
// <smpl>
@@
expression pdev,res,n,e,e1;
expression ret != 0;
identifier l;
@@
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
... when != res
- if (res == NULL) { ... \(goto l;\|return ret;\) }
... when != res
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add RSA support to caam driver.
Initial author is Yashpal Dutta <yashpal.dutta@freescale.com>.
Signed-off-by: Tudor Ambarus <tudor-dan.ambarus@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Drop all asn1 related code and use the new rsa_helper
functions rsa_parse_[pub|priv]_key for parsing the key
Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The arm-neon-sha implementations have cra_priority of 150...300, so
increase omap-sham priority to 400 to ensure it is on top of any
software alg.
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch replaces use of the obsolete ablkcipher with skcipher.
It also removes shash_fallback which is totally unused.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The ARM allmodconfig build currently warngs because of the
ux500 crypto driver not working well with the jump label
implementation that we started using for dynamic debug, which
breaks building with 'gcc -O0':
In file included from /git/arm-soc/include/linux/jump_label.h:105:0,
from /git/arm-soc/include/linux/dynamic_debug.h:5,
from /git/arm-soc/include/linux/printk.h:289,
from /git/arm-soc/include/linux/kernel.h:13,
from /git/arm-soc/include/linux/clk.h:16,
from /git/arm-soc/drivers/crypto/ux500/hash/hash_core.c:16:
/git/arm-soc/arch/arm/include/asm/jump_label.h: In function 'hash_set_dma_transfer':
/git/arm-soc/arch/arm/include/asm/jump_label.h:13:7: error: asm operand 0 probably doesn't match constraints [-Werror]
asm_volatile_goto("1:\n\t"
Turning off compiler optimizations has never really been supported
here, and it's only used when debugging the driver. I have not found
a good reason for doing this here, other than a misguided attempt
to produce more readable assembly output. Also, the driver is only
used in obsolete hardware that almost certainly nobody will spend
time debugging any more.
This just removes the -O0 flag from the compiler options.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Adds software fallback support for small crypto requests. In these cases,
it is undesirable to use DMA, as setting it up itself is rather heavy
operation. Gives about 40% extra performance in ipsec usecase.
Signed-off-by: Bin Liu <b-liu@ti.com>
[t-kristo@ti.com: dropped the extra traces, updated some comments
on the code]
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The extra call to dmaengine_terminate_all is not needed, as the DMA
is not running at this point. This improves performance slightly.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Change crypto queue size from 1 to 10 for omap SHA driver. This should
allow clients to enqueue requests more effectively to avoid serializing
whole crypto sequences, giving extra performance.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Calling runtime PM API for every block causes serious performance hit to
crypto operations that are done on a long buffer. As crypto is performed
on a page boundary, encrypting large buffers can cause a series of crypto
operations divided by page. The runtime PM API is also called those many
times.
Convert the driver to use runtime_pm autosuspend instead, with a default
timeout value of 1 second. This results in upto ~50% speedup.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that crypto requests are chained together at the DMA level, we
increase the size of the crypto queue for each engine. The result is
that as the backlog list is reached later, it does not stop the crypto
stack from sending asychronous requests, so more cryptographic tasks
are processed by the engines.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The Cryptographic Engines and Security Accelerators (CESA) supports the
Multi-Packet Chain Mode. With this mode enabled, multiple tdma requests
can be chained and processed by the hardware without software
intervention. This mode was already activated, however the crypto
requests were not chained together. By doing so, we reduce significantly
the number of IRQs. Instead of being interrupted at the end of each
crypto request, we are interrupted at the end of the last cryptographic
request processed by the engine.
This commits re-factorizes the code, changes the code architecture and
adds the required data structures to chain cryptographic requests
together before sending them to an engine (stopped or possibly already
running).
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This commits adds support for fine grained load balancing on
multi-engine IPs. The engine is pre-selected based on its current load
and on the weight of the crypto request that is about to be processed.
The global crypto queue is also moved to each engine. These changes are
required to allow chaining crypto requests at the DMA level. By using
a crypto queue per engine, we make sure that we keep the state of the
tdma chain synchronized with the crypto queue. We also reduce contention
on 'cesa_dev->lock' and improve parallelism.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the crypto requests were sent to engines sequentially.
This commit moves the SRAM I/O operations from the prepare to the step
functions. It provides flexibility for future works and allow to prepare
a request while the engine is running.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
So far, the 'process' operation was used to check if the current request
was correctly handled by the engine, if it was the case it copied
information from the SRAM to the main memory. Now, we split this
operation. We keep the 'process' operation, which still checks if the
request was correctly handled by the engine or not, then we add a new
operation for completion. The 'complete' method copies the content of
the SRAM to memory. This will soon become useful if we want to call
the process and the complete operations from different locations
depending on the type of the request (different cleanup logic).
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, the only way to access the tdma chain is to use the 'req'
union from a mv_cesa_{ablkcipher,ahash}. This will soon become a problem
if we want to handle the TDMA chaining vs standard/non-DMA processing in
a generic way (with generic functions at the cesa.c level detecting
whether the request should be queued at the DMA level or not). Hence the
decision to move the chain field a the mv_cesa_req level at the expense
of adding 2 void * fields to all request contexts (including non-DMA
ones) and to remove the type completly. To limit the overhead, we get
rid of the type field, which can now be deduced from the req->chain.first
value. Once these changes are done the union is no longer needed, so
remove it and move mv_cesa_ablkcipher_std_req and mv_cesa_req
to mv_cesa_ablkcipher_req directly. There are also no needs to keep the
'base' field into the union of mv_cesa_ahash_req, so move it into the
upper structure.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a TDMA descriptor at the end of the request for copying the
output IV vector via a DMA transfer. This is a good way for offloading
as much as processing as possible to the DMA and the crypto engine.
This is also required for processing multiple cipher requests
in chained mode, otherwise the content of the IV vector would be
overwritten by the last processed request.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
So far, the way that the type of a TDMA operation was checked was wrong.
We have to use the type mask in order to get the right part of the flag
containing the type of the operation.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a BUG_ON() call when the driver tries to launch a crypto request
while the engine is still processing the previous one. This replaces
a silent system hang by a verbose kernel panic with the associated
backtrace to let the user know that something went wrong in the CESA
driver.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Adding a macro constant to be used for the size of the crypto queue,
instead of using a numeric value directly. It will be easier to
maintain in case we add more than one crypto queue of the same size.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
EXTRA_CFLAGS is still supported but its usage is deprecated.
Signed-off-by: Tudor Ambarus <tudor-dan.ambarus@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a crypto API module to access the vector extension based CRC-32
implementations. Users can request the optimized implementation through
the shash crypto API interface.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
alloc_workqueue replaces deprecated create_workqueue().
The workqueue device_reset_wq has workitem &reset_data->reset_work per
adf_reset_dev_data. The workqueue pf2vf_resp_wq is a workqueue for
PF2VF responses has workitem &pf2vf_resp->pf2vf_resp_work per pf2vf_resp.
The workqueue adf_vf_stop_wq is used to call adf_dev_stop()
asynchronously.
Dedicated workqueues have been used in all cases since the workitems
on the workqueues are involved in operation of crypto which can be used in
the IO path which is depended upon during memory reclaim. Hence,
WQ_MEM_RECLAIM has been set to gurantee forward progress under memory
pressure.
Since there are only a fixed number of work items, explicit concurrency
limit is unnecessary.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The hash buffer is really HASH_BLOCK_SIZE bytes, someone
must have thought that memmove takes n*u32 words by mistake.
Tests work as good/bad as before after this patch.
Cc: Joakim Bech <joakim.bech@linaro.org>
Cc: stable@vger.kernel.org
Reported-by: David Binderman <linuxdev.baldrick@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All of the VMX AES ciphers (AES, AES-CBC and AES-CTR) are set at
priority 1000. Unfortunately this means we never use AES-CBC and
AES-CTR, because the base AES-CBC cipher that is implemented on
top of AES inherits its priority.
To fix this, AES-CBC and AES-CTR have to be a higher priority. Set
them to 2000.
Testing on a POWER8 with:
cryptsetup benchmark --cipher aes --key-size 256
Shows decryption speed increase from 402.4 MB/s to 3069.2 MB/s,
over 7x faster. Thanks to Mike Strosaker for helping me debug
this issue.
Fixes: 8c755ace35 ("crypto: vmx - Adding CBC routines for VMX module")
Cc: stable@vger.kernel.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When calling ppc-xlate.pl, we pass it either linux-ppc64 or
linux-ppc64le. The script however was expecting linux64le, a result
of its OpenSSL origins. This means we aren't obeying the ppc64le
ABIv2 rules.
Fix this by checking for linux-ppc64le.
Fixes: 5ca5573820 ("crypto: vmx - comply with ABIs that specify vrsave as reserved.")
Cc: stable@vger.kernel.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
SEC1 doesn't have IPSEC_ESP descriptor type but it is able to perform
IPSEC using HMAC_SNOOP_NO_AFEU, which is also existing on SEC2
In order to be able to define descriptors templates for SEC1 without
breaking SEC2+, we have to give lower priority to HMAC_SNOOP_NO_AFEU
so that SEC2+ selects IPSEC_ESP and not HMAC_SNOOP_NO_AFEU which is
less performant.
This is done by adding a priority field in the template. If the field
is 0, we use the default priority, otherwise we used the one in the
field.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patchs enhances the IPSEC_ESP related functions for them to
also supports the same operations with descriptor type
HMAC_SNOOP_NO_AFEU.
The differences between the two descriptor types are:
* pointeurs 2 and 3 are swaped (Confidentiality key and
Primary EU Context IN)
* HMAC_SNOOP_NO_AFEU has CICV out in pointer 6
* HMAC_SNOOP_NO_AFEU has no primary EU context out so we get it
from the end of data out
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In preparation of IPSEC for SEC1, first step is to make the mapping
helpers more generic so that they can also be used by AEAD functions.
First, the functions are moved before IPSEC functions in talitos.c
talitos_sg_unmap() and unmap_sg_talitos_ptr() are merged as they
are quite similar, the second one handling the SEC1 case an calling
the first one for SEC2
map_sg_in_talitos_ptr() and map_sg_out_talitos_ptr() are merged
into talitos_sg_map() and enhenced to support offseted zones
as used for AEAD. The actual mapping is now performed outside that
helper. The DMA sync is also done outside to not make it several
times.
talitos_edesc_alloc() size calculation are fixed to also take into
account AEAD specific parts also for SEC1
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In order to be able to use the mapping/unmapping helpers for IPSEC
it needs to be move upper in the file
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use helper for all modifications to talitos_ptr in preparation to
the implementation of AEAD for SEC1
to_talitos_ptr_extent_clear() has been removed in favor of
to_talitos_ptr_ext_set() to set any value and
to_talitos_ptr_ext_or() to or the extent field with a value
name has been shorten to help keeping single lines of 80 chars
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Algorithms can be registered only once. So skip registration of
algorithms if already registered (i.e. in case we have two AES cores
in the system.)
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Bring some consistency by:
1. Replacing fixed-space indentation of structure members with just
tabs.
2. Remove indentation in declaration of local variable between type and
name. Driver was mixing usage of such indentation and lack of it.
When removing indentation, reorder variables in
reversed-christmas-tree order with first variables being initialized
ones.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This basically adds support for ls1043a platform.
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There are SoCs like LS1043A where CAAM endianness (BE) does not match
the default endianness of the core (LE).
Moreover, there are requirements for the driver to handle cases like
CPU_BIG_ENDIAN=y on ARM-based SoCs.
This requires for a complete rewrite of the I/O accessors.
PPC-specific accessors - {in,out}_{le,be}XX - are replaced with
generic ones - io{read,write}[be]XX.
Endianness is detected dynamically (at runtime) to allow for
multiplatform kernels, for e.g. running the same kernel image
on LS1043A (BE CAAM) and LS2080A (LE CAAM) armv8-based SoCs.
While here: debugfs entries need to take into consideration the
endianness of the core when displaying data. Add the necessary
glue code so the entries remain the same, but they are properly
read, regardless of the core and/or SEC endianness.
Note: pdb.h fixes only what is currently being used (IPsec).
Reviewed-by: Tudor Ambarus <tudor-dan.ambarus@nxp.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Alex Porosanu <alexandru.porosanu@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The offset field is 13 bits wide; make sure we don't overwrite more than
that in the caam hardware scatter gather structure.
Signed-off-by: Cristian Stoica <cristian.stoica@freescale.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The sizeof(*ctx->dec_cd) and sizeof(*ctx->enc_cd) are equal,
but we should use the correct one for freeing memory anyway.
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- missing selection in public_key that may result in a build failure
- Potential crash in error path in omap-sham
- ccp AES XTS bug that affects requests larger than 4096"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ccp - Fix AES XTS error for request sizes above 4096
crypto: public_key: select CRYPTO_AKCIPHER
crypto: omap-sham - potential Oops on error in probe
Most users of IS_ERR_VALUE() in the kernel are wrong, as they
pass an 'int' into a function that takes an 'unsigned long'
argument. This happens to work because the type is sign-extended
on 64-bit architectures before it gets converted into an
unsigned type.
However, anything that passes an 'unsigned short' or 'unsigned int'
argument into IS_ERR_VALUE() is guaranteed to be broken, as are
8-bit integers and types that are wider than 'unsigned long'.
Andrzej Hajda has already fixed a lot of the worst abusers that
were causing actual bugs, but it would be nice to prevent any
users that are not passing 'unsigned long' arguments.
This patch changes all users of IS_ERR_VALUE() that I could find
on 32-bit ARM randconfig builds and x86 allmodconfig. For the
moment, this doesn't change the definition of IS_ERR_VALUE()
because there are probably still architecture specific users
elsewhere.
Almost all the warnings I got are for files that are better off
using 'if (err)' or 'if (err < 0)'.
The only legitimate user I could find that we get a warning for
is the (32-bit only) freescale fman driver, so I did not remove
the IS_ERR_VALUE() there but changed the type to 'unsigned long'.
For 9pfs, I just worked around one user whose calling conventions
are so obscure that I did not dare change the behavior.
I was using this definition for testing:
#define IS_ERR_VALUE(x) ((unsigned long*)NULL == (typeof (x)*)NULL && \
unlikely((unsigned long long)(x) >= (unsigned long long)(typeof(x))-MAX_ERRNO))
which ends up making all 16-bit or wider types work correctly with
the most plausible interpretation of what IS_ERR_VALUE() was supposed
to return according to its users, but also causes a compile-time
warning for any users that do not pass an 'unsigned long' argument.
I suggested this approach earlier this year, but back then we ended
up deciding to just fix the users that are obviously broken. After
the initial warning that caused me to get involved in the discussion
(fs/gfs2/dir.c) showed up again in the mainline kernel, Linus
asked me to send the whole thing again.
[ Updated the 9p parts as per Al Viro - Linus ]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andrzej Hajda <a.hajda@samsung.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.org/lkml/2016/1/7/363
Link: https://lkml.org/lkml/2016/5/27/486
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> # For nvmem part
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ccp-crypto module for AES XTS support has a bug that can allow requests
greater than 4096 bytes in size to be passed to the CCP hardware. The CCP
hardware does not support request sizes larger than 4096, resulting in
incorrect output. The request should actually be handled by the fallback
mechanism instantiated by the ccp-crypto module.
Add a check to insure the request size is less than or equal to the maximum
supported size and use the fallback mechanism if it is not.
Cc: <stable@vger.kernel.org> # 3.14.x-
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This if statement is reversed so we end up either leaking or Oopsing on
error.
Fixes: dbe246209b ('crypto: omap-sham - Use dma_request_chan() for requesting DMA channel')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Change the adf_ctl_stop_devices to a void function.
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
caam_jr_alloc() used to return NULL if a JR device could not be
allocated for a session. In turn, every user of this function used
IS_ERR() function to verify if anything went wrong, which does NOT look
for NULL values. This made the kernel crash if the sanity check failed,
because the driver continued to think it had allocated a valid JR dev
instance to the session and at some point it tries to do a caam_jr_free()
on a NULL JR dev pointer.
This patch is a fix for this issue.
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Vasile <cata.vasile@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It gives significant improvements ( ~+15%) on some modes.
These code has been adopted from OpenSSL project in collaboration
with the original author (Andy Polyakov <appro@openssl.org>).
Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>