The cc_cipher_handle structure contains only a single member, and only
one instance exists. Simplify the code and reduce memory consumption by
moving this member to struct cc_drvdata.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The buff_mgr_handle structure contains only a single member, and only
one instance exists. Simplify the code and reduce memory consumption by
moving this member to struct cc_drvdata.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The cc_debugfs_ctx structure contains only a single member, and only one
instance exists. Simplify the code and reduce memory consumption by
moving this member to struct cc_drvdata.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The cc_sram_ctx structure contains only a single member, and only one
instance exists. Simplify the code and reduce memory consumption by
moving this member to struct cc_drvdata.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
cc_pm_suspend() and cc_pm_resume() are not used outside
drivers/crypto/ccree/cc_pm.c.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If CONFIG_PM=y, cc_pm_is_dev_suspended() is just a wrapper around
pm_runtime_suspended().
If CONFIG_PM=n, cc_pm_is_dev_suspended() a dummy that behaves exactly
the same as the dummy for pm_runtime_suspended().
Hence remove cc_pm_is_dev_suspended(), and call pm_runtime_suspended()
directly.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If the driver is probed, it means a match was found in
arm_ccree_dev_of_match[]. Hence we can just use the
of_device_get_match_data() helper.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, a large part of the probe function runs before Runtime PM is
enabled. As the driver manages the device's clock manually, this may
work fine on some systems, but may break on platforms with a more
complex power hierarchy.
Fix this by moving the initialization of Runtime PM before the first
register access (in cc_wait_for_reset_completion()), and putting the
device to sleep only after the last access (in cc_set_ree_fips_status()).
This allows to remove the pm_on flag, which was used to track manually
if Runtime PM had been enabled or not.
Remove the cc_pm_{init,go,fini}() wrappers, as they are called only
once, and obscure operation.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
SRAM addresses are small integer offsets into local SRAM. Currently
they are stored using a mixture of cc_sram_addr_t (u64), u32, and
dma_addr_t types.
Settle on u32, and remove the cc_sram_addr_t typedefs.
This allows to drop several casts.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The SRAM allocator does not support deallocating memory.
Hence remove all references to freeing SRAM.
Fix grammar while at it.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
While the larval digest addresses are not always used in
cc_get_plain_hmac_key() and cc_hash_digest(), they are always
calculated.
Defer their calculations to the points where needed.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use the existing lower_32_bits() and upper_32_bits() macros instead of
explicit casts and shifts to split a 64-bit address in its two 32-bit
parts.
Drop the superfluous cast to "u16", as the FIELD_PREP() macro already
masks it to the specified field width.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
mlli_params.mlli_virt_addr is just a buffer of memory.
This allows to drop a cast.
No change in generated code.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use devm_clk_get_optional() instead of devm_clk_get() and explicit
optional clock handling.
As clk_prepare_enable() and clk_disable_unprepare() handle optional
clocks fine, the cc_clk_on() and cc_clk_off() wrappers can be removed.
While at it, use the new "%pe" format specifier to print error codes.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
cc_sram_mgr_fini() doesn't do anything, so it can just be removed.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When no SRAM can be allocated, cc_sram_alloc() already prints an error
message. Hence there is no need to duplicate this in all callers.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Due to the way the hardware works, every double word in the SHA384 and
SHA512 larval hashes must be swapped. Currently this is done at run
time, during driver initialization.
However, this swapping can easily be done at build time. Treating each
double word as two words has the benefit of changing the larval hashes'
types from u64[] to u32[], like for all other hashes, and allows
dropping the casts and size doublings when calling cc_set_sram_desc().
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Unneeded casts prevent the compiler from performing valuable checks.
This is especially true for function pointers.
Remove these casts, to prevent silently introducing bugs when a
variable's type might be changed in the future.
No change in generated code.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If cc_queues_status() indicates that the queue is full,
cc_send_sync_request() should loop and retry.
However, cc_queues_status() returns either 0 (for success), or -ENOSPC
(for queue full), while cc_send_sync_request() checks for real errors by
comparing with -EAGAIN. Hence -ENOSPC is always considered a real
error, and the code never retries the operation.
Fix this by just removing the check, as cc_queues_status() never returns
any other error value than -ENOSPC.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reading the debugfs files under /sys/kernel/debug/ccree/ can be done by
the user at any time. On R-Car SoCs, the CCREE device is power-managed
using a moduile clock, and if this clock is not running, bogus register
values may be read.
Fix this by filling in the debugfs_regset32.dev field, so debugfs will
make sure the device is resumed while its registers are being read.
This fixes the bogus values (0x00000260) in the register dumps on R-Car
H3 ES1.0:
-e6601000.crypto/regs:HOST_IRR = 0x00000260
-e6601000.crypto/regs:HOST_POWER_DOWN_EN = 0x00000260
+e6601000.crypto/regs:HOST_IRR = 0x00000038
+e6601000.crypto/regs:HOST_POWER_DOWN_EN = 0x00000038
e6601000.crypto/regs:AXIM_MON_ERR = 0x00000000
e6601000.crypto/regs:DSCRPTR_QUEUE_CONTENT = 0x000002aa
-e6601000.crypto/regs:HOST_IMR = 0x00000260
+e6601000.crypto/regs:HOST_IMR = 0x017ffeff
e6601000.crypto/regs:AXIM_CFG = 0x001f0007
e6601000.crypto/regs:AXIM_CACHE_PARAMS = 0x00000000
-e6601000.crypto/regs:GPR_HOST = 0x00000260
+e6601000.crypto/regs:GPR_HOST = 0x017ffeff
e6601000.crypto/regs:AXIM_MON_COMP = 0x00000000
-e6601000.crypto/version:SIGNATURE = 0x00000260
-e6601000.crypto/version:VERSION = 0x00000260
+e6601000.crypto/version:SIGNATURE = 0xdcc63000
+e6601000.crypto/version:VERSION = 0xaf400001
Note that this behavior is system-dependent, and the issue does not show
up on all R-Car Gen3 SoCs and boards. Even when the device is
suspended, the module clock may be left enabled, if configured by the
firmware for Secure Mode, or when controlled by the Real-Time Core.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Gilad Ben-Yossef <gilad@benyossef.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Hardware registers of devices under control of power management cannot
be accessed at all times. If such a device is suspended, register
accesses may lead to undefined behavior, like reading bogus values, or
causing exceptions or system lock-ups.
Extend struct debugfs_regset32 with an optional field to let device
drivers specify the device the registers in the set belong to. This
allows debugfs_show_regset32() to make sure the device is resumed while
its registers are being read.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Register qm to uacce framework for user crypto driver
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove the module_param uacce_mode, which is not used currently.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Uacce (Unified/User-space-access-intended Accelerator Framework) targets to
provide Shared Virtual Addressing (SVA) between accelerators and processes.
So accelerator can access any data structure of the main cpu.
This differs from the data sharing between cpu and io device, which share
only data content rather than address.
Since unified address, hardware and user space of process can share the
same virtual address in the communication.
Uacce create a chrdev for every registration, the queue is allocated to
the process when the chrdev is opened. Then the process can access the
hardware resource by interact with the queue file. By mmap the queue
file space to user space, the process can directly put requests to the
hardware without syscall to the kernel space.
The IOMMU core only tracks mm<->device bonds at the moment, because it
only needs to handle IOTLB invalidation and PASID table entries. However
uacce needs a finer granularity since multiple queues from the same
device can be bound to an mm. When the mm exits, all bound queues must
be stopped so that the IOMMU can safely clear the PASID table entry and
reallocate the PASID.
An intermediate struct uacce_mm links uacce devices and queues.
Note that an mm may be bound to multiple devices but an uacce_mm
structure only ever belongs to a single device, because we don't need
anything more complex (if multiple devices are bound to one mm, then
we'll create one uacce_mm for each bond).
uacce_device --+-- uacce_mm --+-- uacce_queue
| '-- uacce_queue
|
'-- uacce_mm --+-- uacce_queue
+-- uacce_queue
'-- uacce_queue
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com>
Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Uacce (Unified/User-space-access-intended Accelerator Framework) is
a kernel module targets to provide Shared Virtual Addressing (SVA)
between the accelerator and process.
This patch add document to explain how it works.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com>
Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
According to Geert's report[0],
kernel/padata.c: warning: 'err' may be used uninitialized in this
function [-Wuninitialized]: => 539:2
Warning is seen only with older compilers on certain archs. The
runtime effect is potentially returning garbage down the stack when
padata's cpumasks are modified before any pcrypt requests have run.
Simplest fix is to initialize err to the success value.
[0] http://lkml.kernel.org/r/20200210135506.11536-1-geert@linux-m68k.org
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: bbefa1dd6a ("crypto: pcrypt - Avoid deadlock by using per-instance padata queues")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The imx-rngc driver binds to devices that are compatible to
"fsl,imx25-rngb". Grepping through the device tree sources suggests this
only exists on i.MX25. So restrict dependencies to configs that have
this SoC enabled, but allow compile testing. For the latter additional
dependencies for clk and readl/writel are necessary.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
chtls_uld_add allocates room for info->nports net_device structs
following the chtls_dev struct, presumably because it was originally
intended that the ports array would be stored there. This is suggested
by the assignment which was present in initial versions and removed by
c4e848586c ("crypto: chelsio - remove redundant assignment to
cdev->ports"):
cdev->ports = (struct net_device **)(cdev + 1);
This assignment was never used, being overwritten by lldi->ports
immediately afterwards, and I couldn't find any uses of the memory
allocated past the end of the struct.
Signed-off-by: Stephen Kitt <steve@sk2.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
QCE hangs when presented with an AES-XTS request whose length is larger
than QCE_SECTOR_SIZE (512-bytes), and is not a multiple of it. Let the
fallback cipher handle them.
Signed-off-by: Eneas U de Queiroz <cotequeiroz@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Process small blocks using the fallback cipher, as a workaround for an
observed failure (DMA-related, apparently) when computing the GCM ghash
key. This brings a speed gain as well, since it avoids the latency of
using the hardware engine to process small blocks.
Using software for all 16-byte requests would be enough to make GCM
work, but to increase performance, a larger threshold would be better.
Measuring the performance of supported ciphers with openssl speed,
software matches hardware at around 768-1024 bytes.
Considering the 256-bit ciphers, software is 2-3 times faster than qce
at 256-bytes, 30% faster at 512, and about even at 768-bytes. With
128-bit keys, the break-even point would be around 1024-bytes.
This adds the 'aes_sw_max_len' parameter, to set the largest request
length processed by the software fallback. Its default is being set to
512 bytes, a little lower than the break-even point, to balance the cost
in CPU usage.
Signed-off-by: Eneas U de Queiroz <cotequeiroz@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The qce crypto driver appends an extra entry to the dst sgl, to maintain
private state information.
When the gcm driver sends requests to the ctr skcipher, it passes the
authentication tag after the actual crypto payload, but it must not be
touched.
Commit 1336c2221bee ("crypto: qce - save a sg table slot for result
buf") limited the destination sgl to avoid overwriting the
authentication tag but it assumed the tag would be in a separate sgl
entry.
This is not always the case, so it is better to limit the length of the
destination buffer to req->cryptlen before appending the result buf.
Signed-off-by: Eneas U de Queiroz <cotequeiroz@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Function dev_err() after platform_get_irq() is redundant because
platform_get_irq() already prints an error.
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
No logs are recorded in dmesg during chcr module load, hence
adding the print and also appending -ko to driver version.
Signed-off-by: Devulapally Shiva Krishna <shiva@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When running tcrypt skcipher speed tests, logs contain things like:
testing speed of async ecb(des3_ede) (ecb(des3_ede-generic)) encryption
or:
testing speed of async ecb(aes) (ecb(aes-ce)) encryption
The algorithm implementations are sync, not async.
Fix this inaccuracy.
Fixes: 7166e589da ("crypto: tcrypt - Use skcipher")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The path with the CRYPTO_ALG_LARVAL flag has jumped to the end before
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The libkcapi test which causes kernel panic is
aead asynchronous vmsplice multiple test.
./bin/kcapi -v -d 4 -x 10 -c "ccm(aes)"
-q 4edb58e8d5eb6bc711c43a6f3693daebde2e5524f1b55297abb29f003236e43d
-t a7877c99 -n 674742abd0f5ba -k 2861fd0253705d7875c95ba8a53171b4
-a fb7bc304a3909e66e2e0c5ef952712dd884ce3e7324171369f2c5db1adc48c7d
This patch avoids dma_mapping of a zero length sg which causes the panic,
by using sg_nents_for_len which maps only upto a specific length
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The libkcapi "cbc(aes)" failed tests are
symmetric asynchronous cipher one shot multiple test,
symmetric asynchronous cipher stream multiple test,
Symmetric asynchronous cipher vmsplice multiple test
In this patch a wait_for_completion is added in the chcr_aes_encrypt function,
which completes when the response of comes from the hardware.
This adds serialization for encryption in cbc(aes) aio case.
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/crypto/ccree/cc_cipher.c: In function 'cc_setup_state_desc':
drivers/crypto/ccree/cc_cipher.c:536:15: warning:
variable 'du_size' set but not used [-Wunused-but-set-variable]
commit 5c83e8ec4d ("crypto: ccree - fix FDE descriptor sequence")
involved this unused variable, so remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove the auth tag size from cryptlen before mapping the destination
in out-of-place AEAD decryption thus resolving a crash with
extended testmgr tests.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable@vger.kernel.org # v4.19+
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add reference counter incremented for each frame enqueued in CAAM
and replace unconditional sleep in empty_caam_fq() with polling the
reference counter.
When CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y boot time on LS1043A
platform with this optimization decreases from ~1100s to ~11s.
Signed-off-by: Valentin Ciocoi Radulescu <valentin.ciocoi@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Fix an error causing no block sizes to be reported during
all AEAD registrations.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
OFB and CTR modes block sizes were wrongfully reported as
the underlying block sizes. Fix it to 1 bytes as they
turn the block ciphers into stream ciphers.
Also document why our XTS differes from the generic
implementation.
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Make sure to only add the size of the auth tag to the source mapping
for encryption if it is an in-place operation. Failing to do this
previously caused us to try and map auth size len bytes from a NULL
mapping and crashing if both the cryptlen and assoclen are zero.
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Deal gracefully with a NULL or empty scatterlist which can happen
if both cryptlen and assoclen are zero and we're doing in-place
AEAD encryption.
This fixes a crash when this causes us to try and map a NULL page,
at least with some platforms / DMA mapping configs.
Cc: stable@vger.kernel.org # v4.19+
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This comes from INRIA's HACL*/Vale. It implements the same algorithm and
implementation strategy as the code it replaces, only this code has been
formally verified, sans the base point multiplication, which uses code
similar to prior, only it uses the formally verified field arithmetic
alongside reproducable ladder generation steps. This doesn't have a
pure-bmi2 version, which means haswell no longer benefits, but the
increased (doubled) code complexity is not worth it for a single
generation of chips that's already old.
Performance-wise, this is around 1% slower on older microarchitectures,
and slightly faster on newer microarchitectures, mainly 10nm ones or
backports of 10nm to 14nm. This implementation is "everest" below:
Xeon E5-2680 v4 (Broadwell)
armfazh: 133340 cycles per call
everest: 133436 cycles per call
Xeon Gold 5120 (Sky Lake Server)
armfazh: 112636 cycles per call
everest: 113906 cycles per call
Core i5-6300U (Sky Lake Client)
armfazh: 116810 cycles per call
everest: 117916 cycles per call
Core i7-7600U (Kaby Lake)
armfazh: 119523 cycles per call
everest: 119040 cycles per call
Core i7-8750H (Coffee Lake)
armfazh: 113914 cycles per call
everest: 113650 cycles per call
Core i9-9880H (Coffee Lake Refresh)
armfazh: 112616 cycles per call
everest: 114082 cycles per call
Core i3-8121U (Cannon Lake)
armfazh: 113202 cycles per call
everest: 111382 cycles per call
Core i7-8265U (Whiskey Lake)
armfazh: 127307 cycles per call
everest: 127697 cycles per call
Core i7-8550U (Kaby Lake Refresh)
armfazh: 127522 cycles per call
everest: 127083 cycles per call
Xeon Platinum 8275CL (Cascade Lake)
armfazh: 114380 cycles per call
everest: 114656 cycles per call
Achieving these kind of results with formally verified code is quite
remarkable, especialy considering that performance is favorable for
newer chips.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We need to decrement this refcounter on these error paths.
Fixes: f7d76e05d0 ("crypto: user - fix use_after_free of struct xxx_request")
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If all possible errors occurs at the same time, the error_status will be
all 1s. The doorbell timeout error and FIFO overflow error will be print
in each cycle, which should be print just once.
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In error detect process, a lot of duplicate code can put into qm. We add
two callback(get_dev_hw_err_status and log_dev_hw_err) into struct
hisi_qm_err_ini to handle device error detect, meanwhile the qm error
detect not changed.
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Configure zip RAS error type in error handle initialization,
Where ECC 1bit is configured as CE error, others are NFE.
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>