Turning on crypto self-tests on a POWER8 shows:

    alg: hash: Test 1 failed for crc32c-vpmsum
    00000000: ff ff ff ff

Comparing the code with the Intel CRC32c implementation on which
ours is based shows that we are doing an init with 0, not ~0
as CRC32c requires.
This probably wasn't caught because btrfs does its own weird
open-coded initialisation.
Initialise our internal context to ~0 on init.
This makes the self-tests pass, and btrfs continues to work.
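
To illustrate why the seed matters, here is a minimal userspace sketch of a
bitwise CRC32c. It is not the vpmsum code. The names crc32c_update,
crc32c_digest and crc32c_check are made up for this example. It assumes the
final digest is the bitwise inverse of the running CRC, and that the failing
vector could be an empty input (both are assumptions of the sketch, not taken
from the self-test itself). Under those assumptions, a seed of 0 yields
ff ff ff ff for an empty message, while a seed of ~0 yields 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reflected Castagnoli polynomial used by CRC32c. */
#define CRC32C_POLY_LE 0x82F63B78u

/* Hypothetical helper: fold len bytes into a running CRC, one bit at a time. */
static uint32_t crc32c_update(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (CRC32C_POLY_LE & -(crc & 1u));
	}
	return crc;
}

/* Hypothetical helper: seed the CRC, process the data, invert at the end. */
static uint32_t crc32c_digest(uint32_t seed, const unsigned char *p, size_t len)
{
	return ~crc32c_update(seed, p, len);
}

/* Contrast the wrong seed (0) with the required seed (~0). */
void crc32c_check(void)
{
	const unsigned char *empty = (const unsigned char *)"";
	const unsigned char *check = (const unsigned char *)"123456789";

	/* Seed 0: an empty message digests to all ones. */
	assert(crc32c_digest(0u, empty, 0) == 0xFFFFFFFFu);
	/* Seed ~0: an empty message digests to zero. */
	assert(crc32c_digest(~0u, empty, 0) == 0u);
	/* Seed ~0 reproduces the standard CRC32c check value. */
	assert(crc32c_digest(~0u, check, 9) == 0xE3069283u);
}
```

Calling crc32c_check() from a test harness exercises all three cases. A
caller that open-codes its own seed never hits the first one, which would be
consistent with btrfs not noticing the bug.
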
Fixes: 6dd7a82cc5 ("crypto: powerpc - Add POWER8 optimised crc32c")
Cc: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch utilises the GENERIC_CPU_AUTOPROBE infrastructure
to automatically load the crc32c-vpmsum module if the CPU supports
it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Use the vector polynomial multiply-sum instructions in POWER8 to
speed up crc32c.
This is just over 41x faster than the slice-by-8 method that it
replaces. Measurements on a 4.1 GHz POWER8 show it sustaining
52 GiB/sec.
A simple btrfs write performance test:

    dd if=/dev/zero of=/mnt/tmpfile bs=1M count=4096
    sync

is over 3.7x faster.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>