linux_dsm_epyc7002

AuxXxilium/linux_dsm_epyc7002

Fork 0

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-19 00:36:44 +07:00

Commit Graph

Author	SHA1	Message	Date
Luc Van Oostenryck	50a4b05609	arm64: add missing conversion to __wsum in ip_fast_csum() ARM64 implementation of ip_fast_csum() do most of the work in 128 or 64 bit and call csum_fold() to finalize. csum_fold() itself take a __wsum argument, to insure that this value is always a 32bit native-order value. Fix this by adding the sadly needed '__force' to cast the native 'sum' to the type '__wsum'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-29 16:32:43 +01:00
Robin Murphy	0e455d8e80	arm64: Implement optimised IP checksum helpers AArch64 is capable of 128-bit memory accesses without alignment restrictions, which makes it both possible and highly practical to slurp up a typical 20-byte IP header in just 2 loads. Implement our own version of ip_fast_checksum() to take advantage of that, resulting in considerably fewer instructions and memory accesses than the generic version. We can also get more optimal code generation for csum_fold() by defining it a slightly different way round from the generic version, so throw that into the mix too. Suggested-by: Luke Starrett <luke.starrett@broadcom.com> Acked-by: Luke Starrett <luke.starrett@broadcom.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2016-06-21 15:09:11 +01:00

Author

SHA1

Message

Date

Luc Van Oostenryck

50a4b05609

arm64: add missing conversion to __wsum in ip_fast_csum()

ARM64 implementation of ip_fast_csum() do most of the work
in 128 or 64 bit and call csum_fold() to finalize. csum_fold()
itself take a __wsum argument, to insure that this value is
always a 32bit native-order value.

Fix this by adding the sadly needed '__force' to cast the native
'sum' to the type '__wsum'.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

2017-06-29 16:32:43 +01:00

Robin Murphy

0e455d8e80

arm64: Implement optimised IP checksum helpers

AArch64 is capable of 128-bit memory accesses without alignment
restrictions, which makes it both possible and highly practical to slurp
up a typical 20-byte IP header in just 2 loads. Implement our own
version of ip_fast_checksum() to take advantage of that, resulting in
considerably fewer instructions and memory accesses than the generic
version. We can also get more optimal code generation for csum_fold() by
defining it a slightly different way round from the generic version, so
throw that into the mix too.

Suggested-by: Luke Starrett <luke.starrett@broadcom.com>
Acked-by: Luke Starrett <luke.starrett@broadcom.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

2016-06-21 15:09:11 +01:00

2 Commits