mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git (synced 2024-12-04 22:26:40 +07:00)
223e23e8aa
We want to avoid lots of different copy_page implementations, settling
for something that is "good enough" everywhere and hopefully easy to
understand and maintain whilst we're at it.

This patch reworks our copy_page implementation based on discussions
with Cavium on the list and benchmarking on Cortex-A processors so that:

  - The loop is unrolled to copy 128 bytes per iteration

  - The reads are offset so that we read from the next 128-byte block
    in the same iteration that we store the previous block

  - Explicit prefetch instructions are removed for now, since they hurt
    performance on CPUs with hardware prefetching

  - The loop exit condition is calculated at the start of the loop

Signed-off-by: Will Deacon <will.deacon@arm.com>
Tested-by: Andrew Pinski <apinski@cavium.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
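The loop structure described in the commit message can be illustrated in portable C. This is only a rough sketch of the idea, not the assembly in copy_page.S: the name `copy_page_sketch`, the 4096-byte `SKETCH_PAGE_SIZE`, and the use of a 16-word temporary array in place of registers are all assumptions made for illustration. It shows the 128-byte unrolled step, the offset reads (the next block is loaded in the same iteration that the previous block is stored), and the exit condition being computed at the top of the loop body.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u                      /* assumed page size in bytes */
#define BLOCK_BYTES      128u                       /* bytes copied per iteration */
#define BLOCK_WORDS      (BLOCK_BYTES / sizeof(uint64_t))

static_assert(SKETCH_PAGE_SIZE % BLOCK_BYTES == 0 &&
              SKETCH_PAGE_SIZE >= 2 * BLOCK_BYTES,
              "sketch needs a whole number of blocks, at least two");

/*
 * Hypothetical illustration only; the kernel routine is hand-written
 * assembly. dst and src must be 8-byte aligned, must not overlap, and
 * must each span SKETCH_PAGE_SIZE bytes.
 */
static void copy_page_sketch(void *dst, const void *src)
{
    uint64_t *d = dst;
    const uint64_t *s = src;
    /* Start of the final 128-byte block in the source page. */
    const uint64_t *last = s + (SKETCH_PAGE_SIZE - BLOCK_BYTES) / sizeof(uint64_t);
    uint64_t blk[BLOCK_WORDS];   /* stands in for the registers holding one block */
    int done;

    /* Prologue: read the first block before entering the loop. */
    memcpy(blk, s, sizeof blk);

    do {
        s += BLOCK_WORDS;
        /* Exit condition is calculated at the start of the iteration. */
        done = (s == last);

        /* Store the previous block while reading the next one. */
        for (size_t i = 0; i < BLOCK_WORDS; i++) {
            d[i] = blk[i];
            blk[i] = s[i];
        }
        d += BLOCK_WORDS;
    } while (!done);

    /* Epilogue: store the final block that was read in the last iteration. */
    memcpy(d, blk, sizeof blk);
}
```

A compiler is free to reorder or vectorise the inner loop, so this sketch conveys the structure of the loop rather than its performance characteristics. No prefetch hints appear, matching the commit's decision to drop explicit prefetch instructions.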
atomic_ll_sc.c
bitops.S
clear_page.S
clear_user.S
copy_from_user.S
copy_in_user.S
copy_page.S
copy_template.S
copy_to_user.S
delay.c
Makefile
memchr.S
memcmp.S
memcpy.S
memmove.S
memset.S
strchr.S
strcmp.S
strlen.S
strncmp.S
strnlen.S
strrchr.S