linux_dsm_epyc7002/arch/arm64/include/asm
Pingfan Liu c4885bbb3a arm64/mm: save memory access in check_and_switch_context() fast switch path
On arm64, smp_processor_id() reads a per-cpu `cpu_number` variable,
using the per-cpu offset stored in the tpidr_el1 system register. In
some cases we generate a per-cpu address with a sequence like:

  cpu_ptr = &per_cpu(ptr, smp_processor_id());

This potentially incurs a cache miss for both `cpu_number` and the
in-memory `__per_cpu_offset` array. It can be written more optimally
as:

  cpu_ptr = this_cpu_ptr(ptr);

This only needs the offset from tpidr_el1 and does not need to
load from memory.
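
The difference between the two forms can be sketched with a small userspace
model. This is not kernel code: every `model_*` name, `struct percpu_area`,
and `NR_CPUS = 4` are hypothetical stand-ins chosen for illustration. The
variable `model_tpidr` plays the role of the tpidr_el1 register and
`model_per_cpu_offset[]` plays the role of the in-memory `__per_cpu_offset`
array. Under those assumptions, both paths yield the same address, but the
slow path performs two extra memory loads.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4 /* assumed CPU count, for illustration only */

/* Hypothetical per-CPU area: one copy of each variable per CPU. */
struct percpu_area {
    int cpu_number; /* models the per-cpu `cpu_number` variable */
    long counter;   /* an arbitrary per-cpu variable we want to reach */
};

static struct percpu_area areas[NR_CPUS];

/* Models the in-memory __per_cpu_offset[] array. */
static ptrdiff_t model_per_cpu_offset[NR_CPUS];

/* Models tpidr_el1: holds the current CPU's per-cpu offset. */
static ptrdiff_t model_tpidr;

/* Fill in the offset table and each CPU's cpu_number. */
static void model_init(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        model_per_cpu_offset[cpu] = (char *)&areas[cpu] - (char *)&areas[0];
        areas[cpu].cpu_number = cpu;
    }
}

/* Pretend we are now executing on `cpu`: load its offset into the "register". */
static void model_run_on(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    model_tpidr = model_per_cpu_offset[cpu];
}

/* Models smp_processor_id(): reads cpu_number through the register offset. */
static int model_smp_processor_id(void)
{
    struct percpu_area *a =
        (struct percpu_area *)((char *)&areas[0] + model_tpidr);
    return a->cpu_number; /* memory load #1 */
}

/* Models &per_cpu(counter, smp_processor_id()). */
static long *model_per_cpu_ptr_slow(void)
{
    int cpu = model_smp_processor_id();
    ptrdiff_t off = model_per_cpu_offset[cpu]; /* memory load #2 */
    return (long *)((char *)&areas[0].counter + off);
}

/* Models this_cpu_ptr(&counter): register offset only, no table lookup. */
static long *model_this_cpu_ptr(void)
{
    return (long *)((char *)&areas[0].counter + model_tpidr);
}
```

Calling `model_init()`, then `model_run_on(cpu)` for any valid `cpu`, makes
`model_per_cpu_ptr_slow()` and `model_this_cpu_ptr()` return the same pointer.
The model only shows the addressing logic. It says nothing about real cache
behaviour, which is what the measurements below address.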

The following two test cases show a small performance improvement, measured
on a 46-CPU Qualcomm machine running a 5.8.0-rc4 kernel.

Test 1: (about 0.3% improvement)
    #cat b.sh
    make clean && make all -j138
    #perf stat --repeat 10 --null --sync sh b.sh

    - before this patch
     Performance counter stats for 'sh b.sh' (10 runs):

                298.62 +- 1.86 seconds time elapsed  ( +-  0.62% )

    - after this patch
     Performance counter stats for 'sh b.sh' (10 runs):

               297.734 +- 0.954 seconds time elapsed  ( +-  0.32% )

Test 2: (about 1.69% improvement)
     'perf stat -r 10 perf bench sched messaging'
        Then sum the total time of 'sched/messaging' manually.

    - before this patch
      total 0.707 sec for 10 times
    - after this patch
      total 0.695 sec for 10 times

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/1594389852-19949-1-git-send-email-kernelfans@gmail.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-30 12:58:40 +01:00
vdso arm64: vdso32: Include common headers in the vdso library 2020-03-21 15:24:02 +01:00
xen xen: fixes and cleanups for 5.4-rc2 2019-10-04 11:13:09 -07:00
acenv.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
acpi.h arm64: acpi: fix UBSAN warning 2020-06-10 11:41:08 +01:00
alternative.h arm64: alternative: fix build with clang integrated assembler 2020-03-20 10:01:28 +00:00
arch_gicv3.h KVM/arm fixes for 5.6, take #1 2020-02-28 11:50:06 +01:00
arch_timer.h Merge branch 'timers/vdso' into timers/core 2019-07-03 10:50:21 +02:00
archrandom.h arm64: add credited/trusted RNG support 2020-02-27 23:21:52 -05:00
arm_dsu_pmu.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
arm-cci.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
asm_pointer_auth.h arm64: simplify ptrauth initialization 2020-04-28 11:23:21 +01:00
asm-bug.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
asm-offsets.h
asm-prototypes.h
asm-uaccess.h arm64 updates for 5.5: 2019-11-25 15:39:19 -08:00
assembler.h arm64: asm: Provide a mechanism for generating ELF note for BTI 2020-05-07 17:53:20 +01:00
atomic_ll_sc.h arm64: Move the LSE gas support detection to Kconfig 2020-01-15 12:50:48 +00:00
atomic_lse.h arm64: lse: fix LSE atomics with LLVM's integrated assembler 2020-01-16 17:25:10 +00:00
atomic.h locking/atomics: Flip fallbacks and instrumentation 2020-06-11 08:03:24 +02:00
barrier.h arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros 2020-04-16 12:28:35 +01:00
bitops.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
bitrev.h
boot.h treewide: replace #include <asm/sizes.h> with #include <linux/sizes.h> 2019-05-14 19:52:52 -07:00
brk-imm.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
bug.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
cache.h arm64: Ask the compiler to __always_inline functions used by KVM at HYP 2020-02-22 11:01:47 +00:00
cacheflush.h arm64: use asm-generic/cacheflush.h 2020-06-08 11:05:57 -07:00
checksum.h arm64: csum: Optimise IPv6 header checksum 2020-03-09 18:08:25 +00:00
clocksource.h arm64: Introduce asm/vdso/clocksource.h 2020-03-21 15:23:55 +01:00
cmpxchg.h arm64: fix unreachable code issue with cmpxchg 2019-09-17 12:11:50 +01:00
compat.h compat: provide compat_ptr() on all architectures 2020-01-03 09:32:51 +01:00
compiler.h arm64/crash_core: Export KERNELPACMASK in vmcoreinfo 2020-05-11 14:29:10 +01:00
cpu_ops.h arm64: Introduce get_cpu_ops() helper function 2020-03-24 17:24:19 +00:00
cpu.h arm64/cpuinfo: Add ID_MMFR4_EL1 into the cpuinfo_arm64 context 2020-05-21 15:47:12 +01:00
cpucaps.h Merge branch 'for-next/kvm/errata' into for-next/core 2020-05-28 18:02:51 +01:00
cpufeature.h arm64/panic: Unify all three existing notifier blocks 2020-07-02 15:44:50 +01:00
cpuidle.h
cputype.h arm64: Add KRYO{3,4}XX CPU cores to spectre-v2 safe list 2020-01-17 12:46:41 +00:00
current.h
daifflags.h arm64: acpi: fix DAIF manipulation with pNMI 2020-01-22 14:41:22 +00:00
dcc.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 284 2019-06-05 17:36:37 +02:00
debug-monitors.h arm64: Call debug_traps_init() from trap_init() to help early kgdb 2020-05-18 17:51:20 +01:00
device.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
dmi.h
efi.h efi/libstub: unify EFI call wrappers for non-x86 2020-04-23 20:15:06 +02:00
elf.h Split the old READ_IMPLIES_EXEC workaround from executable PT_GNU_STACK 2020-06-05 13:45:21 -07:00
esr.h Merge branch 'for-next/bti-user' into for-next/bti 2020-05-05 15:15:58 +01:00
exception.h arm64: Basic Branch Target Identification support 2020-03-16 17:19:48 +00:00
exec.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
extable.h
fb.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
fixmap.h
fpsimd.h arm64: remove pointless __KERNEL__ guards 2019-08-05 11:06:33 +01:00
fpsimdmacros.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
ftrace.h arm64: implement ftrace with regs 2019-11-06 14:17:35 +00:00
futex.h futex: arch_futex_atomic_op_inuser() calling conventions change 2020-03-27 23:58:51 -04:00
hardirq.h arm64: Prepare arch_nmi_enter() for recursion 2020-05-19 15:51:17 +02:00
hugetlb.h arm64/hugetlb: Reserve CMA areas for gigantic pages on 16K and 64K configs 2020-07-15 13:38:03 +01:00
hw_breakpoint.h arm64: remove pointless __KERNEL__ guards 2019-08-05 11:06:33 +01:00
hwcap.h arm64: Reserve HWCAP2_MTE as (1 << 18) 2020-07-24 11:55:29 +01:00
hypervisor.h
image.h docs: arm64: convert docs to ReST and rename to .rst 2019-06-14 14:20:27 -06:00
insn.h arm64: insn: Provide a better name for aarch64_insn_is_nop() 2020-05-04 16:06:29 +01:00
io.h mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
irq_work.h
irq.h
irqflags.h arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear 2019-10-15 12:26:09 +01:00
jump_label.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
kasan.h arm64: mm: Introduce vabits_actual 2019-08-09 11:17:21 +01:00
Kbuild asm-generic: make more kernel-space headers mandatory 2020-04-02 09:35:25 -07:00
kernel-pgtable.h mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
kexec.h Revert "arm64: kexec: make dtb_mem always enabled" 2020-01-10 16:00:50 +00:00
kgdb.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
kprobes.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174 2019-05-30 11:26:41 -07:00
kvm_arm.h arm64/kvm: disable access to AMU registers from kvm guests 2020-03-06 16:02:50 +00:00
kvm_asm.h KVM: arm64: Move hyp_symbol_addr() to kvm_asm.h 2020-06-10 19:09:09 +01:00
kvm_coproc.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
kvm_emulate.h KVM/arm64 fixes for Linux 5.8, take #1 2020-06-11 14:02:32 -04:00
kvm_host.h KVM/arm64 fixes for Linux 5.8, take #1 2020-06-11 14:02:32 -04:00
kvm_hyp.h ARM: 2020-06-03 15:13:47 -07:00
kvm_mmu.h MIPS: 2020-06-12 11:05:52 -07:00
kvm_ptrauth.h KVM: arm/arm64: Context-switch ptrauth registers 2019-04-24 15:30:40 +01:00
kvm_ras.h
linkage.h arm64: Don't insert a BTI instruction at inner labels 2020-06-24 14:24:29 +01:00
lse.h arm64: lse: Fix LSE atomics with LLVM 2020-02-18 18:10:49 +00:00
memory.h arm64/panic: Unify all three existing notifier blocks 2020-07-02 15:44:50 +01:00
mman.h arm64: Basic Branch Target Identification support 2020-03-16 17:19:48 +00:00
mmu_context.h arm64/mm: save memory access in check_and_switch_context() fast switch path 2020-07-30 12:58:40 +01:00
mmu.h arm64: compat: Allow 32-bit vdso and sigpage to co-exist 2020-06-23 14:47:03 +01:00
mmzone.h
module.h arch: split MODULE_ARCH_VERMAGIC definitions out to <asm/vermagic.h> 2020-04-23 10:50:26 +09:00
neon-intrinsics.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
neon.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
numa.h
page-def.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
page.h mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS 2020-04-10 15:36:21 -07:00
paravirt.h arm64: Retrieve stolen time as paravirtualized guest 2019-10-21 19:20:31 +01:00
pci.h arm64: remove pointless __KERNEL__ guards 2019-08-05 11:06:33 +01:00
percpu.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
perf_event.h arm64: perf: Add support for ARMv8.5-PMU 64-bit counters 2020-03-17 22:50:30 +00:00
pgalloc.h arm64: add support for folded p4d page tables 2020-06-04 19:06:21 -07:00
pgtable-hwdef.h arm64/mm: Redefine CONT_{PTE, PMD}_SHIFT 2020-07-03 17:49:58 +01:00
pgtable-prot.h arm64: bti: Fix support for userspace only BTI 2020-05-12 18:45:17 +01:00
pgtable-types.h arm64: add support for folded p4d page tables 2020-06-04 19:06:21 -07:00
pgtable.h arm64: pgtable: Clear the GP bit for non-executable kernel pages 2020-06-16 17:21:07 +01:00
pointer_auth.h arm64: sync kernel APIAKey when installing 2020-04-21 15:52:56 +01:00
preempt.h sched/rt, arm64: Use CONFIG_PREEMPTION 2019-12-08 14:37:32 +01:00
probes.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174 2019-05-30 11:26:41 -07:00
proc-fns.h arm64: mm: convert cpu_do_switch_mm() to C 2020-02-27 14:30:50 +00:00
processor.h arm64 updates for 5.7: 2020-03-31 10:05:01 -07:00
ptdump.h arm64: mm: convert mm/dump.c to use walk_page_range() 2020-02-04 03:05:25 +00:00
ptrace.h KVM: arm64: Parametrize exception entry with a target EL 2020-05-28 13:16:55 +01:00
pvclock-abi.h KVM: arm64: Implement PV_TIME_FEATURES call 2019-10-21 19:20:27 +01:00
scs.h scs: Move scs_overflow_check() out of architecture code 2020-05-18 17:47:40 +01:00
sdei.h
seccomp.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
sections.h arm64: Remove __exception_text_start and __exception_text_end from asm/section.h 2020-01-08 17:30:19 +00:00
shmparam.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
signal32.h arm64: remove pointless __KERNEL__ guards 2019-08-05 11:06:33 +01:00
simd.h arm64: fpsimd: Make sure SVE setup is complete before SIMD is used 2020-01-14 17:11:21 +00:00
smp_plat.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
smp.h arm64: simplify ptrauth initialization 2020-04-28 11:23:21 +01:00
sparsemem.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
spinlock_types.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
spinlock.h arm64/spinlock: fix a -Wunused-function warning 2020-02-10 11:29:24 +00:00
stack_pointer.h
stackprotector.h arm64: initialize ptrauth keys for kernel booting task 2020-03-18 09:50:20 +00:00
stacktrace.h arm64: add loglvl to dump_backtrace() 2020-06-09 09:39:11 -07:00
stage2_pgtable.h mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
stat.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
string.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
suspend.h arm64: Preserve register x18 when CPU is suspended 2020-05-15 16:35:50 +01:00
sync_bitops.h
syscall_wrapper.h arm64: simplify syscall wrapper ifdeffery 2019-10-14 10:55:00 +01:00
syscall.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
sysreg.h arm64: s/AMEVTYPE/AMEVTYPER 2020-07-22 13:59:38 +01:00
system_misc.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
thread_info.h arm64: scs: Store absolute SCS stack pointer value in thread_info 2020-05-18 17:47:22 +01:00
timex.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
tlb.h mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
tlbflush.h arm64: tlb: Ensure we execute an ISB following walk cache invalidation 2019-08-27 17:38:26 +01:00
topology.h arm64 updates for 5.7: 2020-03-31 10:05:01 -07:00
traps.h arm64: remove __exception annotations 2019-10-28 11:22:38 +00:00
uaccess.h arm64: Add get_user() type annotation on the !access_ok() path 2020-05-22 16:59:49 +01:00
unistd32.h vfs: add faccessat2 syscall 2020-05-14 16:44:25 +02:00
unistd.h vfs: add faccessat2 syscall 2020-05-14 16:44:25 +02:00
uprobes.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
vdso.h arm64: remove pointless __KERNEL__ guards 2019-08-05 11:06:33 +01:00
vermagic.h arch: split MODULE_ARCH_VERMAGIC definitions out to <asm/vermagic.h> 2020-04-23 10:50:26 +09:00
virt.h KVM: arm64: Use cpus_have_final_cap for has_vhe() 2020-05-16 15:04:51 +01:00
vmalloc.h mm/vmalloc: Add empty <asm/vmalloc.h> headers and use them from <linux/vmalloc.h> 2019-12-10 10:12:55 +01:00
vmap_stack.h mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
word-at-a-time.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 2019-06-19 17:09:07 +02:00
xor.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00