linux_dsm_epyc7002/arch/arm64/include/asm
Waiman Long 46ad0840b1 locking/rwsem: Remove arch specific rwsem files
As the generic rwsem-xadd code is using the appropriate acquire and
release versions of the atomic operations, the arch specific rwsem.h
files will not be that much faster than the generic code as long as the
atomic functions are properly implemented. So we can remove those arch
specific rwsem.h and stop building asm/rwsem.h to reduce maintenance
effort.

Currently, only x86, alpha and ia64 have implemented architecture
specific fast paths. I don't have access to alpha and ia64 systems for
testing, but they are legacy systems that are not likely to be updated
to the latest kernel anyway.

By using a rwsem microbenchmark, the total locking rates on a 4-socket
56-core 112-thread x86-64 system before and after the patch were as
follows (mixed means equal # of read and write locks):

                      Before Patch              After Patch
   # of Threads  wlock   rlock   mixed     wlock   rlock   mixed
   ------------  -----   -----   -----     -----   -----   -----
        1        29,201  30,143  29,458    28,615  30,172  29,201
        2         6,807  13,299   1,171     7,725  15,025   1,804
        4         6,504  12,755   1,520     7,127  14,286   1,345
        8         6,762  13,412     764     6,826  13,652     726
       16         6,693  15,408     662     6,599  15,938     626
       32         6,145  15,286     496     5,549  15,487     511
       64         5,812  15,495      60     5,858  15,572      60

There were some run-to-run variations for the multi-thread tests. For
x86-64, using the generic C code fast path seems to be a little bit
faster than the assembly version with low lock contention.  Looking at
the assembly version of the fast paths, there are assembly to/from C
code wrappers that save and restore all the callee-clobbered registers
(7 registers on x86-64). The assembly generated from the generic C
code doesn't need to do that. That may explain the slight performance
gain here.

The generic asm rwsem.h can also be merged into kernel/locking/rwsem.h
with no code change as no other code other than those under
kernel/locking needs to access the internal rwsem macros and functions.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-um@lists.infradead.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: nios2-dev@lists.rocketboards.org
Cc: openrisc@lists.librecores.org
Cc: uclinux-h8-devel@lists.sourceforge.jp
Link: https://lkml.kernel.org/r/20190322143008.21313-2-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-03 14:50:50 +02:00
..
xen arm64/xen: fix xen-swiotlb cache flushing 2019-01-23 22:14:56 +01:00
acenv.h
acpi.h arm64: KVM/mm: Move SEA handling behind a single 'claim' interface 2019-02-07 23:10:45 +01:00
alternative.h arm64: alternative: Apply alternatives early in boot process 2019-02-06 10:05:20 +00:00
arch_gicv3.h arm64: gic-v3: Implement arch support for priority masking 2019-02-06 10:05:21 +00:00
arch_timer.h
arm_dsu_pmu.h
arm-cci.h
asm-bug.h
asm-offsets.h
asm-prototypes.h arm64: asm-prototypes: Fix fat-fingered typo in comment 2019-01-10 11:11:46 +00:00
asm-uaccess.h arm64: Rename get_thread_info() 2019-02-26 16:57:59 +00:00
assembler.h arm64: Add workaround for Fujitsu A64FX erratum 010001 2019-02-28 16:24:25 +00:00
atomic_ll_sc.h Merge branch 'locking/atomics' into locking/core, to pick up WIP commits 2019-02-11 14:27:05 +01:00
atomic_lse.h Merge branch 'locking/atomics' into locking/core, to pick up WIP commits 2019-02-11 14:27:05 +01:00
atomic.h arm64, locking/atomics: Use instrumented atomics 2018-11-01 11:01:40 +01:00
barrier.h arm64: Add support for SB barrier and patch in over DSB; ISB sequences 2018-12-06 16:47:04 +00:00
bitops.h
bitrev.h
boot.h
brk-imm.h kasan, arm64: add brk handler for inline instrumentation 2018-12-28 12:11:44 -08:00
bug.h
cache.h kasan, arm64: remove redundant ARCH_SLAB_MINALIGN define 2019-01-16 12:09:11 +00:00
cacheflush.h
checksum.h
clocksource.h
cmpxchg.h Merge branch 'locking/atomics' into locking/core, to pick up WIP commits 2019-02-11 14:27:05 +01:00
compat.h
cpu_ops.h
cpu.h
cpucaps.h arm64: cpufeature: Add cpufeature for IRQ priority masking 2019-02-06 10:05:17 +00:00
cpufeature.h arm64: alternative: Apply alternatives early in boot process 2019-02-06 10:05:20 +00:00
cpuidle.h
cputype.h arm64: Add MIDR encoding for HiSilicon Taishan CPUs 2019-03-19 14:55:10 +00:00
current.h
daifflags.h arm64 updates for 5.1: 2019-03-10 10:17:23 -07:00
dcc.h
debug-monitors.h
device.h arm64/xen: fix xen-swiotlb cache flushing 2019-01-23 22:14:56 +01:00
dma-mapping.h dma-mapping: add a kconfig symbol for arch_teardown_dma_ops availability 2019-02-13 19:12:50 +01:00
dmi.h
efi.h arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking 2019-02-06 10:05:19 +00:00
elf.h arm64: mm: Allow forcing all userspace addresses to 52-bit 2018-12-10 18:42:18 +00:00
esr.h arm64: add pointer authentication register bits 2018-12-13 16:42:45 +00:00
exception.h
exec.h
extable.h
fb.h
fixmap.h firmware: arm_sdei: Add ACPI GHES registration helper 2019-02-11 11:07:49 +01:00
fpsimd.h
fpsimdmacros.h
ftrace.h arm64 festive updates for 4.21 2018-12-25 17:41:56 -08:00
futex.h Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
hardirq.h arm64: Fix HCR.TGE status for NMI contexts 2019-02-06 10:05:16 +00:00
hugetlb.h arm64/mm: enable HugeTLB migration for contiguous bit HugeTLB pages 2019-03-05 21:07:15 -08:00
hw_breakpoint.h
hwcap.h
hypervisor.h
image.h arm64: add image head flag definitions 2018-12-06 14:38:51 +00:00
insn.h arm64/insn: add support for emitting ADR/ADRP instructions 2018-11-27 18:47:33 +00:00
io.h arm64: io: Hook up __io_par() for inX() ordering 2019-02-28 17:24:27 +00:00
irq_work.h
irq.h
irqflags.h arm64: irqflags: Fix clang build warnings 2019-02-12 11:33:57 +00:00
jump_label.h
kasan.h kasan: add tag related helper functions 2018-12-28 12:11:43 -08:00
Kbuild locking/rwsem: Remove arch specific rwsem files 2019-04-03 14:50:50 +02:00
kernel-pgtable.h
kexec.h arm64: kexec_file: allow for loading Image-format kernel 2018-12-06 14:38:52 +00:00
kgdb.h
kprobes.h
kvm_arm.h * ARM: selftests improvements, large PUD support for HugeTLB, 2018-12-26 11:46:28 -08:00
kvm_asm.h arm/arm64: KVM: Add ARM_EXCEPTION_IS_TRAP macro 2018-12-19 17:47:53 +00:00
kvm_coproc.h
kvm_emulate.h arm64: KVM: Describe data or unified caches as having 1 set and 1 way 2019-02-19 21:05:49 +00:00
kvm_host.h ARM: some cleanups, direct physical timer assignment, cache sanitization 2019-03-15 15:00:28 -07:00
kvm_hyp.h KVM: arm/arm64: Factor out VMID into struct kvm_vmid 2019-02-19 21:05:35 +00:00
kvm_mmio.h
kvm_mmu.h KVM: arm/arm64: vgic-its: Take the srcu lock when writing to guest memory 2019-03-19 17:56:56 +00:00
kvm_ras.h arm64: KVM/mm: Move SEA handling behind a single 'claim' interface 2019-02-07 23:10:45 +01:00
linkage.h
lse.h
memory.h arm64 updates for 5.1: 2019-03-10 10:17:23 -07:00
mmu_context.h arm64: Kconfig: Re-jig CONFIG options for 52-bit VA 2018-12-10 18:42:18 +00:00
mmu.h arm64: Remove asm/memblock.h 2019-01-21 17:31:15 +00:00
mmzone.h
module.h arm64/module: switch to ADRP/ADD sequences for PLT entries 2018-11-27 19:00:45 +00:00
neon-intrinsics.h arm64/neon: Disable -Wincompatible-pointer-types when building with Clang 2019-02-18 10:54:47 +00:00
neon.h
numa.h
page-def.h
page.h
paravirt.h
pci.h
percpu.h arm64: percpu: Fix LSE implementation of value-returning pcpu atomics 2018-12-12 14:43:35 +00:00
perf_event.h arm64: perf: Treat EXCLUDE_EL* bit definitions as unsigned 2018-12-13 15:34:44 +00:00
pgalloc.h mm: treewide: remove unused address argument from pte_alloc functions 2019-01-04 13:13:47 -08:00
pgtable-hwdef.h arm64: Add workaround for Fujitsu A64FX erratum 010001 2019-02-28 16:24:25 +00:00
pgtable-prot.h arm64: kpti: Avoid rewriting early page tables when KASLR is enabled 2019-01-10 17:49:35 +00:00
pgtable-types.h
pgtable.h * ARM: selftests improvements, large PUD support for HugeTLB, 2018-12-26 11:46:28 -08:00
pointer_auth.h arm64: ptr auth: Move per-thread keys from thread_info to thread_struct 2018-12-13 16:42:47 +00:00
preempt.h arm64: preempt: Provide our own implementation of asm/preempt.h 2018-12-07 12:35:53 +00:00
probes.h
proc-fns.h
processor.h arm64: Make PMR part of task context 2019-02-06 10:05:18 +00:00
ptdump.h arm64: dump: no need to check return value of debugfs_create functions 2019-01-31 17:38:19 +00:00
ptrace.h arm64: Make PMR part of task context 2019-02-06 10:05:18 +00:00
sdei.h
seccomp.h
sections.h
shmparam.h
signal32.h
simd.h
smp_plat.h
smp.h arm64: smp: Fix compilation error 2019-01-03 15:14:32 +00:00
sparsemem.h
spinlock_types.h
spinlock.h
stack_pointer.h
stackprotector.h arm64: enable per-task stack canaries 2018-12-12 18:45:31 +00:00
stacktrace.h
stage2_pgtable.h KVM: arm64: Clarify explanation of STAGE2_PGTABLE_LEVELS 2018-12-18 15:14:50 +00:00
stat.h
string.h
suspend.h
sync_bitops.h arm64, locking/atomics: Use instrumented atomics 2018-11-01 11:01:40 +01:00
syscall_wrapper.h
syscall.h
sysreg.h arm64: KVM: Expose sanitised cache type register to guest 2019-02-19 21:05:48 +00:00
system_misc.h KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing 2019-02-07 23:10:45 +01:00
thread_info.h arm64: Remove documentation about TIF_USEDFPU 2019-02-26 16:41:10 +00:00
timex.h
tlb.h
tlbflush.h arm64 festive updates for 4.21 2018-12-25 17:41:56 -08:00
topology.h
traps.h
uaccess.h arm64 updates for 5.1: 2019-03-10 10:17:23 -07:00
unistd32.h y2038: add 64-bit time_t syscalls to all 32-bit architectures 2019-02-07 00:13:28 +01:00
unistd.h y2038: add 64-bit time_t syscalls to all 32-bit architectures 2019-02-07 00:13:28 +01:00
uprobes.h
vdso_datapage.h
vdso.h
virt.h
vmap_stack.h
word-at-a-time.h
xor.h arm64: crypto: add NEON accelerated XOR implementation 2018-12-06 16:47:06 +00:00