linux_dsm_epyc7002/arch/x86/lib
Linus Torvalds bafaecd11d x86-64: support native xadd rwsem implementation
This one is much faster than the spinlock based fallback rwsem code,
with certain artifical benchmarks having shown 300%+ improvement on
threaded page faults etc.

Again, note the 32767-thread limit here. So this really does need that
whole "make rwsem_count_t be 64-bit and fix the BIAS values to match"
extension on top of it, but that is conceptually a totally independent
issue.

NOT TESTED! The original patch that this all was based on were tested by
KAMEZAWA Hiroyuki, but maybe I screwed up something when I created the
cleaned-up series, so caveat emptor..

Also note that it _may_ be a good idea to mark some more registers
clobbered on x86-64 in the inline asms instead of saving/restoring them.
They are inline functions, but they are only used in places where there
are not a lot of live registers _anyway_, so doing for example the
clobbers of %r8-%r11 in the asm wouldn't make the fast-path code any
worse, and would make the slow-path code smaller.

(Not that the slow-path really matters to that degree. Saving a few
unnecessary registers is the _least_ of our problems when we hit the slow
path. The instruction/cycle counting really only matters in the fast
path).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <alpine.LFD.2.00.1001121810410.17145@localhost.localdomain>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-01-13 22:39:50 -08:00
..
.gitignore x86: Gitignore: arch/x86/lib/inat-tables.c 2009-11-04 13:11:28 +01:00
atomic64_32.c x86: atomic64: Inline atomic64_read() again 2009-07-04 11:45:00 +02:00
checksum_32.S i386: move lib 2007-10-11 11:16:33 +02:00
clear_page_64.S x86: Fix symbol annotation for arch/x86/lib/clear_page_64.S::clear_page_c 2009-06-30 23:43:15 +02:00
cmpxchg8b_emu.S x86: Provide an alternative() based cmpxchg64() 2009-09-30 22:55:59 +02:00
copy_page_64.S x86_64: move lib 2007-10-11 11:17:08 +02:00
copy_user_64.S x86-64: Modify copy_user_generic() alternatives mechanism 2009-12-30 11:57:31 +01:00
copy_user_nocache_64.S x86: wrong register was used in align macro 2008-07-30 10:10:39 -07:00
csum-copy_64.S x86_64: move lib 2007-10-11 11:17:08 +02:00
csum-partial_64.c x86: fix csum_partial() export 2008-05-13 19:38:47 +02:00
csum-wrappers_64.c x86: clean up csum-wrappers_64.c some more 2008-02-19 16:18:32 +01:00
delay.c x86, delay: tsc based udelay should have rdtsc_barrier 2009-06-25 16:47:40 -07:00
getuser.S x86: use _types.h headers in asm where available 2009-02-13 11:35:01 -08:00
inat.c x86: AVX instruction set decoder support 2009-10-29 08:47:46 +01:00
insn.c x86: AVX instruction set decoder support 2009-10-29 08:47:46 +01:00
io_64.c x86: coding style fixes in arch/x86/lib/io_64.c 2008-02-19 16:18:32 +01:00
iomap_copy_64.S x86_64: move lib 2007-10-11 11:17:08 +02:00
Makefile x86-64: support native xadd rwsem implementation 2010-01-13 22:39:50 -08:00
memcpy_32.c x86: coding style fixes to arch/x86/lib/memcpy_32.c 2008-04-17 17:40:49 +02:00
memcpy_64.S x86-64: Modify memcpy()/memset() alternatives mechanism 2009-12-30 11:57:32 +01:00
memmove_64.c x86: coding style fixes to arch/x86/lib/memmove_64.c 2008-04-17 17:40:48 +02:00
memset_64.S x86-64: Modify memcpy()/memset() alternatives mechanism 2009-12-30 11:57:32 +01:00
mmx_32.c x86: clean up mmx_32.c 2008-04-17 17:40:47 +02:00
msr-reg-export.c x86, msr: change msr-reg.o to obj-y, and export its symbols 2009-09-04 10:00:09 -07:00
msr-reg.S x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too 2009-09-03 21:26:34 +02:00
msr-smp.c x86, msr: msrs_alloc/free for CONFIG_SMP=n 2009-12-16 15:36:32 -08:00
msr.c x86, msr: msrs_alloc/free for CONFIG_SMP=n 2009-12-16 15:36:32 -08:00
putuser.S x86: merge putuser asm functions. 2008-07-09 09:14:13 +02:00
rwlock_64.S x86: rename .i assembler includes to .h 2007-10-17 20:16:29 +02:00
rwsem_64.S x86-64: support native xadd rwsem implementation 2010-01-13 22:39:50 -08:00
semaphore_32.S Generic semaphore implementation 2008-04-17 10:42:34 -04:00
string_32.c x86: coding style fixes to arch/x86/lib/string_32.c 2008-08-15 16:53:25 +02:00
strstr_32.c x86: coding style fixes to arch/x86/lib/strstr_32.c 2008-08-15 16:53:24 +02:00
thunk_32.S ftrace: trace irq disabled critical timings 2008-05-23 20:32:46 +02:00
thunk_64.S ftrace: trace irq disabled critical timings 2008-05-23 20:32:46 +02:00
usercopy_32.c x86: Turn the copy_from_user check into an (optional) compile time warning 2009-10-01 11:31:04 +02:00
usercopy_64.c x86, 64-bit: Clean up user address masking 2009-06-20 15:40:00 -07:00
x86-opcode-map.txt x86: Add Intel FMA instructions to x86 opcode map 2009-10-29 08:47:47 +01:00