linux_dsm_epyc7002/arch/x86/lib
Alexey Dobriyan 890890cb8e x86/i386: Use less assembly in strlen(), speed things up a bit
Current i386 strlen() hardcodes NOT/DEC sequence. DEC is
mentioned to be suboptimal on Core2. So, put only REPNE SCASB
sequence in assembly, compiler can do the rest.

The difference in generated code is like below (MCORE2=y):

	<strlen>:
		push   %edi
		mov    $0xffffffff,%ecx
		mov    %eax,%edi
		xor    %eax,%eax
		repnz scas %es:(%edi),%al
		not    %ecx

	-	dec    %ecx
	-	mov    %ecx,%eax
	+	lea    -0x1(%ecx),%eax

		pop    %edi
		ret

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Beulich <JBeulich@suse.com>
Link: http://lkml.kernel.org/r/20111211181319.GA17097@p183.telecom.by
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-12 18:33:42 +01:00
..
.gitignore x86: Gitignore: arch/x86/lib/inat-tables.c 2009-11-04 13:11:28 +01:00
atomic64_32.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
atomic64_386_32.S x86: Use {push,pop}_cfi in more places 2011-02-28 18:06:22 +01:00
atomic64_cx8_32.S x86: Use {push,pop}_cfi in more places 2011-02-28 18:06:22 +01:00
cache-smp.c x86, lib: Add wbinvd smp helpers 2010-01-22 16:05:42 -08:00
checksum_32.S x86: Use {push,pop}_cfi in more places 2011-02-28 18:06:22 +01:00
clear_page_64.S x86, mem: clear_page_64.S: Support clear_page() with enhanced REP MOVSB/STOSB 2011-05-17 15:40:27 -07:00
cmpxchg8b_emu.S x86: Provide an alternative() based cmpxchg64() 2009-09-30 22:55:59 +02:00
cmpxchg16b_emu.S percpu: Omit segment prefix in the UP case for cmpxchg_double 2011-03-27 19:25:36 -07:00
cmpxchg.c x86, asm: Merge cmpxchg_486_u64() and cmpxchg8b_emu() 2010-07-28 17:05:11 -07:00
copy_page_64.S x86: Make alternative instruction pointers relative 2011-07-13 11:22:56 -07:00
copy_user_64.S x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit 2011-05-18 12:49:00 +02:00
copy_user_nocache_64.S x86: wrong register was used in align macro 2008-07-30 10:10:39 -07:00
csum-copy_64.S x86: Clean up csum-copy_64.S a bit 2011-03-18 10:44:26 +01:00
csum-partial_64.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
csum-wrappers_64.c
delay.c x86: udelay: Use this_cpu_read to avoid address calculation 2011-01-04 06:08:55 +01:00
getuser.S x86: use _types.h headers in asm where available 2009-02-13 11:35:01 -08:00
inat.c x86: AVX instruction set decoder support 2009-10-29 08:47:46 +01:00
insn.c x86: Fix insn decoder for longer instruction 2011-10-10 09:05:51 +02:00
iomap_copy_64.S
Makefile Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-07-22 17:02:24 -07:00
memcpy_32.c x86, mem: Optimize memmove for small size and unaligned cases 2010-09-24 18:57:11 -07:00
memcpy_64.S Merge branches 'x86-apic-for-linus', 'x86-asm-for-linus' and 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-05-19 17:49:35 -07:00
memmove_64.S x86: Make alternative instruction pointers relative 2011-07-13 11:22:56 -07:00
memset_64.S x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB 2011-05-17 15:40:31 -07:00
mmx_32.c x86: clean up mmx_32.c 2008-04-17 17:40:47 +02:00
msr-reg-export.c x86, msr: change msr-reg.o to obj-y, and export its symbols 2009-09-04 10:00:09 -07:00
msr-reg.S x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too 2009-09-03 21:26:34 +02:00
msr-smp.c x86, msr: msrs_alloc/free for CONFIG_SMP=n 2009-12-16 15:36:32 -08:00
msr.c x86, msr: msrs_alloc/free for CONFIG_SMP=n 2009-12-16 15:36:32 -08:00
putuser.S x86: merge putuser asm functions. 2008-07-09 09:14:13 +02:00
rwlock.S x86: Fix write lock scalability 64-bit issue 2011-07-21 09:03:36 +02:00
rwsem.S x86: Unify rwsem assembly implementation 2011-07-21 09:03:32 +02:00
string_32.c x86/i386: Use less assembly in strlen(), speed things up a bit 2011-12-12 18:33:42 +01:00
strstr_32.c x86: coding style fixes to arch/x86/lib/strstr_32.c 2008-08-15 16:53:24 +02:00
thunk_32.S x86: Remove unused bits from lib/thunk_*.S 2011-02-28 18:06:22 +01:00
thunk_64.S x86: Fix write lock scalability 64-bit issue 2011-07-21 09:03:36 +02:00
usercopy_32.c x86: Turn the copy_from_user check into an (optional) compile time warning 2009-10-01 11:31:04 +02:00
usercopy_64.c x86, 64-bit: Clean up user address masking 2009-06-20 15:40:00 -07:00
usercopy.c x86, perf: Make copy_from_user_nmi() a library function 2011-07-21 20:41:57 +02:00
x86-opcode-map.txt x86: Add Intel FMA instructions to x86 opcode map 2009-10-29 08:47:47 +01:00