linux_dsm_epyc7002/arch
Nicholas Piggin d11914b21c powerpc/64: Implement clear_bit_unlock_is_negative_byte()
Commit b91e1302ad ("mm: optimize PageWaiters bit use for
unlock_page()") added a special bitop function to speed up
unlock_page(). Implement this for 64-bit powerpc.

This improves the unlock_page() core code from this:

	li	9,1
	lwsync
1:	ldarx	10,0,3,0
	andc	10,10,9
	stdcx.	10,0,3
	bne-	1b
	ori	2,2,0
	ld	9,0(3)
	andi.	10,9,0x80
	beqlr
	li	4,0
	b	wake_up_page_bit

To this:

	li	10,1
	lwsync
1:	ldarx	9,0,3,0
	andc	9,9,10
	stdcx.	9,0,3
	bne-	1b
	andi.	10,9,0x80
	beqlr
	li	4,0
	b	wake_up_page_bit
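
The effect of the new sequence can be sketched in portable C11. This is a minimal illustration, not the powerpc implementation from this commit: the `_sketch` names and the `SKETCH_PG_*` constants are hypothetical, and the bit positions are assumptions read off the masks in the assembly above (`li 10,1` for the lock bit, `0x80` for the waiters bit). A single atomic read-modify-write with release ordering clears the lock bit and returns the word it observed, so the waiters bit can be tested without the separate load seen in the first listing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed bit positions, taken from the masks in the assembly above. */
#define SKETCH_PG_LOCKED  0	/* mask 0x01, the "li 10,1" operand */
#define SKETCH_PG_WAITERS 7	/* mask 0x80, the "andi." operand   */

/*
 * Clear bit 'nr' with release semantics and report whether bit 7
 * (the sign bit of the low byte) was set in the same word.
 */
static bool clear_bit_unlock_is_negative_byte_sketch(unsigned int nr,
						     _Atomic unsigned long *word)
{
	/* Sketch-only sanity check: the bit being cleared must sit below bit 7. */
	assert(nr < SKETCH_PG_WAITERS);

	/* One atomic AND; the release ordering stands in for lwsync. */
	unsigned long old = atomic_fetch_and_explicit(word, ~(1UL << nr),
						      memory_order_release);

	/* Test the value already in hand instead of reloading the word. */
	return (old & (1UL << SKETCH_PG_WAITERS)) != 0;
}

/* Hypothetical caller: returns true when waiters would need a wakeup. */
static bool unlock_page_sketch(_Atomic unsigned long *flags)
{
	return clear_bit_unlock_is_negative_byte_sketch(SKETCH_PG_LOCKED, flags);
}
```

In this sketch a caller would invoke the wakeup path only when `unlock_page_sketch()` returns true, mirroring the `beqlr` early return in the listings.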

In a test of elapsed time for dd writing into 16GB of already-dirty
pagecache on a POWER8 with 4kB pages, which has one unlock_page() call
per 4kB written, this patch reduced overhead by 1.1%:

    N           Min           Max        Median           Avg        Stddev
x  19         2.578         2.619         2.594         2.595         0.011
+  19         2.552         2.592         2.564         2.565         0.008
Difference at 95.0% confidence
	-0.030  +/- 0.006
	-1.142% +/- 0.243%

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Made 64-bit only until I can test it properly on 32-bit]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-18 14:40:01 +11:00
alpha clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
arc ARC: Revert "ARC: mm: IOC: Don't enable IOC by default" 2017-01-18 19:21:06 -08:00
arm KVM fixes for v4.10-rc5 2017-01-20 14:19:34 -08:00
arm64 KVM fixes for v4.10-rc5 2017-01-20 14:19:34 -08:00
avr32 clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
blackfin Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-25 14:30:04 -08:00
c6x clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
cris Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
frv Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
h8300 Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
hexagon clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
ia64 clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
m32r Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
m68k m68k/mac: Replace via-maciisi driver with via-cuda driver 2017-02-07 16:56:25 +11:00
metag Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-25 14:05:56 -08:00
microblaze clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
mips KVM fixes for v4.10-rc3 2017-01-06 15:27:17 -08:00
mn10300 clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
nios2 clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
openrisc openrisc: Add _text symbol to fix ksym build error 2017-01-02 10:35:11 +09:00
parisc parisc: Add line-break when printing segfault info 2017-01-02 18:07:25 +01:00
powerpc powerpc/64: Implement clear_bit_unlock_is_negative_byte() 2017-02-18 14:40:01 +11:00
s390 KVM fixes for v4.10-rc5 2017-01-20 14:19:34 -08:00
score Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
sh Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
sparc clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
tile Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
um clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
unicore32 clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
x86 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-01-22 12:47:48 -08:00
xtensa Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-25 14:30:04 -08:00
.gitignore
Kconfig powerpc: ima: get the kexec buffer passed by the previous kernel 2016-12-20 09:48:40 -08:00