linux_dsm_epyc7002/arch/s390/include
Heiko Carstens 4423028203 s390/spinlock: optimize spin_unlock code
Use a memory barrier + store sequence instead of a load + compare and swap
sequence to unlock a spinlock and an rw lock.
For the spinlock case this saves two memory reads and the unneeded CPU
serialization that follows the compare and swap instruction once it has
stored the new value.
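
The difference between the two unlock sequences can be sketched in portable
C11 atomics. This is only an illustration, not the s390 code touched by this
commit: the type sketch_spinlock_t, the function names and the convention
that 0 means "unlocked" are all assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 means free, any other value is an owner id. */
typedef struct {
	atomic_uint lock;
} sketch_spinlock_t;

/* Try to take the lock once; returns true on success. */
bool sketch_trylock(sketch_spinlock_t *lp, unsigned int owner)
{
	unsigned int expected = 0u;

	assert(owner != 0u);	/* 0 is reserved for "unlocked" */
	return atomic_compare_exchange_strong_explicit(
		&lp->lock, &expected, owner,
		memory_order_acquire, memory_order_relaxed);
}

/* Old style: load the lock word, then compare and swap it back to 0. */
void sketch_unlock_cas(sketch_spinlock_t *lp)
{
	unsigned int old = atomic_load_explicit(&lp->lock,
						memory_order_relaxed);

	atomic_compare_exchange_strong_explicit(
		&lp->lock, &old, 0u,
		memory_order_release, memory_order_relaxed);
}

/* New style: a memory barrier followed by a plain store of 0. */
void sketch_unlock_store(sketch_spinlock_t *lp)
{
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&lp->lock, 0u, memory_order_relaxed);
}
```

Only the lock holder ever unlocks, so the store variant does not need to read
the old value first; the barrier alone orders the critical section before the
store that releases the lock.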

The kernel size (performance_defconfig) gets reduced by ~14k.

Average execution time of a tight inlined spin_unlock loop drops from
5.8ns to 0.7ns on a zEC12 machine.

An artificial stress test, in which several counters are protected by a
single spinlock and are only incremented while holding it, shows a ~30%
improvement on a 4 CPU machine.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-09 08:53:30 +02:00
asm s390/spinlock: optimize spin_unlock code 2014-09-09 08:53:30 +02:00
uapi/asm s390: wire up memfd_create syscall 2014-08-12 13:00:08 +02:00