mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2025-01-19 02:56:15 +07:00
80127a3968
Currently the percpu-rwsem switches to (global) atomic ops while a writer is waiting; which could be quite a while and slows down releasing the readers. This patch cures this problem by ordering the reader-state vs reader-count (see the comments in __percpu_down_read() and percpu_down_write()). This changes a global atomic op into a full memory barrier, which doesn't have the global cacheline contention. This also enables using the percpu-rwsem with rcu_sync disabled in order to bias the implementation differently, reducing the writer latency by adding some cost to readers. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org [ Fixed modular build. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
---|---|---|
.. | ||
lglock.c | ||
lockdep_internals.h | ||
lockdep_proc.c | ||
lockdep_states.h | ||
lockdep.c | ||
locktorture.c | ||
Makefile | ||
mcs_spinlock.h | ||
mutex-debug.c | ||
mutex-debug.h | ||
mutex.c | ||
mutex.h | ||
osq_lock.c | ||
percpu-rwsem.c | ||
qrwlock.c | ||
qspinlock_paravirt.h | ||
qspinlock_stat.h | ||
qspinlock.c | ||
rtmutex_common.h | ||
rtmutex-debug.c | ||
rtmutex-debug.h | ||
rtmutex.c | ||
rtmutex.h | ||
rwsem-spinlock.c | ||
rwsem-xadd.c | ||
rwsem.c | ||
rwsem.h | ||
semaphore.c | ||
spinlock_debug.c | ||
spinlock.c |