linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-26 15:55:12 +07:00


Nicholas Piggin aa65ff6b18 powerpc/64s: Implement queued spinlocks and rwlocks

These have shown significantly improved performance and fairness when spinlock contention is moderate to high on very large systems.

With this series including subsequent patches, on a 16 socket 1536 thread POWER9, a stress test such as same-file open/close from all CPUs gets big speedups, 11620op/s aggregate with simple spinlocks vs 384158op/s (33x faster), where the difference in throughput between the fastest and slowest thread goes from 7x to 1.4x.

Thanks to the fast path being identical in terms of atomics and barriers (after a subsequent optimisation patch), single threaded performance is not changed (no measurable difference).

On smaller systems, performance and fairness seems to be generally improved. Using dbench on tmpfs as a test (that starts to run into kernel spinlock contention), a 2-socket OpenPOWER POWER9 system was tested with bare metal and KVM guest configurations. Results can be found here:

https://github.com/linuxppc/issues/issues/305#issuecomment-663487453

Observations are:

- Queued spinlocks are equal when contention is insignificant, as expected and as measured with microbenchmarks.

- When there is contention, on bare metal queued spinlocks have better throughput and max latency at all points.

- When virtualised, queued spinlocks are slightly worse approaching peak throughput, but significantly better throughput and max latency at all points beyond peak, until queued spinlock maximum latency rises when clients are 2x vCPUs.

The regressions haven't been analysed very well yet, there are a lot of things that can be tuned, particularly the paravirtualised locking, but the numbers already look like a good net win even on relatively small systems.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200724131423.1362108-4-npiggin@gmail.com

2020-07-27 00:01:23 +10:00
acpi	Merge branch 'acpica'	2020-06-10 17:27:28 +02:00
asm-generic	powerpc/64s: Implement queued spinlocks and rwlocks	2020-07-27 00:01:23 +10:00
clocksource
crypto
drm	drm/edid: Replace zero-length array with flexible-array	2020-06-15 23:08:31 -05:00
dt-bindings	- qcom :	2020-06-11 12:42:14 -07:00
keys	RxRPC: Replace zero-length array with flexible-array	2020-06-15 23:08:32 -05:00
kunit
kvm
linux	powerpc/64s: Remove PROT_SAO support	2020-07-22 00:01:25 +10:00
math-emu
media	media updates for v5.8-rc1	2020-06-13 13:09:38 -07:00
misc	ocxl: control via sysfs whether the FPGA is reloaded on a link reset	2020-07-15 11:07:19 +10:00
net	netfilter: flowtable: Make nf_flow_table_offload_add/del_cb inline	2020-06-15 18:06:52 -07:00
pcmcia
ras
rdma	dynamic_debug: add an option to enable dynamic debug for modules only	2020-06-08 11:05:56 -07:00
scsi	SCSI misc on 20200605	2020-06-05 15:11:50 -07:00
soc	pci-v5.8-changes	2020-06-06 11:01:58 -07:00
sound	ASoC: Updates for v5.8	2020-06-01 20:26:07 +02:00
target	scsi: target: Rename target_setup_cmd_from_cdb() to target_cmd_parse_cdb()	2020-06-09 21:57:26 -04:00
trace	powerpc/64s: Remove PROT_SAO support	2020-07-22 00:01:25 +10:00
uapi	libnvdimm for 5.8-rc2	2020-06-20 13:13:21 -07:00
vdso
video
xen	xen: Move xen_setup_callback_vector() definition to include/xen/hvm.h	2020-06-11 15:15:19 +02:00