mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2025-01-17 09:26:08 +07:00
02d92f7903
Before this patch the driver had one CQ database protected via one spinlock, this spinlock is meant to synchronize between CQ adding/removing and CQ IRQ interrupt handling. On a system with large number of CPUs and on a work load that requires lots of interrupts, this global spinlock becomes a very nasty hotspot and introduces a contention between the active cores, which will significantly hurt performance and becomes a bottleneck that prevents seamless cpu scaling. To solve this we simply move the CQ database and its spinlock to be per EQ (IRQ), thus per core. Tested with: system: 2 sockets, 14 cores per socket, hyperthreading, 2x14x2=56 cores netperf command: ./super_netperf 200 -P 0 -t TCP_RR -H <server> -l 30 -- -r 300,300 -o -s 1M,1M -S 1M,1M WITHOUT THIS PATCH: Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle Average: all 4.32 0.00 36.15 0.09 0.00 34.02 0.00 0.00 0.00 25.41 Samples: 2M of event 'cycles:pp', Event count (approx.): 1554616897271 Overhead Command Shared Object Symbol + 14.28% swapper [kernel.vmlinux] [k] intel_idle + 12.25% swapper [kernel.vmlinux] [k] queued_spin_lock_slowpath + 10.29% netserver [kernel.vmlinux] [k] queued_spin_lock_slowpath + 1.32% netserver [kernel.vmlinux] [k] mlx5e_xmit WITH THIS PATCH: Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle Average: all 4.27 0.00 34.31 0.01 0.00 18.71 0.00 0.00 0.00 42.69 Samples: 2M of event 'cycles:pp', Event count (approx.): 1498132937483 Overhead Command Shared Object Symbol + 23.33% swapper [kernel.vmlinux] [k] intel_idle + 1.69% netserver [kernel.vmlinux] [k] mlx5e_xmit Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com> |
||
---|---|---|
.. | ||
3com | ||
8390 | ||
adaptec | ||
adi | ||
aeroflex | ||
agere | ||
alacritech | ||
allwinner | ||
alteon | ||
altera | ||
amazon | ||
amd | ||
apm | ||
apple | ||
aquantia | ||
arc | ||
atheros | ||
aurora | ||
broadcom | ||
brocade | ||
cadence | ||
calxeda | ||
cavium | ||
chelsio | ||
cirrus | ||
cisco | ||
cortina | ||
davicom | ||
dec | ||
dlink | ||
emulex | ||
ezchip | ||
faraday | ||
freescale | ||
fujitsu | ||
hisilicon | ||
hp | ||
huawei | ||
i825xx | ||
ibm | ||
intel | ||
marvell | ||
mediatek | ||
mellanox | ||
micrel | ||
microchip | ||
moxa | ||
myricom | ||
natsemi | ||
neterion | ||
netronome | ||
nuvoton | ||
nvidia | ||
nxp | ||
oki-semi | ||
packetengines | ||
pasemi | ||
qlogic | ||
qualcomm | ||
rdc | ||
realtek | ||
renesas | ||
rocker | ||
samsung | ||
seeq | ||
sfc | ||
sgi | ||
silan | ||
sis | ||
smsc | ||
socionext | ||
stmicro | ||
sun | ||
synopsys | ||
tehuti | ||
ti | ||
tile | ||
toshiba | ||
tundra | ||
via | ||
wiznet | ||
xilinx | ||
xircom | ||
xscale | ||
dnet.c | ||
dnet.h | ||
ec_bhf.c | ||
ethoc.c | ||
fealnx.c | ||
jme.c | ||
jme.h | ||
Kconfig | ||
korina.c | ||
lantiq_etop.c | ||
Makefile | ||
netx-eth.c |