linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-22 19:24:27 +07:00

History

David S. Miller 20eb08b2b0 mlx5-updates-2019-04-22		2019-04-23 17:03:40 -07:00

Merge tag 'mlx5-updates-2019-04-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2019-04-22

This series includes updates to mlx5e driver RX data path and some significant XDP RX/TX improvements to overcome/mitigate HW and PCIE bottlenecks.

From Tariq:
1) Some Enhancements in rq->flags
2) Stabilize RX packet rate (on Striding RQ) with multiple outstanding UMR posts

In this patch, we add support for multiple outstanding UMR posts, to allow faster gap closure between consuming MPWQEs and reposting them back into the WQ.

Performance test:
As expected, huge improvement in large-scale (48 cores).

xdp_redirect_map, 64B UDP multi-stream.
Redirect from ConnectX-5 100Gbps to ConnectX-6 100Gbps.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.

Before: Unstable, 7 to 30 Mpps
After: Stable, at 70.5 Mpps

From Shay:
3) XDP, Inline small packets into the TX MPWQE in XDP xmit flow

Upon high packet rate with multiple CPUs TX workloads, much of the HCA's resources are spent on prefetching TX descriptors, thus affecting transmission rates. This patch comes to mitigate this problem by moving some workload to the CPU and reducing the HW data prefetch overhead for small packets (<= 256B).

When forwarding packets with XDP, a packet that is smaller than a certain size (set to ~256 bytes) would be sent inline within its WQE TX descriptor (mem-copied), when the hardware tx queue is congested beyond a pre-defined water-mark.

Performance:
Tested packet rate for UDP 64Byte multi-stream over two dual port ConnectX-5 100Gbps NICs.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

* Tested with hyper-threading disabled

XDP_TX:

|          | before | after   |       |
| 24 rings | 51Mpps | 116Mpps | +126% |
| 1 ring   | 12Mpps | 12Mpps  | same  |

XDP_REDIRECT:

** Below is the transmit rate, not the redirection rate which might be larger, and is not affected by this patch.

|          | before  | after   |      |
| 32 rings | 64Mpps  | 92Mpps  | +43% |
| 1 ring   | 6.4Mpps | 6.4Mpps | same |

As we can see, the feature significantly improves scaling, without hurting single ring performance.

From Maxim:
4) Some trivial refactoring and code improvements prior to a larger series to support AF_XDP.
====================

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
..
appletalk	ipv4: Prepare rtable for IPv6 gateway	2019-04-08 15:22:40 -07:00
arcnet
bonding	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2019-04-17 11:26:25 -07:00
caif
can
dsa	net: dsa: mv88e6xxx: Only reconfigure MAC when something changes	2019-04-19 14:08:21 -07:00
ethernet	mlx5-updates-2019-04-22	2019-04-23 17:03:40 -07:00
fddi
fjes
hamradio
hippi	net: hippi:Fix misuse of %x in rrunner.c	2019-04-21 10:37:26 -07:00
hyperv	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2019-04-05 14:14:19 -07:00
ieee802154	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2019-03-27 17:37:58 -07:00
ipvlan
netdevsim	netdevsim: move sdev-specific init/uninit code into separate functions	2019-04-12 16:49:54 -07:00
phy	net: phy: vitesse: Remove support for VSC8514.	2019-04-23 10:47:58 -07:00
plip
ppp
slip
team	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2019-04-17 11:26:25 -07:00
usb	r8152: sync sa_family with the media type of network device	2019-04-22 22:14:43 -07:00
vmxnet3
wan
wimax
wireless	wireless-drivers-next patches for 5.2	2019-04-18 11:07:55 -07:00
xen-netback	xen-netback: add reference from xenvif to backend_info to facilitate coredump analysis	2019-04-12 10:10:28 -07:00
dummy.c	net: dummy: use generic helper to report timestamping info	2019-04-12 16:26:37 -07:00
eql.c
geneve.c	ipv6: Move ipv6 stubs to a separate header file	2019-03-29 10:53:45 -07:00
gtp.c	genetlink: make policy common to family	2019-03-22 10:38:23 -04:00
ifb.c
Kconfig	net: devlink: select NET_DEVLINK from drivers	2019-03-24 14:55:31 -04:00
LICENSE.SRC
loopback.c	net: loopback: use generic helper to report timestamping info	2019-04-12 16:26:37 -07:00
macsec.c	macsec: add noinline tag to avoid a frame size warning	2019-04-01 18:52:05 -07:00
macvlan.c	macvlan: pass get_ts_info and SIOC[SG]HWTSTAMP ioctl to real device	2019-03-20 11:04:41 -07:00
macvtap.c
Makefile
mdio.c
mii.c
net_failover.c	net: remove 'fallback' argument from dev->ndo_select_queue()	2019-03-20 11:18:55 -07:00
netconsole.c
nlmon.c
ntb_netdev.c
rionet.c
sb1000.c	sb1000: fix variable set but not used warnings	2019-04-18 17:06:15 -07:00
Space.c
sungem_phy.c
tap.c
thunderbolt.c
tun.c	net: convert rps_needed and rfs_needed to new static branch api	2019-03-23 21:57:38 -04:00
veth.c	net: veth: use generic helper to report timestamping info	2019-04-12 16:26:37 -07:00
virtio_net.c	virtio-net: Fix some minor formatting errors	2019-04-06 18:10:11 -07:00
vrf.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2019-04-08 23:39:36 -07:00
vsockmon.c
vxlan.c	ipv6: Move ipv6 stubs to a separate header file	2019-03-29 10:53:45 -07:00
xen-netfront.c	xen-netfront: mark expected switch fall-through	2019-04-16 21:03:02 -07:00