linux_dsm_epyc7002/include
Shay Agroskin c2273219ba net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow
Upon high packet rate with multiple CPUs TX workloads, much of the HCA's
resources are spent on prefetching TX descriptors, thus affecting
transmission rates.
This patch comes to mitigate this problem by moving some workload to the
CPU and reducing the HW data prefetch overhead for small packets (<= 256B).

When forwarding packets with XDP, a packet that is smaller
than a certain size (set to ~256 bytes) would be sent inline within
its WQE TX descrptor (mem-copied), when the hardware tx queue is congested
beyond a pre-defined water-mark.

This is added to better utilize the HW resources (which now makes
one less packet data prefetch) and allow better scalability, on the
account of CPU usage (which now 'memcpy's the packet into the WQE).

To load balance between HW and CPU and get max packet rate, we use
watermarks to detect how much the HW is congested and move the work
loads back and forth between HW and CPU.

Performance:
Tested packet rate for UDP 64Byte multi-stream
over two dual port ConnectX-5 100Gbps NICs.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

* Tested with hyper-threading disabled

XDP_TX:

|          | before | after   |       |
| 24 rings | 51Mpps | 116Mpps | +126% |
| 1 ring   | 12Mpps | 12Mpps  | same  |

XDP_REDIRECT:

** Below is the transmit rate, not the redirection rate
which might be larger, and is not affected by this patch.

|          | before  | after   |      |
| 32 rings | 64Mpps  | 92Mpps  | +43% |
| 1 ring   | 6.4Mpps | 6.4Mpps | same |

As we can see, feature significantly improves scaling, without
hurting single ring performance.

Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-04-23 12:09:20 -07:00
..
acpi ACPI: use different default debug value than ACPICA 2019-03-25 10:45:59 +01:00
asm-generic syscalls: Remove start and number from syscall_set_arguments() args 2019-04-05 09:27:23 -04:00
clocksource
crypto
drm drm/atomic-helper: Make atomic_enable/disable crtc callbacks optional 2019-03-29 11:56:52 +01:00
dt-bindings dt-bindings: clock: sifive: add FU540-C000 PRCI clock constants 2019-04-09 20:36:40 -07:00
keys KEYS: trusted: fix -Wvarags warning 2019-04-08 15:58:54 -07:00
kvm ARM: some cleanups, direct physical timer assignment, cache sanitization 2019-03-15 15:00:28 -07:00
linux net/mlx5e: XDP, Inline small packets into the TX MPWQE in XDP xmit flow 2019-04-23 12:09:20 -07:00
math-emu
media
memory
misc auxdisplay: charlcd: Introduce charlcd_free() helper 2019-03-17 08:48:16 +01:00
net ipv6: Restore RTF_ADDRCONF check in rt6_qualify_for_ecmp 2019-04-21 19:44:16 -07:00
pcmcia
ras
rdma
scsi
soc IOMMU Updates for Linux v5.1 2019-03-10 12:29:52 -07:00
sound ASoC: core: conditionally increase module refcount on component open 2019-04-08 14:15:44 +07:00
target
trace ipv6: Add fib6_type and fib6_flags to fib6_result 2019-04-17 23:11:30 -07:00
uapi tipc: introduce new socket option TIPC_SOCK_RECVQ_USED 2019-04-19 14:59:05 -07:00
video media updates for v5.1-rc1 2019-03-09 14:45:54 -08:00
xen