mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2025-01-14 00:26:18 +07:00
735fc4054b
This patch changes the API for ndo_xdp_xmit to support bulking of xdp_frames. When the kernel is compiled with CONFIG_RETPOLINE, XDP sees a huge slowdown. Most of the slowdown is caused by DMA API indirect function calls, but the net_device->ndo_xdp_xmit() call also contributes. Benchmarking the patch with CONFIG_RETPOLINE, using xdp_redirect_map in a single flow/core test (CPU E5-1650 v4 @ 3.60GHz), showed improved performance: for driver ixgbe: 6,042,682 pps -> 6,853,768 pps = +811,086 pps; for driver i40e: 6,187,169 pps -> 6,724,519 pps = +537,350 pps. With frames available as a bulk inside the driver ndo_xdp_xmit call, further optimizations are possible, like bulk DMA-mapping for TX. Testing without CONFIG_RETPOLINE shows the same performance for physical NIC drivers. The virtual NIC driver tun sees a huge performance boost, as it can avoid per-frame producer locking and instead amortize the locking cost over the bulk. V2: Fix compile errors reported by kbuild test robot <lkp@intel.com>. V4: Isolated ndo, driver changes and callers. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
---|---|---|
i40e_adminq_cmd.h | ||
i40e_adminq.c | ||
i40e_adminq.h | ||
i40e_alloc.h | ||
i40e_client.c | ||
i40e_client.h | ||
i40e_common.c | ||
i40e_dcb_nl.c | ||
i40e_dcb.c | ||
i40e_dcb.h | ||
i40e_debugfs.c | ||
i40e_devids.h | ||
i40e_diag.c | ||
i40e_diag.h | ||
i40e_ethtool.c | ||
i40e_hmc.c | ||
i40e_hmc.h | ||
i40e_lan_hmc.c | ||
i40e_lan_hmc.h | ||
i40e_main.c | ||
i40e_nvm.c | ||
i40e_osdep.h | ||
i40e_prototype.h | ||
i40e_ptp.c | ||
i40e_register.h | ||
i40e_status.h | ||
i40e_trace.h | ||
i40e_txrx.c | ||
i40e_txrx.h | ||
i40e_type.h | ||
i40e_virtchnl_pf.c | ||
i40e_virtchnl_pf.h | ||
i40e.h | ||
Makefile |
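
The commit message above notes that a bulk transmit call lets a driver such as tun take its producer lock once per bulk rather than once per frame. The following is a minimal userspace sketch of that idea only, not kernel code: `struct frame`, `struct ring`, `xmit_one`, `xmit_bulk`, and `compare_lock_counts` are hypothetical names invented for illustration, and the "lock" is just a counter so that the number of lock round trips can be observed.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 64
#define BULK 16

/* Hypothetical stand-in for a frame; not the kernel's struct xdp_frame. */
struct frame { int id; };

/* Hypothetical producer ring; the "lock" only counts acquisitions. */
struct ring {
    struct frame *slots[RING_SIZE];
    size_t count;
    unsigned long lock_acquisitions;
};

static void ring_lock(struct ring *r)   { r->lock_acquisitions++; }
static void ring_unlock(struct ring *r) { (void)r; }

/* Per-frame style: one lock round trip per frame.
 * Returns 0 on success, -1 if the ring is full. */
static int xmit_one(struct ring *r, struct frame *f)
{
    int ret = -1;

    ring_lock(r);
    if (r->count < RING_SIZE) {
        r->slots[r->count++] = f;
        ret = 0;
    }
    ring_unlock(r);
    return ret;
}

/* Bulk style: one lock round trip covers up to n frames.
 * Returns the number of frames actually queued. */
static int xmit_bulk(struct ring *r, struct frame **frames, int n)
{
    int sent = 0;

    ring_lock(r);
    for (int i = 0; i < n && r->count < RING_SIZE; i++) {
        r->slots[r->count++] = frames[i];
        sent++;
    }
    ring_unlock(r);
    return sent;
}

/* Queue BULK frames both ways and report how often each path locked. */
static void compare_lock_counts(unsigned long *per_frame, unsigned long *bulk)
{
    static struct frame frames[BULK];
    struct frame *ptrs[BULK];
    struct ring a = {0}, b = {0};

    for (int i = 0; i < BULK; i++) {
        frames[i].id = i;
        ptrs[i] = &frames[i];
        int ok = xmit_one(&a, ptrs[i]);
        assert(ok == 0);
        (void)ok;
    }

    int sent = xmit_bulk(&b, ptrs, BULK);
    assert(sent == BULK);
    (void)sent;

    *per_frame = a.lock_acquisitions;
    *bulk = b.lock_acquisitions;
}
```

Calling `compare_lock_counts` fills in one lock acquisition per frame for the per-frame path and a single acquisition for the bulk path. A real driver would use an actual spinlock and handle partially accepted bulks; this sketch only makes the amortization visible.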