linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-25 19:55:28 +07:00

Author	SHA1	Message	Date
Huazhong Tan	130509213b	net: hns3: fix interrupt clearing error for VF Currently, VF driver has two kinds of interrupts, reset & CMDQ RX. For revision 0x21, according to the UM, each interrupt should be cleared by write 0 to the corresponding bit, but the implementation writes 0 to the whole register in fact, it will clear other interrupt at the same time, then the VF will loss the interrupt. But for revision 0x20, this interrupt clear register is a read & write register, for compatible, we just keep the old implementation for 0x20. This patch fixes it, also, adds a new register for reading the interrupt status according to hardware user manual. Fixes: `e2cb1dec97` ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Fixes: `b90fcc5bd9` ("net: hns3: add reset handling for VF when doing Core/Global/IMP reset") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-09 13:44:32 -07:00
Zhongzhu Liu	9e6717af61	net: hns3: fix GFP flag error in hclge_mac_update_stats() When CONFIG_DEBUG_ATOMIC_SLEEP on, calling kzalloc with GFP_KERNEL in hclge_mac_update_stats() will get below warning: [ 52.514677] BUG: sleeping function called from invalid context at mm/slab.h:501 [ 52.522051] in_atomic(): 0, irqs_disabled(): 0, pid: 1015, name: ifconfig [ 52.528827] 2 locks held by ifconfig/1015: [ 52.532921] #0: (____ptrval____) (&p->lock){....}, at: seq_read+0x54/0x748 [ 52.539878] #1: (____ptrval____) (rcu_read_lock){....}, at: dev_seq_start+0x0/0x140 [ 52.547610] CPU: 16 PID: 1015 Comm: ifconfig Not tainted 5.3.0-rc3-00697-g20b80be #98 [ 52.555408] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B050.01 08/08/2019 [ 52.564242] Call trace: [ 52.566687] dump_backtrace+0x0/0x1f8 [ 52.570338] show_stack+0x14/0x20 [ 52.573646] dump_stack+0xb4/0xec [ 52.576950] ___might_sleep+0x178/0x198 [ 52.580773] __might_sleep+0x74/0xe0 [ 52.584338] __kmalloc+0x244/0x2d8 [ 52.587744] hclge_mac_update_stats+0xc8/0x1f8 [hclge] [ 52.592870] hclge_update_stats+0xe0/0x170 [hclge] [ 52.597651] hns3_nic_get_stats64+0xa0/0x458 [hns3] [ 52.602514] dev_get_stats+0x58/0x138 [ 52.606165] dev_seq_printf_stats+0x8c/0x280 [ 52.610420] dev_seq_show+0x14/0x40 [ 52.613898] seq_read+0x574/0x748 [ 52.617205] proc_reg_read+0xb4/0x108 [ 52.620857] __vfs_read+0x54/0xa8 [ 52.624162] vfs_read+0xa0/0x190 [ 52.627380] ksys_read+0xc8/0x178 [ 52.630685] __arm64_sys_read+0x40/0x50 [ 52.634509] el0_svc_common.constprop.0+0x120/0x1e0 [ 52.639369] el0_svc_handler+0x50/0x90 [ 52.643106] el0_svc+0x8/0xc So this patch uses GFP_ATOMIC instead of GFP_KERNEL to fix it. Fixes: `d174ea75c9` ("net: hns3: add statistics for PFC frames and MAC control frames") Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Reviewed-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-09 13:44:32 -07:00
YueHaibing	ca497fb6aa	taprio: remove unused variable 'entry_list_policy' net/sched/sch_taprio.c:680:32: warning: entry_list_policy defined but not used [-Wunused-const-variable=] One of the points of commit `a3d43c0d56` ("taprio: Add support adding an admin schedule") is that it removes support (it now returns "not supported") for schedules using the TCA_TAPRIO_ATTR_SCHED_SINGLE_ENTRY attribute (which were never used), the parsing of those types of schedules was the only user of this policy. So removing this policy should be fine. Reported-by: Hulk Robot <hulkci@huawei.com> Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-09 13:41:24 -07:00
Holger Hoffstätte	a7eb6a4f25	r8169: fix performance issue on RTL8168evl Disabling TSO but leaving SG active results is a significant performance drop. Therefore disable also SG on RTL8168evl. This restores the original performance. Fixes: `93681cd7d9` ("r8169: enable HW csum and TSO") Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-09 13:37:39 -07:00
Josh Hunt	1555e6fdf0	tcp: Update TCP_BASE_MSS comment TCP_BASE_MSS is used as the default initial MSS value when MTU probing is enabled. Update the comment to reflect this. Suggested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Josh Hunt <johunt@akamai.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-09 13:03:30 -07:00
Josh Hunt	c04b79b6cf	tcp: add new tcp_mtu_probe_floor sysctl The current implementation of TCP MTU probing can considerably underestimate the MTU on lossy connections allowing the MSS to get down to 48. We have found that in almost all of these cases on our networks these paths can handle much larger MTUs meaning the connections are being artificially limited. Even though TCP MTU probing can raise the MSS back up we have seen this not to be the case causing connections to be "stuck" with an MSS of 48 when heavy loss is present. Prior to pushing out this change we could not keep TCP MTU probing enabled b/c of the above reasons. Now with a reasonble floor set we've had it enabled for the past 6 months. The new sysctl will still default to TCP_MIN_SND_MSS (48), but gives administrators the ability to control the floor of MSS probing. Signed-off-by: Josh Hunt <johunt@akamai.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-09 13:03:30 -07:00
Jiri Pirko	3a5e523479	devlink: remove pointless data_len arg from region snapshot create The size of the snapshot has to be the same as the size of the region, therefore no need to pass it again during snapshot creation. Remove the arg and use region->size instead. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-09 11:21:46 -07:00
Eric Dumazet	1a9914884d	tcp: batch calls to sk_flush_backlog() Starting from commit `d41a69f1d3` ("tcp: make tcp_sendmsg() aware of socket backlog") loopback flows got hurt, because for each skb sent, the socket receives an immediate ACK and sk_flush_backlog() causes extra work. Intent was to not let the backlog grow too much, but we went a bit too far. We can check the backlog every 16 skbs (about 1MB chunks) to increase TCP over loopback performance by about 15 % Note that the call to sk_flush_backlog() handles a single ACK, thanks to coalescing done on backlog, but cleans the 16 skbs found in rtx rb-tree. Reported-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-09 11:03:27 -07:00
Daniel Borkmann	9f30cd568b	Merge branch 'bpf-xdp-fwd-sample-improvements' Jesper Dangaard Brouer says: ==================== V3: Hopefully fixed all issues point out by Yonghong Song V2: Addressed issues point out by Yonghong Song - Please ACK patch 2/3 again - Added ACKs and reviewed-by to other patches This patchset is focused on improvements for XDP forwarding sample named xdp_fwd, which leverage the existing FIB routing tables as described in LPC2018[1] talk by David Ahern. The primary motivation is to illustrate how Toke's recent work improves usability of XDP_REDIRECT via lookups in devmap. The other patches are to help users understand the sample. I have more improvements to xdp_fwd, but those might requires changes to libbpf. Thus, sending these patches first as they are isolated. [1] http://vger.kernel.org/lpc-networking2018.html#session-1 ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-09 18:05:04 +02:00
Jesper Dangaard Brouer	abcce733ad	samples/bpf: xdp_fwd explain bpf_fib_lookup return codes Make it clear that this XDP program depend on the network stack to do the ARP resolution. This is connected with the BPF_FIB_LKUP_RET_NO_NEIGH return code from bpf_fib_lookup(). Another common mistake (seen via XDP-tutorial) is that users don't realize that sysctl net.ipv{4,6}.conf.all.forwarding setting is honored by bpf_fib_lookup. Reported-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-09 18:05:03 +02:00
Jesper Dangaard Brouer	a32a32cb26	samples/bpf: make xdp_fwd more practically usable via devmap lookup This address the TODO in samples/bpf/xdp_fwd_kern.c, which points out that the chosen egress index should be checked for existence in the devmap. This can now be done via taking advantage of Toke's work in commit `0cdbb4b09a` ("devmap: Allow map lookups from eBPF"). This change makes xdp_fwd more practically usable, as this allows for a mixed environment, where IP-forwarding fallback to network stack, if the egress device isn't configured to use XDP. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-09 18:05:03 +02:00
Jesper Dangaard Brouer	3783d43752	samples/bpf: xdp_fwd rename devmap name to be xdp_tx_ports The devmap name 'tx_port' came from a copy-paste from xdp_redirect_map which only have a single TX port. Change name to xdp_tx_ports to make it more descriptive. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-09 18:05:03 +02:00
Ivan Khoronzhuk	d9973cec9d	xdp: xdp_umem: fix umem pages mapping for 32bits systems Use kmap instead of page_address as it's not always in low memory. Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-09 18:02:19 +02:00
Denis Efremov	fcc32a2165	liquidio: Use pcie_flr() instead of reimplementing it octeon_mbox_process_cmd() directly writes the PCI_EXP_DEVCTL_BCR_FLR bit, which bypasses timing requirements imposed by the PCIe spec. This patch fixes the function to use the pcie_flr() interface instead. Signed-off-by: Denis Efremov <efremov@linux.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com> Reviewed-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:41:02 -07:00
Heiner Kallweit	32879f0001	r8169: allocate rx buffers using alloc_pages_node We allocate 16kb per rx buffer, so we can avoid some overhead by using alloc_pages_node directly instead of bothering kmalloc_node. Due to this change buffers are page-aligned now, therefore the alignment check can be removed. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:34:55 -07:00
YueHaibing	018e5b4587	fq_codel: remove set but not used variables 'prev_ecn_mark' and 'prev_drop_count' Fixes gcc '-Wunused-but-set-variable' warning: net/sched/sch_fq_codel.c: In function fq_codel_dequeue: net/sched/sch_fq_codel.c:288:23: warning: variable prev_ecn_mark set but not used [-Wunused-but-set-variable] net/sched/sch_fq_codel.c:288:6: warning: variable prev_drop_count set but not used [-Wunused-but-set-variable] They are not used since commit `77ddaff218` ("fq_codel: Kill useless per-flow dropped statistic") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:32:19 -07:00
Jiri Pirko	da382875c6	mlxsw: spectrum: Extend to support Spectrum-3 ASIC Extend existing driver for Spectrum and Spectrum-2 ASICs to support Spectrum-3 ASIC as well. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:27:09 -07:00
David S. Miller	eb716a649f	Merge branch 'stmmac-next' Jose Abreu says: ==================== net: stmmac: Improvements for -next [ This is just a rebase of v2 into latest -next in order to avoid a merge conflict ] Couple of improvements for -next tree. More info in commit logs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:20:19 -07:00
Jose Abreu	ccfc639a94	net: stmmac: selftests: Add a selftest for Flexible RX Parser Add a selftest for the Flexible RX Parser feature. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:20:19 -07:00
Jose Abreu	d6e1c12cf9	net: stmmac: Add Flexible RX Parser support in XGMAC XGMAC cores also support the Flexible RX Parser feature. Add the support for it in the XGMAC core. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:20:19 -07:00
Jose Abreu	56e58d6c8a	net: stmmac: Implement Safety Features in XGMAC core XGMAC also supports Safety Features. This patch implements the configuration and handling of this feature in XGMAC core. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:20:19 -07:00
Jose Abreu	74043f6b22	net: stmmac: selftests: Add test for VLAN and Double VLAN Filtering Add a selftest for VLAN and Double VLAN Filtering in stmmac. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:20:19 -07:00
Jose Abreu	3cd1cfcba2	net: stmmac: Implement VLAN Hash Filtering in XGMAC Implement the VLAN Hash Filtering feature in XGMAC core. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:20:19 -07:00
Jose Abreu	1fbdad0005	net: stmmac: selftests: Add RSS test Add a test for RSS in the stmmac selftests. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:20:19 -07:00
Jose Abreu	76067459c6	net: stmmac: Implement RSS and enable it in XGMAC core Implement the RSS functionality and add the corresponding callbacks in XGMAC core. Changes from v1: - Do not use magic constants (Jakub) - Use ethtool_rxfh_indir_default() (Jakub) Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:20:19 -07:00
Jose Abreu	7035aad875	net: stmmac: xgmac: Implement tx_queue_prio() Implement the TX Queue Priority callback in XGMAC core. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:20:19 -07:00
Jose Abreu	5656ac5542	net: stmmac: xgmac: Implement set_mtl_tx_queue_weight() Implement the TX Queue Weight callback. In order for this to be active we also need to set ETS algorithm when configuring Queue. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:20:19 -07:00
Jose Abreu	b6cdf09f51	net: stmmac: xgmac: Implement MMC counters Implement the MMC counters feature in XGMAC core. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:20:18 -07:00
John Rutherford	6c9081a391	tipc: add loopback device tracking Since node internal messages are passed directly to the socket, it is not possible to observe those messages via tcpdump or wireshark. We now remedy this by making it possible to clone such messages and send the clones to the loopback interface. The clones are dropped at reception and have no functional role except making the traffic visible. The feature is enabled if network taps are active for the loopback device. pcap filtering restrictions require the messages to be presented to the receiving side of the loopback device. v3 - Function dev_nit_active used to check for network taps. - Procedure netif_rx_ni used to send cloned messages to loopback device. Signed-off-by: John Rutherford <john.rutherford@dektech.com.au> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 22:11:39 -07:00
David S. Miller	2339ef1cf3	Merge branch 'flow_offload-add-indr-block-in-nf_table_offload' wenxu says: ==================== flow_offload: add indr-block in nf_table_offload This series patch make nftables offload support the vlan and tunnel device offload through indr-block architecture. The first four patches mv tc indr block to flow offload and rename to flow-indr-block. Because the new flow-indr-block can't get the tcf_block directly. The fifth patch provide a callback list to get flow_block of each subsystem immediately when the device register and contain a block. The last patch make nf_tables_offload support flow-indr-block. This version add a mutex lock for add/del flow_indr_block_ing_cb ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:44:31 -07:00
wenxu	9a32669fec	netfilter: nf_tables_offload: support indr block call nftable support indr-block call. It makes nftable an offload vlan and tunnel device. nft add table netdev firewall nft add chain netdev firewall aclout { type filter hook ingress offload device mlx_pf0vf0 priority - 300 \; } nft add rule netdev firewall aclout ip daddr 10.0.0.1 fwd to vlan0 nft add chain netdev firewall aclin { type filter hook ingress device vlan0 priority - 300 \; } nft add rule netdev firewall aclin ip daddr 10.0.0.7 fwd to mlx_pf0vf0 Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:44:30 -07:00
wenxu	1150ab0f1b	flow_offload: support get multi-subsystem block It provide a callback list to find the blocks of tc and nft subsystems Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:44:30 -07:00
wenxu	4e481908c5	flow_offload: move tc indirect block to flow offload move tc indirect block to flow_offload and rename it to flow indirect block.The nf_tables can use the indr block architecture. Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:44:30 -07:00
wenxu	e4da910211	cls_api: add flow_indr_block_call function This patch make indr_block_call don't access struct tc_indr_block_cb and tc_indr_block_dev directly Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:44:30 -07:00
wenxu	f843698857	cls_api: remove the tcf_block cache Remove the tcf_block in the tc_indr_block_dev for muti-subsystem support. Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:44:30 -07:00
wenxu	242453c227	cls_api: modify the tc_indr_block_ing_cmd parameters. This patch make tc_indr_block_ing_cmd can't access struct tc_indr_block_dev and tc_indr_block_cb. Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:44:30 -07:00
David S. Miller	61552d2ce8	Merge branch 'net-batched-receive-in-GRO-path' Edward Cree says: ==================== net: batched receive in GRO path This series listifies part of GRO processing, in a manner which allows those packets which are not GROed (i.e. for which dev_gro_receive returns GRO_NORMAL) to be passed on to the listified regular receive path. dev_gro_receive() itself is not listified, nor the per-protocol GRO callback, since GRO's need to hold packets on lists under napi->gro_hash makes keeping the packets on other lists awkward, and since the GRO control block state of held skbs can refer only to one 'new' skb at a time. Instead, when napi_frags_finish() handles a GRO_NORMAL result, stash the skb onto a list in the napi struct, which is received at the end of the napi poll or when its length exceeds the (new) sysctl net.core.gro_normal_batch. Performance figures with this series, collected on a back-to-back pair of Solarflare sfn8522-r2 NICs with 120-second NetPerf tests. In the stats, sample size n for old and new code is 6 runs each; p is from a Welch t-test. Tests were run both with GRO enabled and disabled, the latter simulating uncoalesceable packets (e.g. due to IP or TCP options). The receive side (which was the device under test) had the NetPerf process pinned to one CPU, and the device interrupts pinned to a second CPU. CPU utilisation figures (used in cases of line-rate performance) are summed across all CPUs. net.core.gro_normal_batch was left at its default value of 8. TCP 4 streams, GRO on: all results line rate (9.415Gbps) net-next: 210.3% cpu after #1: 181.5% cpu (-13.7%, p=0.031 vs net-next) after #3: 196.7% cpu (- 8.4%, p=0.136 vs net-next) TCP 4 streams, GRO off: net-next: 8.017 Gbps after #1: 7.785 Gbps (- 2.9%, p=0.385 vs net-next) after #3: 7.604 Gbps (- 5.1%, p=0.282 vs net-next. But note ) TCP 1 stream, GRO off: net-next: 6.553 Gbps after #1: 6.444 Gbps (- 1.7%, p=0.302 vs net-next) after #3: 6.790 Gbps (+ 3.6%, p=0.169 vs net-next) TCP 1 stream, GRO on, busy_read = 50: all results line rate net-next: 156.0% cpu after #1: 174.5% cpu (+11.9%, p=0.015 vs net-next) after #3: 165.0% cpu (+ 5.8%, p=0.147 vs net-next) TCP 1 stream, GRO off, busy_read = 50: net-next: 6.488 Gbps after #1: 6.625 Gbps (+ 2.1%, p=0.059 vs net-next) after #3: 7.351 Gbps (+13.3%, p=0.026 vs net-next) TCP_RR 100 streams, GRO off, 8000 byte payload net-next: 995.083 us after #1: 969.167 us (- 2.6%, p=0.204 vs net-next) after #3: 976.433 us (- 1.9%, p=0.254 vs net-next) TCP_RR 100 streams, GRO off, 8000 byte payload, busy_read = 50: net-next: 2.851 ms after #1: 2.871 ms (+ 0.7%, p=0.134 vs net-next) after #3: 2.937 ms (+ 3.0%, p<0.001 vs net-next) TCP_RR 100 streams, GRO off, 1 byte payload, busy_read = 50: net-next: 867.317 us after #1: 865.717 us (- 0.2%, p=0.334 vs net-next) after #3: 868.517 us (+ 0.1%, p=0.414 vs net-next) () These tests produced a mixture of line-rate and below-line-rate results, meaning that statistically speaking the results were 'censored' by the upper bound, and were thus not normally distributed, making a Welch t-test mathematically invalid. I therefore also calculated estimators according to [1], which gave the following: net-next: 8.133 Gbps after #1: 8.130 Gbps (- 0.0%, p=0.499 vs net-next) after #3: 7.680 Gbps (- 5.6%, p=0.285 vs net-next) (though my procedure for determining ν wasn't mathematically well-founded either, so take that p-value with a grain of salt). A further check came from dividing the bandwidth figure by the CPU usage for each test run, giving: net-next: 3.461 after #1: 3.198 (- 7.6%, p=0.145 vs net-next) after #3: 3.641 (+ 5.2%, p=0.280 vs net-next) The above results are fairly mixed, and in most cases not statistically significant. But I think we can roughly conclude that the series marginally improves non-GROable throughput, without hurting latency (except in the large-payload busy-polling case, which in any case yields horrid performance even on net-next (almost triple the latency without busy-poll). Also, drivers which, unlike sfc, pass UDP traffic to GRO would expect to see a benefit from gaining access to batching. Changed in v3: * gro_normal_batch sysctl now uses SYSCTL_ONE instead of &one * removed RFC tags (no comments after a week means no-one objects, right?) Changed in v2: * During busy poll, call gro_normal_list() to receive batched packets after each cycle of the napi busy loop. See comments in Patch #3 for complications of doing the same in busy_poll_stop(). [1]: Cohen 1959, doi: 10.1080/00401706.1959.10489859 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:22:29 -07:00
Edward Cree	323ebb61e3	net: use listified RX for handling GRO_NORMAL skbs When GRO decides not to coalesce a packet, in napi_frags_finish(), instead of passing it to the stack immediately, place it on a list in the napi struct. Then, at flush time (napi_complete_done(), napi_poll(), or napi_busy_loop()), call netif_receive_skb_list_internal() on the list. We'd like to do that in napi_gro_flush(), but it's not called if !napi->gro_bitmask, so we have to do it in the callers instead. (There are a handful of drivers that call napi_gro_flush() themselves, but it's not clear why, or whether this will affect them.) Because a full 64 packets is an inefficiently large batch, also consume the list whenever it exceeds gro_normal_batch, a new net/core sysctl that defaults to 8. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:22:29 -07:00
Edward Cree	6727013694	sfc: falcon: don't score irq moderation points for GRO Same rationale as for sfc, except that this wasn't performance-tested. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:22:29 -07:00
Edward Cree	5e040d4b1a	sfc: don't score irq moderation points for GRO We already scored points when handling the RX event, no-one else does this, and looking at the history it appears this was originally meant to only score on merges, not on GRO_NORMAL. Moreover, it gets in the way of changing GRO to not immediately pass GRO_NORMAL skbs to the stack. Performance testing with four TCP streams received on a single CPU (where throughput was line rate of 9.4Gbps in all tests) showed a 13.7% reduction in RX CPU usage (n=6, p=0.03). Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:22:29 -07:00
Rahul Verma	5e6d9fc761	qed: Add new ethtool supported port types based on media. Supported ports in ethtool <eth1> are displayed based on media type. For media type fibre and twinaxial, port type is "FIBRE". Media type Base-T is "TP" and media KR is "Backplane". V1->V2: Corrected the subject. Signed-off-by: Rahul Verma <rahulv@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:14:07 -07:00
Chuhong Yuan	ad2dcba008	cxgb4: smt: Use normal int for refcount All refcount operations are protected by spinlocks now. Then the atomic counter can be replaced by a normal int. This patch depends on PATCH 1/2. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:12:17 -07:00
Chuhong Yuan	4a8937b838	cxgb4: smt: Add lock for atomic_dec_and_test The atomic_dec_and_test() is not safe because it is outside of locks. Move the locks of t4_smte_free() to its caller, cxgb4_smt_release() to protect the atomic decrement. Fixes: `3bdb376e69` ("cxgb4: introduce SMT ops to prepare for SMAC rewrite support") Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:12:17 -07:00
David Ahern	e858ef1cd4	selftests: Add l2tp tests Add IPv4 and IPv6 l2tp tests. Current set is over IP and with IPsec. v2 - add l2tp.sh to TEST_PROGS in Makefile Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:08:09 -07:00
Alexey Dobriyan	9d2f112383	net: delete "register" keyword Delete long obsoleted "register" keyword. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 18:03:42 -07:00
Chuhong Yuan	4b4de39850	mkiss: Use refcount_t for refcount refcount_t is better for reference counters since its implementation can prevent overflows. So convert atomic_t ref counters to refcount_t. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 17:58:40 -07:00
Chuhong Yuan	31168a6d12	dpaa_eth: Use refcount_t for refcount refcount_t is better for reference counters since its implementation can prevent overflows. So convert atomic_t ref counters to refcount_t. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 17:57:54 -07:00
David S. Miller	b3a598eb0d	This feature/cleanup patchset includes the following patches: - bump version strings, by Simon Wunderlich - Replace usage of strlcpy with strscpy, by Sven Eckelmann - Add OGMv2 per-interface queue and aggregations, by Linus Luessing (2 patches) -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAl1MHekWHHN3QHNpbW9u d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoW8TEACox7vhtW4MS8QCzPySCU3f8V6m f+2PPlUbM4CqXcOPGw/jQng6PAtcb4gNPssm52GaxdeB9jFqkI/ELdSn5mCh+EcG QRrhf5DVruYyqBU2gNhovEe7SlJl8IJno6kFdAggaMngXnvlBzIr7n4FMIUKNFYn 6kFbA8pugBXXvhiRcuzs+l5iUxdKUxTsUNPyppyqnqb8lrb0/30/681dfq87PmcV zehEf8Ry23W7CVQv6YougVJvK0GUwysULsvm8Wc8FsOke7CeeQIPLEF2Pcrl/CFM mfynXVXngE41MPBC59eUcWBGlRYwkuwm4Q+YQ8OUjr5+X5YP06jR5Dh8u6KVAMuy QWGSwyrlXSsCE6BTxoijdJqsLzHDXCmYY0GQI2tEMCyDnL95CU3tuTk4vckusuf+ NlhHv7m+Bo0w9ztDUBifzNyURW9VgUCoOZfW9rdYRWjjN8Oe6wnMWCFttGnkg0qu zrCJn5mGvz3Vp434K0uGY1wOZincSdM6grBSgmZv1UNMaBdlfNroJsUqvz6IE5Fe iI5kqRXoUG+ftfwacgyFEK08HpcZHvJkNDiHHRlOPPZm75yE7mwreVUrGRaibywQ pzTEwM+H2MX9xF+osjiPVc197fxpnkX9fI2LhDOEiblaJbdiIxd+cgWRMFS1YPvd ANaFfAh58+gDWcRqhg== =Cbzf -----END PGP SIGNATURE----- Merge tag 'batadv-next-for-davem-20190808' of git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== This feature/cleanup patchset includes the following patches: - bump version strings, by Simon Wunderlich - Replace usage of strlcpy with strscpy, by Sven Eckelmann - Add OGMv2 per-interface queue and aggregations, by Linus Luessing (2 patches) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-08 11:28:40 -07:00
Yonghong Song	b707659213	tools/bpf: fix core_reloc.c compilation error On my local machine, I have the following compilation errors: ===== In file included from prog_tests/core_reloc.c:3:0: ./progs/core_reloc_types.h:517:46: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘fancy_char_ptr_t’ typedef const char * const volatile restrict fancy_char_ptr_t; ^ ./progs/core_reloc_types.h:527:2: error: unknown type name ‘fancy_char_ptr_t’ fancy_char_ptr_t d; ^ ===== I am using gcc 4.8.5. Later compilers may change their behavior not emitting the error. Nevertheless, let us fix the issue. "restrict" can be tested without typedef. Fixes: `9654e2ae90` ("selftests/bpf: add CO-RE relocs modifiers/typedef tests") Cc: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-07 18:24:03 -07:00
Alexei Starovoitov	726e333fd2	Merge branch 'compile-once-run-everywhere' Andrii Nakryiko says: ==================== This patch set implements central part of CO-RE (Compile Once - Run Everywhere, see [0] and [1] for slides and video): relocating fields offsets. Most of the details are written down as comments to corresponding parts of the code. Patch #1 adds a bunch of commonly useful btf_xxx helpers to simplify working with BTF types. Patch #2 converts existing libbpf code to these new helpers and removes some of pre-existing ones. Patch #3 adds loading of .BTF.ext offset relocations section and macros to work with its contents. Patch #4 implements CO-RE relocations algorithm in libbpf. Patch #5 introduced BPF_CORE_READ macro, hiding usage of Clang's __builtin_preserve_access_index intrinsic that records offset relocation. Patches #6-#14 adds selftests validating various parts of relocation handling, type compatibility, etc. For all tests to work, you'll need latest Clang/LLVM supporting __builtin_preserve_access_index intrinsic, used for recording offset relocations. Kernel on which selftests run should have BTF information built in (CONFIG_DEBUG_INFO_BTF=y). [0] http://vger.kernel.org/bpfconf2019.html#session-2 [1] http://vger.kernel.org/lpc-bpf2018.html#session-2 v5->v6: - fix bad comment formatting for real (Alexei); v4->v5: - drop constness for btf_xxx() helpers, allowing to avoid type casts (Alexei); - rebase on latest bpf-next, change test__printf back to printf; v3->v4: - added btf_xxx helpers (Alexei); - switched libbpf code to new helpers; - reduced amount of logging and simplified format in few places (Alexei); - made flavor name parsing logic more strict (exactly three underscores); - no uname() error checking (Alexei); - updated misc tests to reflect latest Clang fixes (Yonghong); v2->v3: - enclose BPF_CORE_READ args in parens (Song); v1->v2: - add offsetofend(), fix btf_ext optional fields checks (Song); - add bpf_core_dump_spec() for logging spec representation; - move special first element processing out of the loop (Song); - typo fixes (Song); - drop BPF_ST \| BPF_MEM insn relocation (Alexei); - extracted BPF_CORE_READ into bpf_helpers (Alexei); - added extra tests validating Clang capturing relocs correctly (Yonghong); - switch core_relocs.c to use sub-tests; - updated mods tests after Clang bug was fixed (Yonghong); - fix bug enumerating candidate types; ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-07 14:43:50 -07:00

... 2 3 4 5 6 ...

856975 Commits