linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2025-02-14 08:45:24 +07:00

Author	SHA1	Message	Date
Jisheng Zhang	0f5c6c30a0	net: mvneta: fix mvneta_config_rss on armada 3700 The mvneta Ethernet driver is used on a few different Marvell SoCs. Some SoCs have per cpu interrupts for Ethernet events, the driver uses a per CPU napi structure for this case. Some SoCs such as armada 3700 have a single interrupt for Ethernet events, the driver uses a global napi structure for this case. Current mvneta_config_rss() always operates the per cpu napi structure. Fix it by operating a global napi for "single interrupt" case, and per cpu napi structure for remaining cases. Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Fixes: `2636ac3cc2` ("net: mvneta: Add network support for Armada 3700 SoC") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 14:40:11 -07:00
Ursula Braun	0d86caff06	net/smc: send response to test link signal With SMC-D z/OS sends a test link signal every 10 seconds. Linux is supposed to answer, otherwise the SMC-D connection breaks. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 14:38:43 -07:00
David S. Miller	a487711aac	Merge branch 'r8169-smaller-improvements' Heiner Kallweit says: ==================== r8169: smaller improvements This series includes smaller improvements, no functional change intended. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 14:32:35 -07:00
Heiner Kallweit	abe8b2f71e	r8169: don't configure max jumbo frame size per chip version We don't have to configure the max jumbo frame size per chip (sub-)version. It can be easily determined based on the chip family. And new members of the RTL8168 family (if there are any) should be automatically covered. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 14:32:35 -07:00
Heiner Kallweit	eb88f5f712	r8169: don't configure csum function per chip version We don't have to configure the csum function per chip (sub-)version. The distinction is simple, versions RTL8102e and from RTL8168c onwards support csum_v2. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 14:32:35 -07:00
Heiner Kallweit	05bbe5584f	r8169: simplify interrupt handler Simplify the interrupt handler a little and make it better readable. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 14:32:35 -07:00
Heiner Kallweit	098b01ad98	r8169: don't include asm headers directly The asm headers shouldn't be included directly. asm/irq.h is implicitly included by linux/interrupt.h, and instead of asm/io.h include linux/io.h. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 14:32:35 -07:00
Heiner Kallweit	bc4fcd0a1b	r8169: remove version info The version number hasn't changed for ages and in general I doubt it provides any benefit. The message in rtl_init_one() may even be misleading because it's printed also if something fails in probe. Therefore let's remove the version information. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 14:26:59 -07:00
David S. Miller	0780b86666	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2018-08-10 Here's one more (most likely last) bluetooth-next pull request for the 4.19 kernel. - Added support for MediaTek serial Bluetooth devices - Initial skeleton for controller-side address resolution support - Fix BT_HCIUART_RTL related Kconfig dependencies - A few other minor fixes/cleanups Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 14:24:57 -07:00
Daniel Borkmann	74b247f4c3	Merge branch 'bpf-btf-for-htab-lru' Yonghong Song says: ==================== Commit `a26ca7c982` ("bpf: btf: Add pretty print support to the basic arraymap") added pretty print support to array map. This patch adds pretty print for hash and lru_hash maps. The following example shows the pretty-print result of a pinned hashmap. Without this patch set, user will get an error instead. struct map_value { int count_a; int count_b; }; cat /sys/fs/bpf/pinned_hash_map: 87907: {87907,87908} 57354: {37354,57355} 76625: {76625,76626} ... Patch #1 fixed a bug in bpffs map_seq_next() function so that all elements in the hash table will be traversed. Patch #2 implemented map_seq_show_elem() and map_check_btf() callback functions for hash and lru hash maps. Patch #3 enhanced tools/testing/selftests/bpf/test_btf.c to test bpffs hash and lru hash map pretty print. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 20:54:09 +02:00
Yonghong Song	af2a81dab4	tools/bpf: add bpffs pretty print btf test for hash/lru_hash maps Pretty print tests for hash/lru_hash maps are added in test_btf.c. The btf type blob is the same as pretty print array map test. The test result: $ mount -t bpf bpf /sys/fs/bpf $ ./test_btf -p BTF pretty print array......OK BTF pretty print hash......OK BTF pretty print lru hash......OK PASS:3 SKIP:0 FAIL:0 Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 20:54:07 +02:00
Yonghong Song	699c86d6ec	bpf: btf: add pretty print for hash/lru_hash maps Commit `a26ca7c982` ("bpf: btf: Add pretty print support to the basic arraymap") added pretty print support to array map. This patch adds pretty print for hash and lru_hash maps. The following example shows the pretty-print result of a pinned hashmap: struct map_value { int count_a; int count_b; }; cat /sys/fs/bpf/pinned_hash_map: 87907: {87907,87908} 57354: {37354,57355} 76625: {76625,76626} ... Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 20:54:07 +02:00
Yonghong Song	dc1508a579	bpf: fix bpffs non-array map seq_show issue In function map_seq_next() of kernel/bpf/inode.c, the first key will be the "0" regardless of the map type. This works for array. But for hash type, if it happens key "0" is in the map, the bpffs map show will miss some items if the key "0" is not the first element of the first bucket. This patch fixed the issue by guaranteeing to get the first element, if the seq_show is just started, by passing NULL pointer key to map_get_next_key() callback. This way, no missing elements will occur for bpffs hash table show even if key "0" is in the map. Fixes: `a26ca7c982` ("bpf: btf: Add pretty print support to the basic arraymap") Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 20:54:07 +02:00
David S. Miller	fd685657cd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following batch contains netfilter updates for your net-next tree: 1) Expose NFT_OSF_MAXGENRELEN maximum OS name length from the new OS passive fingerprint matching extension, from Fernando Fernandez. 2) Add extension to support for fine grain conntrack timeout policies from nf_tables. As preparation works, this patchset moves nf_ct_untimeout() to nf_conntrack_timeout and it also decouples the timeout policy from the ctnl_timeout object, most work done by Harsha Sharma. 3) Enable connection tracking when conntrack helper is in place. 4) Missing enumeration in uapi header when splitting original xt_osf to nfnetlink_osf, also from Fernando. 5) Fix a sparse warning due to incorrect typing in the nf_osf_find(), from Wei Yongjun. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 10:33:08 -07:00
Ganesh Goudar	ebddd97afb	cxgb4: add support to display DCB info display Data Center bridging information in debug fs. Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 10:26:28 -07:00
Colin Ian King	e4ed2b9eff	net: chelsio: cxgb2: remove unused array pci_speed Array pci_speed is defined but is never used hence it is redundant and can be removed. Cleans up clang warning: warning: 'pci_speed' defined but not used [-Wunused-const-variable=] Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 10:24:50 -07:00
Colin Ian King	29a06a7799	mlxsw: remove unused arrays mlxsw_i2c_driver_name and mlxsw_pci_driver_name Arrays mlxsw_i2c_driver_name and mlxsw_pci_driver_name are defined but never used hence they are redundant and can be removed. Cleans up clang warnings: warning: 'mlxsw_i2c_driver_name' defined but not used warning: 'mlxsw_pci_driver_name' defined but not used Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 10:23:51 -07:00
Krzysztof Kozlowski	c9fbb2d252	net: Provide stub for __netif_set_xps_queue if there is no CONFIG_XPS Building virtio_net driver without CONFIG_XPS fails with: drivers/net/virtio_net.c: In function ‘virtnet_set_affinity’: drivers/net/virtio_net.c:1910:3: error: implicit declaration of function ‘__netif_set_xps_queue’ [-Werror=implicit-function-declaration] __netif_set_xps_queue(vi->dev, mask, i, false); ^ Fixes: `4d99f6602c` ("net: allow to call netif_reset_xps_queues() under cpus_read_lock") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-10 10:22:22 -07:00
Linus Torvalds	f313b43be4	Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fix from Wolfram Sang: "A single driver bugfix for I2C. The bug was found by systematically stress testing the driver, so I am confident to merge it that late in the cycle although it is probably unusually large" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: xlp9xx: Fix case where SSIF read transaction completes early	2018-08-10 10:04:56 -07:00
Ankit Navik	aa12af77aa	Bluetooth: Add definitions for LE set address resolution Add the definitions for LE address resolution enable HCI commands. When the LE address resolution enable gets changed via HCI commands make sure that flag gets updated. Signed-off-by: Ankit Navik <ankit.p.navik@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-08-10 16:57:57 +02:00
Daniel Borkmann	60afdf066a	Merge branch 'bpf-veth-xdp-support' Toshiaki Makita says: ==================== This patch set introduces driver XDP for veth. Basically this is used in conjunction with redirect action of another XDP program. NIC -----------> veth===veth (XDP) (redirect) (XDP) In this case xdp_frame can be forwarded to the peer veth without modification, so we can expect far better performance than generic XDP. Envisioned use-cases -------------------- * Container managed XDP program Container host redirects frames to containers by XDP redirect action, and privileged containers can deploy their own XDP programs. * XDP program cascading Two or more XDP programs can be called for each packet by redirecting xdp frames to veth. * Internal interface for an XDP bridge When using XDP redirection to create a virtual bridge, veth can be used to create an internal interface for the bridge. Implementation -------------- This changeset is making use of NAPI to implement ndo_xdp_xmit and XDP_TX/REDIRECT. This is mainly because XDP heavily relies on NAPI context. - patch 1: Export a function needed for veth XDP. - patch 2-3: Basic implementation of veth XDP. - patch 4-6: Add ndo_xdp_xmit. - patch 7-9: Add XDP_TX and XDP_REDIRECT. - patch 10: Performance optimization for multi-queue env. Tests and performance numbers ----------------------------- Tested with a simple XDP program which only redirects packets between NIC and veth. I used i40e 25G NIC (XXV710) for the physical NIC. The server has 20 of Xeon Silver 2.20 GHz cores. pktgen --(wire)--> XXV710 (i40e) <--(XDP redirect)--> veth===veth (XDP) The rightmost veth loads XDP progs and just does DROP or TX. The number of packets is measured in the XDP progs. The leftmost pktgen sends packets at 37.1 Mpps (almost 25G wire speed). veth XDP action Flows Mpps ================================ DROP 1 10.6 DROP 2 21.2 DROP 100 36.0 TX 1 5.0 TX 2 10.0 TX 100 31.0 I also measured netperf TCP_STREAM but was not so great performance due to lack of tx/rx checksum offload and TSO, etc. netperf <--(wire)--> XXV710 (i40e) <--(XDP redirect)--> veth===veth (XDP PASS) Direction Flows Gbps ============================== external->veth 1 20.8 external->veth 2 23.5 external->veth 100 23.6 veth->external 1 9.0 veth->external 2 17.8 veth->external 100 22.9 Also tested doing ifup/down or load/unload a XDP program repeatedly during processing XDP packets in order to check if enabling/disabling NAPI is working as expected, and found no problems. v8: - Don't use xdp_frame pointer address to calculate skb->head, headroom, and xdp_buff.data_hard_start. v7: - Introduce xdp_scrub_frame() to clear kernel pointers in xdp_frame and use it instead of memset(). v6: - Check skb->len only if reallocation is needed. - Add __GFP_NOWARN to alloc_page() since it can be triggered by external events. - Fix sparse warning around EXPORT_SYMBOL. v5: - Fix broken SOBs. v4: - Don't adjust MTU automatically. - Skip peer IFF_UP check on .ndo_xdp_xmit() because it is unnecessary. Add comments to explain that. - Use redirect_info instead of xdp_mem_info for storing no_direct flag to avoid per packet copy cost. v3: - Drop skb bulk xmit patch since it makes little performance difference. The hotspot in TCP skb xmit at this point is checksum computation in skb_segment and packet copy on XDP_REDIRECT due to cloned/nonlinear skb. - Fix race on closing device. - Add extack messages in ndo_bpf. v2: - Squash NAPI patch with "Add driver XDP" patch. - Remove conversion from xdp_frame to skb when NAPI is not enabled. - Introduce per-queue XDP ring (patch 8). - Introduce bulk skb xmit when XDP is enabled on the peer (patch 9). ==================== Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:12:23 +02:00
Toshiaki Makita	638264dc90	veth: Support per queue XDP ring Move XDP and napi related fields from veth_priv to newly created veth_rq structure. When xdp_frames are enqueued from ndo_xdp_xmit and XDP_TX, rxq is selected by current cpu. When skbs are enqueued from the peer device, rxq is one to one mapping of its peer txq. This way we have a restriction that the number of rxqs must not less than the number of peer txqs, but leave the possibility to achieve bulk skb xmit in the future because txq lock would make it possible to remove rxq ptr_ring lock. v3: - Add extack messages. - Fix array overrun in veth_xmit. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:12:21 +02:00
Toshiaki Makita	d1396004dd	veth: Add XDP TX and REDIRECT This allows further redirection of xdp_frames like NIC -> veth--veth -> veth--veth (XDP) (XDP) (XDP) The intermediate XDP, redirecting packets from NIC to the other veth, reuses xdp_mem_info from NIC so that page recycling of the NIC works on the destination veth's XDP. In this way return_frame is not fully guarded by NAPI, since another NAPI handler on another cpu may use the same xdp_mem_info concurrently. Thus disable napi_direct by xdp_set_return_frame_no_direct() during the NAPI context. v8: - Don't use xdp_frame pointer address for data_hard_start of xdp_buff. v4: - Use xdp_[set\|clear]_return_frame_no_direct() instead of a flag in xdp_mem_info. v3: - Fix double free when veth_xdp_tx() returns a positive value. - Convert xdp_xmit and xdp_redir variables into flags. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:12:21 +02:00
Toshiaki Makita	2539650fad	xdp: Helpers for disabling napi_direct of xdp_return_frame We need some mechanism to disable napi_direct on calling xdp_return_frame_rx_napi() from some context. When veth gets support of XDP_REDIRECT, it will redirects packets which are redirected from other devices. On redirection veth will reuse xdp_mem_info of the redirection source device to make return_frame work. But in this case .ndo_xdp_xmit() called from veth redirection uses xdp_mem_info which is not guarded by NAPI, because the .ndo_xdp_xmit() is not called directly from the rxq which owns the xdp_mem_info. This approach introduces a flag in bpf_redirect_info to indicate that napi_direct should be disabled even when _rx_napi variant is used as well as helper functions to use it. A NAPI handler who wants to use this flag needs to call xdp_set_return_frame_no_direct() before processing packets, and call xdp_clear_return_frame_no_direct() after xdp_do_flush_map() before exiting NAPI. v4: - Use bpf_redirect_info for storing the flag instead of xdp_mem_info to avoid per-frame copy cost. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:12:21 +02:00
Toshiaki Makita	0b19cc0a86	bpf: Make redirect_info accessible from modules We are going to add kern_flags field in redirect_info for kernel internal use. In order to avoid function call to access the flags, make redirect_info accessible from modules. Also as it is now non-static, add prefix bpf_ to redirect_info. v6: - Fix sparse warning around EXPORT_SYMBOL. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:12:21 +02:00
Toshiaki Makita	af87a3aa1b	veth: Add ndo_xdp_xmit This allows NIC's XDP to redirect packets to veth. The destination veth device enqueues redirected packets to the napi ring of its peer, then they are processed by XDP on its peer veth device. This can be thought as calling another XDP program by XDP program using REDIRECT, when the peer enables driver XDP. Note that when the peer veth device does not set driver xdp, redirected packets will be dropped because the peer is not ready for NAPI. v4: - Don't use xdp_ok_fwd_dev() because checking IFF_UP is not necessary. Add comments about it and check only MTU. v2: - Drop the part converting xdp_frame into skb when XDP is not enabled. - Implement bulk interface of ndo_xdp_xmit. - Implement XDP_XMIT_FLUSH bit and drop ndo_xdp_flush. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:12:21 +02:00
Toshiaki Makita	9fc8d518d9	veth: Handle xdp_frames in xdp napi ring This is preparation for XDP TX and ndo_xdp_xmit. This allows napi handler to handle xdp_frames through xdp ring as well as sk_buff. v8: - Don't use xdp_frame pointer address to calculate skb->head and headroom. v7: - Use xdp_scrub_frame() instead of memset(). v3: - Revert v2 change around rings and use a flag to differentiate skb and xdp_frame, since bulk skb xmit makes little performance difference for now. v2: - Use another ring instead of using flag to differentiate skb and xdp_frame. This approach makes bulk skb transmit possible in veth_xmit later. - Clear xdp_frame feilds in skb->head. - Implement adjust_tail. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:12:21 +02:00
Toshiaki Makita	a8d5b4ab35	xdp: Helper function to clear kernel pointers in xdp_frame xdp_frame has kernel pointers which should not be readable from bpf programs. When we want to reuse xdp_frame region but it may be read by bpf programs later, we can use this helper to clear kernel pointers. This is more efficient than calling memset() for the entire struct. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:12:20 +02:00
Toshiaki Makita	dc2248220a	veth: Avoid drops by oversized packets when XDP is enabled Oversized packets including GSO packets can be dropped if XDP is enabled on receiver side, so don't send such packets from peer. Drop TSO and SCTP fragmentation features so that veth devices themselves segment packets with XDP enabled. Also cap MTU accordingly. v4: - Don't auto-adjust MTU but cap max MTU. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:12:20 +02:00
Toshiaki Makita	948d4f214f	veth: Add driver XDP This is the basic implementation of veth driver XDP. Incoming packets are sent from the peer veth device in the form of skb, so this is generally doing the same thing as generic XDP. This itself is not so useful, but a starting point to implement other useful veth XDP features like TX and REDIRECT. This introduces NAPI when XDP is enabled, because XDP is now heavily relies on NAPI context. Use ptr_ring to emulate NIC ring. Tx function enqueues packets to the ring and peer NAPI handler drains the ring. Currently only one ring is allocated for each veth device, so it does not scale on multiqueue env. This can be resolved by allocating rings on the per-queue basis later. Note that NAPI is not used but netif_rx is used when XDP is not loaded, so this does not change the default behaviour. v6: - Check skb->len only when allocation is needed. - Add __GFP_NOWARN to alloc_page() as it can be triggered by external events. v3: - Fix race on closing the device. - Add extack messages in ndo_bpf. v2: - Squashed with the patch adding NAPI. - Implement adjust_tail. - Don't acquire consumer lock because it is guarded by NAPI. - Make poll_controller noop since it is unnecessary. - Register rxq_info on enabling XDP rather than on opening the device. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:12:20 +02:00
Toshiaki Makita	b0768a8658	net: Export skb_headers_offset_update This is needed for veth XDP which does skb_copy_expand()-like operation. v2: - Drop skb_copy_header part because it has already been exported now. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:12:20 +02:00
Daniel Borkmann	c4c2021754	Merge branch 'bpf-sample-cpumap-lb' Jesper Dangaard Brouer says: ==================== Background: cpumap moves the SKB allocation out of the driver code, and instead allocate it on the remote CPU, and invokes the regular kernel network stack with the newly allocated SKB. The idea behind the XDP CPU redirect feature, is to use XDP as a load-balancer step in-front of regular kernel network stack. But the current sample code does not provide a good example of this. Part of the reason is that, I have implemented this as part of Suricata XDP load-balancer. Given this is the most frequent feature request I get. This patchset implement the same XDP load-balancing as Suricata does, which is a symmetric hash based on the IP-pairs + L4-protocol. The expected setup for the use-case is to reduce the number of NIC RX queues via ethtool (as XDP can handle more per core), and via smp_affinity assign these RX queues to a set of CPUs, which will be handling RX packets. The CPUs that runs the regular network stack is supplied to the sample xdp_redirect_cpu tool by specifying the --cpu option multiple times on the cmdline. I do note that cpumap SKB creation is not feature complete yet, and more work is coming. E.g. given GRO is not implemented yet, do expect TCP workloads to be slower. My measurements do indicate UDP workloads are faster. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:07:51 +02:00
Jesper Dangaard Brouer	1bca4e6b18	samples/bpf: xdp_redirect_cpu load balance like Suricata This implement XDP CPU redirection load-balancing across available CPUs, based on the hashing IP-pairs + L4-protocol. This equivalent to xdp-cpu-redirect feature in Suricata, which is inspired by the Suricata 'ippair' hashing code. An important property is that the hashing is flow symmetric, meaning that if the source and destination gets swapped then the selected CPU will remain the same. This is helps locality by placing both directions of a flows on the same CPU, in a forwarding/routing scenario. The hashing INITVAL (15485863 the 10^6th prime number) was fairly arbitrary choosen, but experiments with kernel tree pktgen scripts (pktgen_sample04_many_flows.sh +pktgen_sample05_flow_per_thread.sh) showed this improved the distribution. This patch also change the default loaded XDP program to be this load-balancer. As based on different user feedback, this seems to be the expected behavior of the sample xdp_redirect_cpu. Link: https://github.com/OISF/suricata/commit/796ec08dd7a63 Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:07:49 +02:00
Jesper Dangaard Brouer	1139568658	samples/bpf: add Paul Hsieh's (LGPL 2.1) hash function SuperFastHash Adjusted function call API to take an initval. This allow the API user to set the initial value, as a seed. This could also be used for inputting the previous hash. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:07:49 +02:00
Björn Töpel	eb91e4d4db	Revert "xdp: add NULL pointer check in __xdp_return()" This reverts commit `36e0f12bbf`. The reverted commit adds a WARN to check against NULL entries in the mem_id_ht rhashtable. Any kernel path implementing the XDP (generic or driver) fast path is required to make a paired xdp_rxq_info_reg/xdp_rxq_info_unreg call for proper function. In addition, a driver using a different allocation scheme than the default MEM_TYPE_PAGE_SHARED is required to additionally call xdp_rxq_info_reg_mem_model. For MEM_TYPE_ZERO_COPY, an xdp_rxq_info_reg_mem_model call ensures that the mem_id_ht rhashtable has a properly inserted allocator id. If not, this would be a driver bug. A NULL pointer kernel OOPS is preferred to the WARN. Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-08-10 16:06:48 +02:00
David S. Miller	e91e218946	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2018-08-10 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix cpumap and devmap on teardown as they're under RCU context and won't have same assumption as running under NAPI protection, from Jesper. 2) Fix various sockmap bugs in bpf_tcp_sendmsg() code, e.g. we had a bug where socket error was not propagated correctly, from Daniel. 3) Fix incompatible libbpf header license for BTF code and match it before it gets officially released with the rest of libbpf which is LGPL-2.1, from Martin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 23:18:29 -07:00
Ganesh Goudar	36d2f761b5	cxgb4: update 1.20.8.0 as the latest firmware supported Change t4fw_version.h to update latest firmware version number to 1.20.8.0. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:35:42 -07:00
Andrei Vagin	4d99f6602c	net: allow to call netif_reset_xps_queues() under cpus_read_lock The definition of static_key_slow_inc() has cpus_read_lock in place. In the virtio_net driver, XPS queues are initialized after setting the queue:cpu affinity in virtnet_set_affinity() which is already protected within cpus_read_lock. Lockdep prints a warning when we are trying to acquire cpus_read_lock when it is already held. This patch adds an ability to call __netif_set_xps_queue under cpus_read_lock(). Acked-by: Jason Wang <jasowang@redhat.com> ============================================ WARNING: possible recursive locking detected 4.18.0-rc3-next-20180703+ #1 Not tainted -------------------------------------------- swapper/0/1 is trying to acquire lock: 00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: static_key_slow_inc+0xe/0x20 but task is already holding lock: 00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0x513/0x5a0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(cpu_hotplug_lock.rw_sem); lock(cpu_hotplug_lock.rw_sem); * DEADLOCK * May be due to missing lock nesting notation 3 locks held by swapper/0/1: #0: 00000000244bc7da (&dev->mutex){....}, at: __driver_attach+0x5a/0x110 #1: 00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0x513/0x5a0 #2: 000000005cd8463f (xps_map_mutex){+.+.}, at: __netif_set_xps_queue+0x8d/0xc60 v2: move cpus_read_lock() out of __netif_set_xps_queue() Cc: "Nambiar, Amritha" <amritha.nambiar@intel.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Fixes: `8af2c06ff4` ("net-sysfs: Add interface for Rx queue(s) map per Tx queue") Signed-off-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:25:06 -07:00
Andrew Lunn	4005a7cb4f	net: phy: sftp: print debug message with text, not numbers Convert the state numbers, device state, etc from numbers to strings when printing debug messages. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:19:20 -07:00
Colin Ian King	51507c5f64	ethernet/qlogic: remove unused array msi_tgt_status Array msi_tgt_status is defined but never used, hence it is redundant and can be removed. Cleans up clang warning: warning: 'msi_tgt_status' defined but not used [-Wunused-const-variable=] Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:16:51 -07:00
Linus Walleij	933de7866b	net: dsa: rtl8366rb: Support port 4 (WAN) The totally undocumented IO mode needs to be set to enumerator 0 to enable port 4 also known as WAN in most configurations, for ordinary traffic. The 3 bits in the register come up as 010 after reset, but need to be set to 000. The Realtek source code contains a name for these bits, but no explanation of what the 8 different IO modes may be. Set it to zero for the time being and drop a comment so people know what is going on if they run into trouble. This "mode zero" works fine with the D-Link DIR-685 with RTL8366RB. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:15:00 -07:00
YueHaibing	cd16e5b233	mlxsw: spectrum_flower: use PTR_ERR_OR_ZERO() Fix ptr_ret.cocci warnings: drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c:543:1-3: WARNING: PTR_ERR_OR_ZERO can be used Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:14:03 -07:00
Jiri Pirko	63cc5bcc9f	net: sched: fix block->refcnt decrement Currently the refcnt is never decremented in case the value is not 1. Fix it by adding decrement in case the refcnt is not 1. Reported-by: Vlad Buslov <vladbu@mellanox.com> Fixes: `f71e0ca4db` ("net: sched: Avoid implicit chain 0 creation") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:12:04 -07:00
YueHaibing	0bab1cdc8c	decnet: fix using plain integer as NULL warning Fixes the following sparse warning: net/decnet/dn_route.c:407:30: warning: Using plain integer as NULL pointer net/decnet/dn_route.c:1923:22: warning: Using plain integer as NULL pointer Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:11:24 -07:00
YueHaibing	15693fd37f	net: skbuff.h: fix using plain integer as NULL warning Fixes the following sparse warning: ./include/linux/skbuff.h:2365:58: warning: Using plain integer as NULL pointer Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:10:01 -07:00
Petr Oros	98471b5b72	be2net: Use Kconfig flag to support for enabling/disabling adapters Add flags to enable/disable supported chips in be2net. With disable support are removed coresponding PCI IDs and also codepaths with [BE2\|BE3\|BEx\|lancer\|skyhawk]_chip checks. Disable chip will reduce module size by: BE2 ~2kb BE3 ~3kb Lancer ~10kb Skyhawk ~9kb When enable skyhawk only it will reduce module size by ~20kb New help style in Kconfig Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Petr Oros <poros@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:08:59 -07:00
Maria Pasechnik	eb95f52fc7	net: ipv6_gre: Fix GRO to work on IPv6 over GRE tap IPv6 GRO over GRE tap is not working while GRO is not set over the native interface. gro_list_prepare function updates the same_flow variable of existing sessions to 1 if their mac headers match the one of the incoming packet. same_flow is used to filter out non-matching sessions and keep potential ones for aggregation. The number of bytes to compare should be the number of bytes in the mac headers. In gro_list_prepare this number is set to be skb->dev->hard_header_len. For GRE interfaces this hard_header_len should be as it is set in the initialization process (when GRE is created), it should not be overridden. But currently it is being overridden by the value that is actually supposed to represent the needed_headroom. Therefore, the number of bytes compared in order to decide whether the the mac headers are the same is greater than the length of the headers. As it's documented in netdevice.h, hard_header_len is the maximum hardware header length, and needed_headroom is the extra headroom the hardware may need. hard_header_len is basically all the bytes received by the physical till layer 3 header of the packet received by the interface. For example, if the interface is a GRE tap then the needed_headroom should be the total length of the following headers: IP header of the physical, GRE header, mac header of GRE. It is often used to calculate the MTU of the created interface. This patch removes the override of the hard_header_len, and assigns the calculated value to needed_headroom. This way, the comparison in gro_list_prepare is really of the mac headers, and if the packets have the same mac headers the same_flow will be set to 1. Performance testing: 45% higher bandwidth. Measuring bandwidth of single-stream IPv4 TCP traffic over IPv6 GRE tap while GRO is not set on the native. NIC: ConnectX-4LX Before (GRO not working) : 7.2 Gbits/sec After (GRO working): 10.5 Gbits/sec Signed-off-by: Maria Pasechnik <mariap@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:07:38 -07:00
David S. Miller	333771526f	Merge branch 'qed-Enhancements' Manish Chopra says: ==================== qed*: Enhancements This patch series adds following support in drivers - 1. Egress mqprio offload. 2. Add destination IP based flow profile. 3. Ingress flower offload (for drop action). Please consider applying this series to "net-next". ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:05:31 -07:00
Manish Chopra	2ce9c93eac	qede: Ingress tc flower offload (drop action) support. The main motive of this patch is to lay down driver's tc offload infrastructure in place. With these changes tc can offload various supported flow profiles (4 tuples, src-ip, dst-ip, l4 port) for the drop action. Dropped flows statistic is a global counter for all the offloaded flows for drop action and is populated in ethtool statistics as common "gft_filter_drop". Examples - tc qdisc add dev p4p1 ingress tc filter add dev p4p1 protocol ipv4 parent ffff: flower \ skip_sw ip_proto tcp dst_ip 192.168.40.200 action drop tc filter add dev p4p1 protocol ipv4 parent ffff: flower \ skip_sw ip_proto udp src_ip 192.168.40.100 action drop tc filter add dev p4p1 protocol ipv4 parent ffff: flower \ skip_sw ip_proto tcp src_ip 192.168.40.100 dst_ip 192.168.40.200 \ src_port 453 dst_port 876 action drop tc filter add dev p4p1 protocol ipv4 parent ffff: flower \ skip_sw ip_proto tcp dst_port 98 action drop Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:05:30 -07:00
Manish Chopra	91a56adbf1	qede: Add destination ip based flow profile. This patch adds support for dropping and redirecting the flows based on destination IP in the packet. This also moves the profile mode settings in their own functions which can be used through tc flows in successive patch. For example - ethtool -N p5p1 flow-type tcp4 dst-ip 192.168.40.100 action -1 ethtool -N p5p1 flow-type udp4 dst-ip 192.168.50.100 action 1 ethtool -N p5p1 flow-type tcp4 dst-ip 192.168.60.100 action 0x100000000 Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-09 14:05:30 -07:00

... 2 3 4 5 6 ...

770703 Commits