linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2025-01-25 04:19:41 +07:00

Author	SHA1	Message	Date
Marek Vasut	806f66495e	net: ks8851: Remove ks8851_rdreg32() The ks8851_rdreg32() is used only in one place, to read two registers using a single read. To make it easier to support 16-bit accesses via parallel bus later on, replace this single read with two 16-bit reads from each of the registers and drop the ks8851_rdreg32() altogether. If this has noticeable performance impact on the SPI variant of KS8851, then we should consider using regmap to abstract the SPI and parallel bus options and in case of SPI, permit regmap to merge register reads of neighboring registers into single, longer, read. Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 16:30:04 -07:00
Marek Vasut	2c5b0a86ac	net: ks8851: Use dev_{get,set}_drvdata() Replace spi_{get,set}_drvdata() with dev_{get,set}_drvdata(), which works for both SPI and platform drivers. This is done in preparation for unifying the KS8851 SPI and parallel bus drivers. There should be no functional change. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 16:30:04 -07:00
Marek Vasut	b6948e1b7b	net: ks8851: Use devm_alloc_etherdev() Use device managed version of alloc_etherdev() to simplify the code. No functional change intended. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 16:30:04 -07:00
Marek Vasut	848fc0ce6c	net: ks8851: Pass device node into ks8851_init_mac() Since the driver probe function already has a struct device *dev pointer and can easily derive of_node pointer from it, pass the of_node pointer as a parameter to ks8851_init_mac() to avoid fishing it out from ks->spidev. This is the only reference to spidev in the function, so get rid of it. This is done in preparation for unifying the KS8851 SPI and parallel bus drivers. No functional change. Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 16:30:04 -07:00
Marek Vasut	2f3271c952	net: ks8851: Replace dev_err() with netdev_err() in IRQ handler Use netdev_err() instead of dev_err() to avoid accessing the spidev->dev in the interrupt handler. This is the only place which uses the spidev in this function, so replace it with netdev_err() to get rid of it. This is done in preparation for unifying the KS8851 SPI and parallel drivers. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 16:30:03 -07:00
Marek Vasut	bfd1e0eb08	net: ks8851: Rename ndev to netdev in probe Rename ndev variable to netdev for the sake of consistency. No functional change. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 16:30:03 -07:00
Marek Vasut	d320692d9f	net: ks8851: Factor out spi->dev in probe()/remove() Pull out the spi->dev into one common place in the function instead of having it repeated over and over again. This is done in preparation for unifying ks8851 and ks8851-mll drivers. No functional change. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 16:30:03 -07:00
David S. Miller	04d8262134	Merge branch 'vmxnet3-upgrade-to-version-4' Ronak Doshi says: ==================== vmxnet3: upgrade to version 4 vmxnet3 emulation has recently added several new features which includes offload support for tunnel packets, support for new commands the driver can issue to emulation, change in descriptor fields, etc. This patch series extends the vmxnet3 driver to leverage these new features. Compatibility is maintained using existing vmxnet3 versioning mechanism as follows: - new features added to vmxnet3 emulation are associated with new vmxnet3 version viz. vmxnet3 version 4. - emulation advertises all the versions it supports to the driver. - during initialization, vmxnet3 driver picks the highest version number supported by both the emulation and the driver and configures emulation to run at that version. In particular, following changes are introduced: Patch 1: This patch introduces utility macros for vmxnet3 version 4 comparison and updates Copyright information. Patch 2: This patch implements get_rss_hash_opts and set_rss_hash_opts methods to allow querying and configuring different Rx flow hash configurations which can be used to support UDP/ESP RSS. Patch 3: This patch introduces segmentation and checksum offload support for encapsulated packets. This avoids segmenting and calculating checksum for each segment and hence gives performance boost. Patch 4: With all vmxnet3 version 4 changes incorporated in the vmxnet3 driver, with this patch, the driver can configure emulation to run at vmxnet3 version 4. Changes in v3 -> v4: - Replaced BUG_ON() with WARN_ON_ONCE() Changes in v2 -> v3: - fixed get_rss_hash_opts to return correct values for udp rss Changes in v2: - Fixed compilation issue due to missing closed brace - added fallthrough comment ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 16:26:49 -07:00
Ronak Doshi	a31135e36e	vmxnet3: update to version 4 With all vmxnet3 version 4 changes incorporated in the vmxnet3 driver, the driver can configure emulation to run at vmxnet3 version 4, provided the emulation advertises support for version 4. Signed-off-by: Ronak Doshi <doshir@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 16:26:48 -07:00
Ronak Doshi	dacce2be33	vmxnet3: add geneve and vxlan tunnel offload support Vmxnet3 version 3 device supports checksum/TSO offload. Thus, vNIC to pNIC traffic can leverage hardware checksum/TSO offloads. However, vmxnet3 does not support checksum/TSO offload for Geneve/VXLAN encapsulated packets. Thus, for a vNIC configured with an overlay, the guest stack must first segment the inner packet, compute the inner checksum for each segment and encapsulate each segment before transmitting the packet via the vNIC. This results in significant performance penalty. This patch will enhance vmxnet3 to support Geneve/VXLAN TSO as well as checksum offload. Signed-off-by: Ronak Doshi <doshir@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 16:26:48 -07:00
Ronak Doshi	d3a8a9e5c3	vmxnet3: add support to get/set rx flow hash With vmxnet3 version 4, the emulation supports multiqueue(RSS) for UDP and ESP traffic. A guest can enable/disable RSS for UDP/ESP over IPv4/IPv6 by issuing commands introduced in this patch. ESP ipv6 is not yet supported in this patch. This patch implements get_rss_hash_opts and set_rss_hash_opts methods to allow querying and configuring different Rx flow hash configurations. Signed-off-by: Ronak Doshi <doshir@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 16:26:48 -07:00
Ronak Doshi	123db31d01	vmxnet3: prepare for version 4 changes vmxnet3 is currently at version 3 and this patch initiates the preparation to accommodate changes for version 4. Introduced utility macros for vmxnet3 version 4 comparison and update Copyright information. Signed-off-by: Ronak Doshi <doshir@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 16:26:48 -07:00
Arnd Bergmann	b113cabd43	sfc: avoid an unused-variable warning 'nic_data' is no longer used outside of the #ifdef block in efx_ef10_set_mac_address: drivers/net/ethernet/sfc/ef10.c:3231:28: error: unused variable 'nic_data' [-Werror,-Wunused-variable] struct efx_ef10_nic_data *nic_data = efx->nic_data; Move the variable into a local scope. Fixes: `dfcabb0788` ("sfc: move vport_id to struct efx_nic") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 12:49:06 -07:00
David S. Miller	62c027883c	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2020-05-27 This series contains updates to the ice driver only. Jesse fixes a number of issues, starting with fixing the remaining signed versus unsigned comparison issues. Cleaned up an unused code define. Fixed the implementation of the manage MAC write command, to simplify it by using a simple array to represent the MAC address when writing it. Paul fixes the setting of the VF default LAN address, by removing a check that assumed that the address had been deleted and zeroed. Surabhi prevents a memory leak on filter management initialization failures and during queue initialization and buffer allocation failures. Brett adds additional receive error counters that are reported by ethtool. Fixed the enabling and disabling of VLAN stripping when the PVID has been set. Evan fixes a race condition between the firmware and software, which can occur between the admin queue setup and the first command sent. Marta fixes the driver when XDP transmit rings are destroyed, also make sure the XDP transmit queues are also destroyed. Update the statistics when XDP transmit programs are loaded and packets are sent. Changed the number of XDP transmit queues to match the number of receive queues, instead of matching the number of transmit queues. Bruce avoids undefined behavior by not writing the 8-bit element init_q_state with the associated internal-to-hardware field which is 122-bits. Anirudh (Ani) refactors the receive checksum checks. Krzysztof notifies the user if the fill queue is not long enough to prepare all buffers before packet processing starts and allocates the buffers during the NAPI poll. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:17:20 -07:00
David S. Miller	1e372dbd68	Merge branch 'remove-most-callers-of-kernel_setsockopt-v3' Christoph Hellwig says: ==================== remove most callers of kernel_setsockopt v3 this series removes most callers of the kernel_setsockopt functions, and instead switches their users to small functions that implement setting a sockopt directly using a normal kernel function call with type safety and all the other benefits of not having a function call. In some cases these functions seem pretty heavy handed as they do a lock_sock even for just setting a single variable, but this mirrors the real setsockopt implementation unlike a few drivers that just set set the fields directly. Changes since v2: - drop the separately merged kernel_getopt_removal - drop the sctp patches, as there is conflicting cleanup going on - add an additional ACK for the rxrpc changes Changes since v1: - use ->getname for sctp sockets in dlm - add a new ->bind_add struct proto method for dlm/sctp - switch the ipv6 and remaining sctp helpers to inline function so that the ipv6 and sctp modules are not pulled in by any module that could potentially use ipv6 or sctp connections - remove arguments to various sock_* helpers that are always used with the same constant arguments ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:46 -07:00
Christoph Hellwig	095ae61253	tipc: call tsk_set_importance from tipc_topsrv_create_listener Avoid using kernel_setsockopt for the TIPC_IMPORTANCE option when we can just use the internal helper. The only change needed is to pass a struct sock instead of tipc_sock, which is private to socket.c Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:46 -07:00
Christoph Hellwig	298cd88a66	rxrpc: add rxrpc_sock_set_min_security_level Add a helper to directly set the RXRPC_MIN_SECURITY_LEVEL sockopt from kernel space without going through a fake uaccess. Thanks to David Howells for the documentation updates. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:46 -07:00
Christoph Hellwig	7d7207c2d5	ipv6: add ip6_sock_set_recvpktinfo Add a helper to directly set the IPV6_RECVPKTINFO sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:46 -07:00
Christoph Hellwig	18d5ad6232	ipv6: add ip6_sock_set_addr_preferences Add a helper to directly set the IPV6_ADD_PREFERENCES sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:46 -07:00
Christoph Hellwig	fce934949c	ipv6: add ip6_sock_set_recverr Add a helper to directly set the IPV6_RECVERR sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	9b115749ac	ipv6: add ip6_sock_set_v6only Add a helper to directly set the IPV6_V6ONLY sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	c1f9ec5776	ipv4: add ip_sock_set_pktinfo Add a helper to directly set the IP_PKTINFO sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	2de569bda2	ipv4: add ip_sock_set_mtu_discover Add a helper to directly set the IP_MTU_DISCOVER sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> [rxrpc bits] Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	db45c0ef25	ipv4: add ip_sock_set_recverr Add a helper to directly set the IP_RECVERR sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	c4e446bf5a	ipv4: add ip_sock_set_freebind Add a helper to directly set the IP_FREEBIND sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	6ebf71bab9	ipv4: add ip_sock_set_tos Add a helper to directly set the IP_TOS sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	480aeb9639	tcp: add tcp_sock_set_keepcnt Add a helper to directly set the TCP_KEEPCNT sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	d41ecaac90	tcp: add tcp_sock_set_keepintvl Add a helper to directly set the TCP_KEEPINTVL sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	71c48eb81c	tcp: add tcp_sock_set_keepidle Add a helper to directly set the TCP_KEEP_IDLE sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	c488aeadcb	tcp: add tcp_sock_set_user_timeout Add a helper to directly set the TCP_USER_TIMEOUT sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	557eadfcc5	tcp: add tcp_sock_set_syncnt Add a helper to directly set the TCP_SYNCNT sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	ddd061b8da	tcp: add tcp_sock_set_quickack Add a helper to directly set the TCP_QUICKACK sockopt from kernel space without going through a fake uaccess. Cleanup the callers to avoid pointless wrappers now that this is a simple function call. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	12abc5ee78	tcp: add tcp_sock_set_nodelay Add a helper to directly set the TCP_NODELAY sockopt from kernel space without going through a fake uaccess. Cleanup the callers to avoid pointless wrappers now that this is a simple function call. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	db10538a4b	tcp: add tcp_sock_set_cork Add a helper to directly set the TCP_CORK sockopt from kernel space without going through a fake uaccess. Cleanup the callers to avoid pointless wrappers now that this is a simple function call. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	fe31a326a4	net: add sock_set_reuseport Add a helper to directly set the SO_REUSEPORT sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:45 -07:00
Christoph Hellwig	26cfabf9cd	net: add sock_set_rcvbuf Add a helper to directly set the SO_RCVBUFFORCE sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:44 -07:00
Christoph Hellwig	ce3d9544ce	net: add sock_set_keepalive Add a helper to directly set the SO_KEEPALIVE sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:44 -07:00
Christoph Hellwig	783da70e83	net: add sock_enable_timestamps Add a helper to directly enable timestamps instead of setting the SO_TIMESTAMP* sockopts from kernel space and going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:44 -07:00
Christoph Hellwig	7594888c78	net: add sock_bindtoindex Add a helper to directly set the SO_BINDTOIFINDEX sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:44 -07:00
Christoph Hellwig	76ee0785f4	net: add sock_set_sndtimeo Add a helper to directly set the SO_SNDTIMEO_NEW sockopt from kernel space without going through a fake uaccess. The interface is simplified to only pass the seconds value, as that is the only thing needed at the moment. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:44 -07:00
Christoph Hellwig	6e43496745	net: add sock_set_priority Add a helper to directly set the SO_PRIORITY sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:44 -07:00
Christoph Hellwig	c433594c07	net: add sock_no_linger Add a helper to directly set the SO_LINGER sockopt from kernel space with onoff set to true and a linger time of 0 without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:44 -07:00
Christoph Hellwig	b58f0e8f38	net: add sock_set_reuseaddr Add a helper to directly set the SO_REUSEADDR sockopt from kernel space without going through a fake uaccess. For this the iscsi target now has to formally depend on inet to avoid a mostly theoretical compile failure. For actual operation it already did depend on having ipv4 or ipv6 support. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:11:44 -07:00
David S. Miller	1eba1110f0	mlx5-updates-2020-05-26 Updates highlights: 1) From Vu Pham (8): Support VM traffics failover with bonded VF representors and e-switch egress/ingress ACLs This series introduce the support for Virtual Machine running I/O traffic over direct/fast VF path and failing over to slower paravirtualized path using the following features: __________________________________ \| VM _________________ \| \| \|FAILOVER device \| \| \| \|________________\| \| \| \| \| \| ____\|_____ \| \| \| \| \| \| ______ \|___ ____\|_______ \| \| \| VF PT \| \|VIRTIO-NET \| \| \| \| device \| \| device \| \| \| \|_________\| \|___________\| \| \|___________\|______________\|________\| \| \| \| HYPERVISOR \| \| ____\|______ \| \| macvtap \| \| \|virtio BE \| \| \|___________\| \| \| \| ____\|_____ \| \|host VF \| \| \|_________\| \| \| _____\|______ _____\|_____ \| PT VF \| \| host VF \| \|representor\| \|representor\| \|___________\| \|___________\| \ / \ / \ / \ / _________________ \_______/ \| \| _______\|________ \| V-SWITCH \| \|VF representors \|________________\| (OVS) \| \| bond \| \|________________\| \|________________\| \| ________\|________ \| Uplink \| \| representor \| \|_________________\| Summary: -------- Problem statement: ------------------ Currently in above topology, when netfailover device is configured using VFs and eswitch VF representors, and when traffic fails over to stand-by VF which is exposed using macvtap device to guest VM, eswitch fails to switch the traffic to the stand-by VF representor. This occurs because there is no knowledge at eswitch level of the stand-by representor device. Solution: --------- Using standard bonding driver, a bond netdevice is created over VF representor device which is used for offloading tc rules. Two VF representors are bonded together, one for the passthrough VF device and another one for the stand-by VF device. With this solution, mlx5 driver listens to the failover events occuring at the bond device level to failover traffic to either of the active VF representor of the bond. a. VM with netfailover device of VF pass-thru (PT) device and virtio-net paravirtualized device with same MAC-address to handle failover traffics at VM level. b. Host bond is active-standby mode, with the lower devices being the VM VF PT representor, and the representor of the 2nd VF to handle failover traffics at Hypervisor/V-Switch OVS level. - During the steady state (fast datapath): set the bond active device to be the VM PT VF representor. - During failover: apply bond failover to the second VF representor device which connects to the VM non-accelerated path. c. E-Switch ingress/egress ACL tables to support failover traffics at E-Switch level I. E-Switch egress ACL with forward-to-vport rule: - By default, eswitch vport egress acl forward packets to its counterpart NIC vport. - During port failover, the egress acl forward-to-vport rule will be added to e-switch vport of passive/in-active slave VF representor to forward packets to other e-switch vport ie. the active slave representor's e-switch vport to handle egress "failover" traffics. - Using lower change netdev event to detect a representor is a lower dev (slave) of bond and becomes active, adding egress acl forward-to-vport rule of all other slave netdevs to forward to this representor's vport. - Using upper change netdev event to detect a representor unslaving from bond device to delete its vport's egress acl forward-to-vport rule. II. E-Switch ingress ACL metadata reg_c for match - Bonded representors' vorts sharing tc block have the same root ingress acl table and a unique metadata for match. - Traffics from both representors's vports will be tagged with same unique metadata reg_c. - Using upper change netdev event to detect a representor enslaving/unslaving from bond device to setup shared root ingress acl and unique metadata. 2) From Alex Vesker (2): Slpit RX and TX lock for parallel rule insertion in software steering 3) Eli Britstein (2): Optimize performance for IPv4/IPv6 ethertype use the HW ip_version register rather than parsing eth frames for ethertype. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl7PEFAACgkQSD+KveBX +j4Z5Af+NYwihYZpQYBBN00K7Wu10XZ65u5MbGSDmzpdN62w0kKfjsJ70bb9aiws h8LC7lspdMLRMMn9pWwFKshyF6RoSD9Ku3ZYhUbtj+hJLElAd9IwGt6pPKr8hPDd 9h+ZcBkacdhNwWKf7CKThic0c/0PLdVyzRysHxcQWKSMPCTdgiL5Z3PQHA0TM6J3 6Excs2z7kSuuyyxQ1cyWCaqSz4rqCrYyd8Ws4HOPhXgSbX14Q3mtMsBDayx2gHNW rdVbaNN6s2o0TxbrCwd0AaNP3UWcnjNqu1ohxgJiSe8y+MHMoB0OMoO+6vQJnwNI bzpZEioswV1zdgK3qNmXqbHOiHRSVQ== =xM1D -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2020-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-05-26 Updates highlights: 1) From Vu Pham (8): Support VM traffics failover with bonded VF representors and e-switch egress/ingress ACLs This series introduce the support for Virtual Machine running I/O traffic over direct/fast VF path and failing over to slower paravirtualized path using the following features: __________________________________ \| VM _________________ \| \| \|FAILOVER device \| \| \| \|________________\| \| \| \| \| \| ____\|_____ \| \| \| \| \| \| ______ \|___ ____\|_______ \| \| \| VF PT \| \|VIRTIO-NET \| \| \| \| device \| \| device \| \| \| \|_________\| \|___________\| \| \|___________\|______________\|________\| \| \| \| HYPERVISOR \| \| ____\|______ \| \| macvtap \| \| \|virtio BE \| \| \|___________\| \| \| \| ____\|_____ \| \|host VF \| \| \|_________\| \| \| _____\|______ _____\|_____ \| PT VF \| \| host VF \| \|representor\| \|representor\| \|___________\| \|___________\| \ / \ / \ / \ / _________________ \_______/ \| \| _______\|________ \| V-SWITCH \| \|VF representors \|________________\| (OVS) \| \| bond \| \|________________\| \|________________\| \| ________\|________ \| Uplink \| \| representor \| \|_________________\| Summary: -------- Problem statement: ------------------ Currently in above topology, when netfailover device is configured using VFs and eswitch VF representors, and when traffic fails over to stand-by VF which is exposed using macvtap device to guest VM, eswitch fails to switch the traffic to the stand-by VF representor. This occurs because there is no knowledge at eswitch level of the stand-by representor device. Solution: --------- Using standard bonding driver, a bond netdevice is created over VF representor device which is used for offloading tc rules. Two VF representors are bonded together, one for the passthrough VF device and another one for the stand-by VF device. With this solution, mlx5 driver listens to the failover events occuring at the bond device level to failover traffic to either of the active VF representor of the bond. a. VM with netfailover device of VF pass-thru (PT) device and virtio-net paravirtualized device with same MAC-address to handle failover traffics at VM level. b. Host bond is active-standby mode, with the lower devices being the VM VF PT representor, and the representor of the 2nd VF to handle failover traffics at Hypervisor/V-Switch OVS level. - During the steady state (fast datapath): set the bond active device to be the VM PT VF representor. - During failover: apply bond failover to the second VF representor device which connects to the VM non-accelerated path. c. E-Switch ingress/egress ACL tables to support failover traffics at E-Switch level I. E-Switch egress ACL with forward-to-vport rule: - By default, eswitch vport egress acl forward packets to its counterpart NIC vport. - During port failover, the egress acl forward-to-vport rule will be added to e-switch vport of passive/in-active slave VF representor to forward packets to other e-switch vport ie. the active slave representor's e-switch vport to handle egress "failover" traffics. - Using lower change netdev event to detect a representor is a lower dev (slave) of bond and becomes active, adding egress acl forward-to-vport rule of all other slave netdevs to forward to this representor's vport. - Using upper change netdev event to detect a representor unslaving from bond device to delete its vport's egress acl forward-to-vport rule. II. E-Switch ingress ACL metadata reg_c for match - Bonded representors' vorts sharing tc block have the same root ingress acl table and a unique metadata for match. - Traffics from both representors's vports will be tagged with same unique metadata reg_c. - Using upper change netdev event to detect a representor enslaving/unslaving from bond device to setup shared root ingress acl and unique metadata. 2) From Alex Vesker (2): Slpit RX and TX lock for parallel rule insertion in software steering 3) Eli Britstein (2): Optimize performance for IPv4/IPv6 ethertype use the HW ip_version register rather than parsing eth frames for ethertype. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:04:12 -07:00
Eric Dumazet	d29245692a	tcp: ipv6: support RFC 6069 (TCP-LD) Make tcp_ld_RTO_revert() helper available to IPv6, and implement RFC 6069 : Quoting this RFC : 3. Connectivity Disruption Indication For Internet Protocol version 6 (IPv6) [RFC2460], the counterpart of the ICMP destination unreachable message of code 0 (net unreachable) and of code 1 (host unreachable) is the ICMPv6 destination unreachable message of code 0 (no route to destination) [RFC4443]. As with IPv4, a router should generate an ICMPv6 destination unreachable message of code 0 in response to a packet that cannot be delivered to its destination address because it lacks a matching entry in its routing table. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:02:46 -07:00
Vladimir Oltean	4d7525085a	net: dsa: sja1105: offload the Credit-Based Shaper qdisc SJA1105, being AVB/TSN switches, provide hardware assist for the Credit-Based Shaper as described in the IEEE 8021Q-2018 document. First generation has 10 shapers, freely assignable to any of the 4 external ports and 8 traffic classes, and second generation has 16 shapers. The Credit-Based Shaper tables are accessed through the dynamic reconfiguration interface, so we have to restore them manually after a switch reset. The tables are backed up by the static config only on P/Q/R/S, and we don't want to add custom code only for that family, since the procedure that is in place now works for both. Tested with the following commands: data_rate_kbps=67000 port_transmit_rate_kbps=1000000 idleslope=$data_rate_kbps sendslope=$(($idleslope - $port_transmit_rate_kbps)) locredit=$((-0x80000000)) hicredit=$((0x7fffffff)) tc qdisc add dev swp2 root handle 1: mqprio hw 0 num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 tc qdisc replace dev swp2 parent 1:1 cbs \ idleslope $idleslope \ sendslope $sendslope \ hicredit $hicredit \ locredit $locredit \ offload 1 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:01:22 -07:00
David Ahern	7c741868ce	selftests: Add torture tests to nexthop tests Add Nik's torture tests as a new set to stress the replace and cleanup paths. Torture test created by Nikolay Aleksandrov and then I adapted to selftest and added IPv6 version. Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-28 11:00:31 -07:00
Krzysztof Kazimierczak	3f0d97cdfe	ice: Check UMEM FQ size when allocating bufs If a UMEM is present on a queue when an interface/queue pair is being enabled, the driver will try to prepare the Rx buffers in advance to improve performance. However, if fill queue is shorter than HW Rx ring, the driver will report failure after getting the last address from the fill queue. This still lets the driver process the packets correctly during the NAPI poll, but leads to a constant NAPI rescheduling. Not allocating the buffers in advance would result in a potential performance decrease. Commit `d57d76428a` ("xsk: Add API to check for available entries in FQ") provides an API that lets drivers check the number of addresses that the fill queue holds. Notify the user if fill queue is not long enough to prepare all buffers before packet processing starts, and allocate the buffers during the NAPI poll. If the fill queue size is sufficient, prepare Rx buffers in advance. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-05-27 18:13:59 -07:00
Alex Vesker	ed03a418ab	net/mlx5: DR, Split RX and TX lock for parallel insertion Change the locking flow to support RX and TX locks, splitting the single lock to two will allow inserting rules in parallel for RX and TX parts of the FDB. Locking the dr_domain will be done by locking the RX domain and the TX domain locks, this is mostly used for control operations on the dr_domain. When inserting rules for RX or TX the single nic_doamin RX or TX lock will be used. Splitting the lock is safe since RX and TX domains are logically separated from each other, shared objects such the send-ring and memory pool are protected by locks. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-05-27 18:13:52 -07:00
Alex Vesker	cedb28191f	net/mlx5: DR, Add a spinlock to protect the send ring Adding this lock will allow writing steering entries without locking the dr_domain and allow parallel insertion. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-05-27 18:13:51 -07:00

1 2 3 4 5 ...

919691 Commits