linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-21 17:29:05 +07:00

Author	SHA1	Message	Date
Saeed Mahameed	31871f87bb	net/mlx5e: Move XDP SQ instance into RQ To save many rq->channel->sq dereferences in fast-path. And rename it to xdpsq. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:45 -07:00
Saeed Mahameed	eba2db2bd2	net/mlx5e: Move mlx5e_rq struct declaration Move struct mlx5e_rq and friends to appear after mlx5e_sq declaration in en.h. We will need this for next patch to move the mlx5e_sq instance into mlx5e_rq struct for XDP SQs. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:45 -07:00
Saeed Mahameed	1c4bf94045	net/mlx5e: Move XDP completion functions to rx file XDP code belongs to RX path, move mlx5e_poll_xdp_tx_cq and mlx5e_free_xdp_tx_descs to en_rx.c. Rename them to mlx5e_poll_xdpsq_cq and mlx5e_free_xdpsq_descs. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:45 -07:00
Saeed Mahameed	aff2615763	net/mlx5e: Single bfreg (UAR) for all mlx5e SQs and netdevs One is sufficient since Blue Flame is not supported anymore. This will also come in handy for switchdev mode to save resources, since VF representors will use same single UAR as well for their own SQs. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:45 -07:00
Saeed Mahameed	6982ab6097	net/mlx5e: Xmit, no write combining mlx5e netdev Blue Flame (write combining) support demands a lot of overhead for a little latency gain for some special cases, this overhead is hurting the common case. Here we remove xmit Blue Flame support by creating all bfregs with no write combining for all SQs, and we remove a lot of BF logic and conditions from xmit data path. Simplify mlx5e_tx_notify_hw (doorbell function) by removing BF related code and by removing one memory barrier needed for WC mapped SQ doorbell buffers, which no longer exist. Performance improvement: System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz Test case Before Now improvement --------------------------------------------------------------- TX packets (24 threads) 50Mpps 54Mpps 8% Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:45 -07:00
Saeed Mahameed	80fe326ab8	net/mlx5e: Use dma_rmb rather than rmb in CQE fetch routine Use dma_rmb in mlx5e_get_cqe rather than aggressive rmb (at least on some architectures), this should help improve the performance on such CPU archs where dma_rmb is optimized. Performance improvement: System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz Test case Baseline Now improvement --------------------------------------------------------------- TX packets (24 threads) 45Mpps 50Mpps 11% TC stack Drop (1 core) 3.45Mpps 3.6Mpps 5% XDP Drop (1 core) 14Mpps 16.9Mpps 20% XDP TX (1 core) 10.4Mpps 12Mpps 15% Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:44 -07:00
Florian Fainelli	68e498554f	net: dsa: bcm_sf2: Add missing OF_MDIO dependency bcm_sf2 does require the MDIO_BCM_UNIMAC driver which is now dependent on OF_MDIO but also internally uses of_mdio.c provided routines which are guarded with OF_MDIO. Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: `90eff9096c` ("net: phy: Allow splitting MDIO bus/device support from PHYs") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 15:03:06 -07:00
David S. Miller	f106d16472	Merge branch 'ipv6-sr-perf-improvements' David Lebrun says: ==================== Performances improvement for IPv6 Segment Routing This patch series improves the performances of IPv6 SR by optimizing skb head reallocation and extending the use of dst_cache. The overall performances improve by 35%. Before patch series (SRH encap): Result: OK: 7348320(c7347271+d1048) usec, 5000000 (1000byte,0frags) 680427pps 5443Mb/sec (5443416000bps) errors: 0 After patch series (SRH encap): Result: OK: 4774543(c4774084+d459) usec, 5000000 (1000byte,0frags) 1047220pps 8377Mb/sec (8377760000bps) errors: 0 Baseline for plain IPv6 forwarding: Result: OK: 4244144(c4243722+d422) usec, 5000000 (1000byte,0frags) 1178093pps 9424Mb/sec (9424744000bps) errors: 0 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 14:47:32 -07:00
David Lebrun	af4a2209b1	ipv6: sr: use dst_cache in seg6_input We already use dst_cache in seg6_output, when handling locally generated packets. We extend it in seg6_input, to also handle forwarded packets, and avoid unnecessary fib lookups. Performances for SRH encapsulation before the patch: Result: OK: 5656067(c5655678+d388) usec, 5000000 (1000byte,0frags) 884006pps 7072Mb/sec (7072048000bps) errors: 0 Performances after the patch: Result: OK: 4774543(c4774084+d459) usec, 5000000 (1000byte,0frags) 1047220pps 8377Mb/sec (8377760000bps) errors: 0 Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 14:47:32 -07:00
David Lebrun	19d5a26f5e	ipv6: sr: expand skb head only if necessary To insert or encapsulate a packet with an SRH, we need a large enough skb headroom. Currently, we are using pskb_expand_head to unconditionally increase the size of the headroom by the amount needed by the SRH (and IPv6 header). If this reallocation is performed by another CPU than the one that initially allocated the skb, then when the initial CPU kfree the skb, it will enter the __slab_free slowpath, impacting performances. This patch replaces pskb_expand_head with skb_cow_head, that will reallocate the skb head only if the headroom is not large enough. Performances for SRH encapsulation before the patch: Result: OK: 7348320(c7347271+d1048) usec, 5000000 (1000byte,0frags) 680427pps 5443Mb/sec (5443416000bps) errors: 0 Performances after the patch: Result: OK: 5656067(c5655678+d388) usec, 5000000 (1000byte,0frags) 884006pps 7072Mb/sec (7072048000bps) errors: 0 Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 14:47:32 -07:00
Geliang Tang	3b1af93cf1	net_sched: use setup_deferrable_timer Use setup_deferrable_timer() instead of init_timer_deferrable() to simplify the code. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 14:42:52 -07:00
David S. Miller	ff41c7fa64	Merge branch 'mlxsw-query-resources' Jiri Pirko says: ==================== mlxsw: Query resources from firmware Ido says: Some parts of the driver already use the resource query mechanism, but in other parts we still rely on hard coded values that may change over time. This patchset removes most of these remaining values and queries them from the firmware instead. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:29 -07:00
Ido Schimmel	18281f2dab	mlxsw: spectrum: Query cell size from firmware As explained in the previous patch, the cell size may change in future devices, so query it from the firmware instead of hard coding it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:29 -07:00
Ido Schimmel	f417f04da5	mlxsw: spectrum: Refactor port buffer configuration The sizes and thresholds of the priority group (PG) buffers are configured in cells, which represent a specific amount of bytes. The cell size can vary in different devices, so it's better to query it from the firmware than hard coding it. Refactor the code dealing with this value into different functions, so that it will be easier to make the conversion in the next patch. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:29 -07:00
Ido Schimmel	d3daae1b08	mlxsw: spectrum_buffers: Query shared buffer size from firmware Instead of hard coding the size of the shared buffer in the driver, query it from the firmware, as it may change in future devices. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:28 -07:00
Ido Schimmel	5ec2ee7dd2	mlxsw: Query maximum number of ports from firmware We currently hard code the maximum number of ports in the driver, but this may change in future devices, so query it from the firmware instead. Fallback to a maximum of 64 ports in case this number can't be queried. This should only happen in SwitchX-2 for which this number is correct. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:28 -07:00
Ido Schimmel	8494ab06e0	mlxsw: spectrum_router: Query number of LPM trees from firmware Instead of hard coding the number of LPM trees in the driver, query it from the firmware, as it may change in future devices. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:28 -07:00
David S. Miller	ba82427d4a	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-03-23 This series contains updates to i40e and i40e.txt documentation. Jake provides all the changes in the series which are centered around ntuple filter fixes and additional support. Fixed the current implementation of .set_rxnfc, where we were not reading the mask field for filter entries which was resulting in filters not behaving as expected and not working correctly. When cleaning up after disabling flow director support, ensure that the default input set is correctly reprogrammed. Since the hardware only supports a single input set for all flows of that type, the driver shall only allow the input set to change if there are no other configured filters for that flow type, so add support to detect when we can update the input set for each flow type. Align the driver to other drivers to partition the ring_cookie value into 8bits of VF index, along with 32bits of queue number instead of using the user-def field. Added support to parse the user-def field into a data structure format to allow future extensions of the user-def field by keeping all the code that read/writes the field into a single location. Added support for flexible payloads passed via ethtool user-def field. We support a single flexible word (2byte) value per protocol type, and we handle the FLX_PIT register using a list of flexible entries so that each flow type may be configured separately. Enabled flow director filters for SCTPv4 packets using the ethtool ntuple interface to enable filters. Updated the documentation on the i40e driver to include the newly added support to ntuple filters. Reduced complexity of a if-continue-else-break section of code by taking advantage of using hlist_for_each_entry_continue() instead. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:45:07 -07:00
David Ahern	6a18c31232	net: mpls: Fix setting ttl_propagate for rt2 Fix copy and paste error setting rt_ttl_propagate. Fixes: `5b441ac878` ("mpls: allow TTL propagation to IP packets to be configured") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:29:17 -07:00
Gao Feng	c48367427a	tcp: sysctl: Fix a race to avoid unexpected 0 window from space Because sysctl_tcp_adv_win_scale could be changed any time, so there is one race in tcp_win_from_space. For example, 1.sysctl_tcp_adv_win_scale<=0 (sysctl_tcp_adv_win_scale is negative now) 2.space>>(-sysctl_tcp_adv_win_scale) (sysctl_tcp_adv_win_scale is positive now) As a result, tcp_win_from_space returns 0. It is unexpected. Certainly if the compiler put the sysctl_tcp_adv_win_scale into one register firstly, then use the register directly, it would be ok. But we could not depend on the compiler behavior. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:29:16 -07:00
Alexey Dobriyan	e013fb7c4c	net: make in_aton() 32-bit internally Converting IPv4 address doesn't need 64-bit arithmetic. Space savings: 10 bytes! add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-10 (-10) function old new delta in_aton 96 86 -10 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:27:19 -07:00
Felix Manlunas	7cc61db9c7	liquidio: do not reset Octeon if NIC firmware was preloaded The PF driver is incorrectly resetting Octeon when the module parameter "fw_type=none" is there. "fw_type=none" means the PF should not load any firmware to the NIC because Octeon is already running preloaded firmware. Fix it by putting an if (fw_type != none) around the reset code. Because the Octeon reset is now conditionally gone, when unloading the driver, conditionally send the RESET_PF command to the firmware who will then free up PF-related data structures. Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:20:43 -07:00
subashab@codeaurora.org	dddb64bcb3	net: Add sysctl to toggle early demux for tcp and udp Certain system process significant unconnected UDP workload. It would be preferable to disable UDP early demux for those systems and enable it for TCP only. By disabling UDP demux, we see these slight gains on an ARM64 system- 782 -> 788Mbps unconnected single stream UDPv4 633 -> 654Mbps unconnected UDPv4 different sources The performance impact can change based on CPU architecture and cache sizes. There will not much difference seen if entire UDP hash table is in cache. Both sysctls are enabled by default to preserve existing behavior. v1->v2: Change function pointer instead of adding conditional as suggested by Stephen. v2->v3: Read once in callers to avoid issues due to compiler optimizations. Also update commit message with the tests. v3->v4: Store and use read once result instead of querying pointer again incorrectly. v4->v5: Refactor to avoid errors due to compilation with IPV6={m,n} Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Suggested-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Tom Herbert <tom@herbertland.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:17:07 -07:00
David S. Miller	8fa96e3bf6	Merge branch 'systemport-tx-napi-improvements' Florian Fainelli says: ==================== net: systemport: TX/NAPI improvements This patch series builds up on Doug's latest changes done in BCMGENET to reduce the number of spurious interrupts in NAPI, simplify pointer arithmetic and finally tracking of per TX ring statistics to be SMP friendly. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:53:15 -07:00
Florian Fainelli	e9d7af78b2	net: systemport: Simplify circular pointer arithmetic Similar to `c298ede2fe` ("net: bcmgenet: simplify circular pointer arithmetic") we don't need complex arithmetic since we always have a ring size that is a power of 2. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:53:15 -07:00
Florian Fainelli	6baa785a9c	net: systemport: Clear status to reduce spurious interrupts Do something similar to commit `d5810ca325` ("net: bcmgenet: clear status to reduce spurious interrupts") and clear interrupts right before servicing them. This reduces the number of interrupts by 10K interrupts/sec for a TX TCP session 1Gbits/sec. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:53:14 -07:00
Florian Fainelli	30defeb2fb	net: systemport: Track per TX ring statistics bcm_sysport_tx_reclaim_one() is currently summing TX bytes/packets in a way that is not SMP friendly, multiple CPUs could run bcm_sysport_tx_reclaim_one() independently and still update stats->tx_bytes and stats->tx_packets, clobbering the other CPUs statistics. Fix this by tracking per TX rings the number of bytes, packets, dropped and errors statistics, and provide a bcm_sysport_get_nstats() function which aggregates everything and returns a consistent output. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:53:14 -07:00
David S. Miller	12459cbd98	Merge branch 'phy-mdio-split' Florian Fainelli says: ==================== net: phy: Allow splitting MDIO bus/device support This patch series allows building support for MDIO bus controllers which are sometimes usable and necessary in cases where there are no Ethernet PHYs. Changes in v3: - corrected of_mdio compile guards for prototypes vs. stubs - added a missing OF_MDIO dependency for MDIO_BCM_UNIMAC - fixed Kbuild bot reported errors against mdio-bitbang Changes in v2: - implement Russell's feedback - solve the circular dependency in the CONFIG_MDIO_DEVICE + CONFIG_PHYLIB case ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:51:05 -07:00
Florian Fainelli	90eff9096c	net: phy: Allow splitting MDIO bus/device support from PHYs Introduce a new configuration symbol: MDIO_DEVICE which allows building the MDIO devices and bus code, without pulling in the entire Ethernet PHY library and devices code. PHYLIB now selects MDIO_DEVICE and the relevant Makefile files are updated to reflect that. When MDIO_DEVICE (MDIO bus/device only) is selected, but not PHYLIB, we have mdio-bus.ko as a loadable module, and it does not have a module_exit() function because the safety of removing a bus class is unclear. When both MDIO_DEVICE and PHYLIB are enabled, we need to assemble everything into a common loadable module: libphy.ko because of nasty circular dependencies between phy.c, phy_device.c and mdio_bus.c which are really tough to untangle. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:51:05 -07:00
Florian Fainelli	17487eebaf	net: phy: MDIO_BCM_UNIMAC should depend on OF_MDIO The Broadcom MDIO UniMAC driver uses routines provided by of_mdio.c which is guarded by CONFIG_OF_MDIO. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:51:04 -07:00
Florian Fainelli	e6e14f63d7	of_mdio: Correct check against CONFIG_OF CONFIG_OF_MDIO is actually what triggers the build of drivers/of/of_mdio.c, so providing inline stubs when CONFIG_OF_MDIO=y should be based on that symbol as well. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:51:04 -07:00
Jiri Pirko	5952fde10c	net: sched: choke: remove dead filter classify code sch_choke is classless qdisc so it does not define cl_ops. Therefore filter_list cannot be ever changed, being NULL all the time. Reason is this check in tc_ctl_tfilter: /* Is it classful? */ cops = q->ops->cl_ops; if (!cops) return -EINVAL; So remove this dead code. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:47:10 -07:00
LABBE Corentin	270c7759fb	net: stmmac: add set_mac to the stmmac_ops Two different set_mac functions exists but stmmac_dwmac4_set_mac() is only used for enabling and never for disabling. So on dwmac4, the MAC RX/TX is never disabled. This patch add a generic function pointer set_mac() to stmmac_ops and replace all call to stmmac_set_mac/stmmac_dwmac4_set_mac by a call to this pointer. Since dwmac4_ops is const, set_mac cannot be modified after, and so dwmac4_ops is duplicated like dwmac4_dma_ops. Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:36:42 -07:00
Geliang Tang	aff55a3638	isdn: use setup_timer Use setup_timer() instead of init_timer() to simplify the code. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:33:42 -07:00
David S. Miller	9096643856	Merge branch 'bridge-ext-learned-entries' Nikolay Aleksandrov says ==================== net: bridge: allow user-space to add ext learned entries This set adds the ability to add externally learned entries from user-space. For symmetry and proper function we need to allow SW entries to take over HW learned ones (similar to how HW can take over SW entries currently) which is needed for our use case (evpn) where we have pure SW ports and HW ports mixed in a single bridge. This does not play well with switchdev devices currently because there's no feedback when the entry is taken over, but this case has never worked anyway and feedback can be easily added when needed. Patch 02 simply allows to use NTF_EXT_LEARNED from user-space, we already have Quagga patches that make use of this functionality. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:30:22 -07:00
Nikolay Aleksandrov	eb100e0e24	net: bridge: allow to add externally learned entries from user-space The NTF_EXT_LEARNED flag was added for switchdev and externally learned entries, but it can also be used for entries learned via a software in user-space which requires dynamic entries that do not expire. One such case that we have is with quagga and evpn which need dynamic entries but also require to age them themselves. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:30:21 -07:00
Nikolay Aleksandrov	7e26bf45e4	net: bridge: allow SW learn to take over HW fdb entries Allow to take over an entry which was previously learned via HW when it shows up from a SW port. This is analogous to how HW takes over SW learned entries already. Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 12:30:21 -07:00
Ido Schimmel	9a32562bec	mlxsw: Remove debugfs interface We don't use it during development and we can't extend it either, so remove it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-23 21:29:32 -07:00
Jacob Keller	584a88709b	i40e: make use of hlist_for_each_entry_continue Replace a complex if->continue->else->break construction in i40e_next_filter. We can simply use hlist_for_each_entry_continue instead. This drops a lot of confusing code. The resulting code is much easier to understand the intention, and follows the more normal pattern for using hlist loops. We could have also used a break with a "return next" at the end of the function, instead of return NULL, but the current implementation is explicitly clear that when you reach the end of the loop you get a NULL value. The alternative construction is less clear since the reader would have to know that next is NULL at the end of the loop. Change-Id: Ife74ca451dd79d7f0d93c672bd42092d324d4a03 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	55877012d5	i40e: document drivers use of ntuple filters Add documentation describing the drivers use of ethtool ntuple filters, including the limitations that it has due to hardware, as well as how it reads and parses the user-def data block. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	f223c8752a	i40e: add support for SCTPv4 FDir filters Enable FDir filters for SCTPv4 packets using the ethtool ntuple interface to enable filters. The ethtool API does not allow masking on the verification tag. Change-Id: I093e88a8143994c7e6f4b7b17a0bd5cf861d18e4 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	0e588de17f	i40e: implement support for flexible word payload Add support for flexible payloads passed via ethtool user-def field. This support is somewhat limited due to hardware design. The input set can only be programmed once per filter type, and the flexible offset is part of this filter input set. This means that the user cannot program both a regular and a flexible filter at the same time for a given flow type. Additionally, the user may not program two flexible filters of the same flow type with different offsets, although they are allowed to configure different values at that offset location. We support a single flexible word (2byte) value per protocol type, and we handle the FLX_PIT register using a list of flexible entries so that each flow type may be configured separately. Due to hardware implementation, the flexible data is offset from the start of the packet payload, and thus may not be in part of the header data. For this reason, the offset provided by the user defined data is interpreted as a byte offset from the start of the matching payload. Previous implementations have tried to represent the offset as from the start of the frame, but this is not feasible because header sizes may change due to options. Change-Id: 36ed27995e97de63f9aea5ade5778ff038d6f811 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	e793095e8a	i40e: add parsing of flexible filter fields from userdef Add code to parse the user-def field into a data structure format. This code is intended to allow future extensions of the user-def field by keeping all code that actually reads and writes the field into a single location. This ensures that we do not litter the driver with references to the user-def field and minimizes the amount of bitwise operations we need to do on the data. Add code which parses the lower 32bits into a flexible word and its offset. This will be used in a future patch to enable flexible filters which can match on some arbitrary data in the packet payload. For now, we just return -EOPNOTSUPP when this is used. Add code to fill in the user-def field when reporting the filter back, even though we don't actually implement any user-def fields yet. Additionally, ensure that we mask the extended FLOW_EXT bit from the flow_type now that we will be accepting filters which have the FLOW_EXT bit set (and thus make use of the user-def field). Change-Id: I238845035c179380a347baa8db8223304f5f6dd7 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	43b15697a3	i40e: partition the ring_cookie to get VF index Do not use the user-def field for determining the VF target. Instead, similar to ixgbe, partition the ring_cookie value into 8bits of VF index, along with 32bits of queue number. This is better than using the user-def field, because it leaves the field open for extension in a future patch which will enable flexible data. Also, this matches with convention used by ixgbe and other drivers. Change-Id: Ie36745186d817216b12f0313b99ec95cb8a9130c Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	9229e99334	i40e: allow changing input set for ntuple filters Add support to detect when we can update the input set for each flow type. Because the hardware only supports a single input set for all flows of that matching type, the driver shall only allow the input set to change if there are no other configured filters for that flow type. Thus, the first filter added for each flow type is allowed to change the input set, and all future filters must match the same input set. Display a diagnostic message whenever the filter input set changes, and a warning whenever a filter cannot be accepted because it does not match the configured input set. Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	3bcee1e653	i40e: restore default input set for each flow type Ensure that the default input set is correctly reprogrammed when cleaning up after disabling flow director support. This ensures that the programmed value will be in a clean state. Although we do not yet have support for SCTPv4 filters, a future patch will add support for this protocol, so we will correctly restore the SCTPv4 input set here as well. Note that strictly speaking the default hardware value for SCTP includes matching the verification tag. However, the ethtool API does not have support for specifying this value, so there is no reason to keep the verification field enabled. This patch is the next step on the way to enabling partial tuple filters which will be implemented in a following patch. Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	36777d9fa2	i40e: check current configured input set when adding ntuple filters Do not assume that hardware has been programmed with the default mask, but instead read the input set registers to determine what is currently programmed. This ensures that all programmed filters match exactly how the hardware will interpret them, avoiding confusion regarding filter behavior. This sets the initial ground-work for allowing custom input sets where some fields are disabled. A future patch will fully implement this feature. Instead of using bitwise negation, we'll just explicitly check for the correct value. The use of htonl and htons are used to silence sparse warnings. The compiler should be able to handle the constant value and avoid actually performing a byteswap. Change-Id: I3d8db46cb28ea0afdaac8c5b31a2bfb90e3a4102 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	faa16e0f38	i40e: correctly honor the mask fields for ETHTOOL_SRXCLSRLINS The current implementation of .set_rxnfc does not properly read the mask field for filter entries. This results in incorrect driver behavior, as we do not reject filters which have masks set to ignore some fields. The current implementation simply assumes that every part of the tuple or "input set" is specified. This results in filters not behaving as expected, and not working correctly. As a first step in supporting some partial filters, add code which checks the mask fields and rejects any filters which do not have an acceptable mask. For now, we just assume that all fields must be set. This will get the driver one step towards allowing some partial filters. At a minimum, the ethtool commands which previously installed filters that would not function will now return a non-zero exit code indicating failure instead. We should now be meeting the minimum requirements of the .set_rxnfc API, by ensuring that all filters we program have a valid mask value for each field. Finally, add code to report the mask correctly so that the ethtool command properly reports the mask to the user. Note that the typecast to (__be16) when checking source and destination port masks is required because the ~ bitwise negation operator does not correctly handle variables other than integer size. Change-Id: Ia020149e07c87aa3fcec7b2283621b887ef0546f Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Davide Caratti	add641e7de	sched: act_csum: don't mangle TCP and UDP GSO packets after act_csum computes the checksum on skbs carrying GSO TCP/UDP packets, subsequent segmentation fails because skb_needs_check(skb, true) returns true. Because of that, skb_warn_bad_offload() is invoked and the following message is displayed: WARNING: CPU: 3 PID: 28 at net/core/dev.c:2553 skb_warn_bad_offload+0xf0/0xfd <...> [<ffffffff8171f486>] skb_warn_bad_offload+0xf0/0xfd [<ffffffff8161304c>] __skb_gso_segment+0xec/0x110 [<ffffffff8161340d>] validate_xmit_skb+0x12d/0x2b0 [<ffffffff816135d2>] validate_xmit_skb_list+0x42/0x70 [<ffffffff8163c560>] sch_direct_xmit+0xd0/0x1b0 [<ffffffff8163c760>] __qdisc_run+0x120/0x270 [<ffffffff81613b3d>] __dev_queue_xmit+0x23d/0x690 [<ffffffff81613fa0>] dev_queue_xmit+0x10/0x20 Since GSO is able to compute checksum on individual segments of such skbs, we can simply skip mangling the packet. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-23 17:40:33 -07:00
Jie Deng	67ff2c71bb	net: dwc-xlgmac: use dual license The driver "dwc-xlgmac" is dual-licensed. Declare the dual license with MODULE_LICENSE(). Signed-off-by: Jie Deng <jiedeng@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-23 17:04:14 -07:00
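
Note on c48367427a (tcp: sysctl: Fix a race to avoid unexpected 0 window from space): the message describes a tunable that is read twice and may change between the two reads. The usual remedy, reading the value once into a local and using only that copy, can be sketched in userspace as follows. This is an illustration, not the kernel patch: `adv_win_scale` and `win_from_space` are hypothetical stand-ins for `sysctl_tcp_adv_win_scale` and `tcp_win_from_space`, the C11 atomic is an assumption of the sketch, and the positive-scale branch is assumed rather than quoted from the commit message.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for sysctl_tcp_adv_win_scale: another thread
 * may store a new value at any time. */
static _Atomic int adv_win_scale = 1;

/* Snapshot the tunable once, then run the sign test and the shift on
 * the same local copy, so the two steps can never disagree. */
static int win_from_space(int space)
{
	int scale = atomic_load_explicit(&adv_win_scale, memory_order_relaxed);

	assert(space >= 0);                 /* keep the right shifts well defined */
	assert(scale > -31 && scale < 31);  /* assumed bound on the shift count */

	/* Positive branch is an assumption of this sketch. */
	return scale <= 0 ? (space >> -scale) : (space - (space >> scale));
}
```

With the tunable set to 2, `win_from_space(1000)` yields 750; with -2 it yields 250. Because the sign test and the shift share one snapshot, a concurrent store can change which branch a later call takes but cannot split a single call across two values.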


663115 Commits
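
Note on 43b15697a3 (i40e: partition the ring_cookie to get VF index): the message describes splitting the 64-bit ring_cookie into 8 bits of VF index and 32 bits of queue number. A minimal sketch of such a split is shown below. The bit positions (queue in bits 0..31, VF index in bits 32..39) and every macro and function name here are assumptions made for illustration; they are not taken from the commit message or from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for this sketch: queue number in bits 0..31,
 * VF index in bits 32..39, all higher bits unused. */
#define COOKIE_RING_MASK 0x00000000FFFFFFFFULL
#define COOKIE_VF_MASK   0x000000FF00000000ULL
#define COOKIE_VF_SHIFT  32

/* A cookie is valid only if no bit outside the two fields is set. */
static int cookie_is_valid(uint64_t cookie)
{
	return (cookie & ~(COOKIE_RING_MASK | COOKIE_VF_MASK)) == 0;
}

/* Combine a VF index and a queue number into one cookie. */
static uint64_t cookie_pack(uint8_t vf, uint32_t queue)
{
	uint64_t cookie = ((uint64_t)vf << COOKIE_VF_SHIFT) | queue;

	assert(cookie_is_valid(cookie));
	return cookie;
}

/* Extract the 32-bit queue number. */
static uint32_t cookie_queue(uint64_t cookie)
{
	return (uint32_t)(cookie & COOKIE_RING_MASK);
}

/* Extract the 8-bit VF index. */
static uint8_t cookie_vf(uint64_t cookie)
{
	return (uint8_t)((cookie & COOKIE_VF_MASK) >> COOKIE_VF_SHIFT);
}
```

For example, `cookie_pack(3, 17)` gives `0x0000000300000011`, from which `cookie_vf` recovers 3 and `cookie_queue` recovers 17. Keeping both fields inside the one cookie is what leaves the user-def field free for other uses, as the commit message notes.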