linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Tariq Toukan	3b56f7b2af	net/mlx5e: Remove unnecessary fields in ICO SQ As of current design, in each NAPI, only a single UMR WQE completion could be available in the completion queue of the internal control operations (ICO) send queue, in addition to nop operations that require no actions upon completion. This renders the consume index obsolete, as the wqe_counter field in CQE is sufficient. This helps removing a memory barrier, and obsoletes the need for tracking the num_wqebbs to update the consumer counter. In addition, remove other unused fields in icosq struct: pdev, dma_fifo_pc, and prev_cc. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-09-03 06:34:09 +03:00
Tariq Toukan	7cc6d77bb5	net/mlx5e: Type-specific optimizations for RX post WQEs function Separate the RX post WQEs function of the different RQ types. This enables RQ type-specific optimizations in data-path. Poll the ICOSQ completion queue only for Striding RQ, and only when a UMR post completion could be possibly available. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-09-03 06:34:09 +03:00
Tariq Toukan	a071cb9f25	net/mlx5e: Non-atomic RQ state indicator for UMR WQE in progress The indication for a UMR WQE in progress is needed only within the NAPI context, and hence no races are possible and there is no need for atomic operations. The only place the flag is read outside of NAPI context is in the closure flow; after the RQ is disabled, the flag is no longer accessed in NAPI. Use a boolean instead of a bit in ring state, so that its non-atomic set operations do not race with the atomic sets of the other bits. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-09-03 06:34:09 +03:00
Tariq Toukan	a1eaba4c5c	net/mlx5e: Non-atomic indicator for ring enabled state Rings enabled state change occurs in control path only, and is always followed by a napi_synchronize(), so that following NAPIs read the new value. This read does not need to be atomic. The RQ auto-moderation bit is not set/cleared in data-path. No need for atomic read, a regular read operation is sufficient. At RQ creation time as well, there are no multiple threads trying to access it yet, hence a regular read can be used. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-09-03 06:34:09 +03:00
Tariq Toukan	604acb193b	net/mlx5e: Refactor data-path lro header function Refactor function mlx5e_lro_update_hdr() to reduce number of branches. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-09-03 06:34:09 +03:00
Tariq Toukan	4b7dfc9925	net/mlx5e: Early-return on empty completion queues NAPI context handles different kinds of completion queues (RX, TX, and others). Hence, upon a poll trial, some of them might be empty. Here we early-return upon empty completion queues, as well as full rx buffer, and save unnecessary logic and memory barriers. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-09-03 06:34:08 +03:00
Tariq Toukan	4cbb755801	net/mlx5e: NAPI busy-poll when UMR post is in progress If a UMR post is in progress, it means that there's a missing WQE in RQ, and that a completion will be shortly available in ICO SQ completion queue. Prefer busy-poll to handle it as soon as possible. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-09-03 06:34:08 +03:00
Tariq Toukan	4c2af5cc2b	net/mlx5e: Small enhancements for RX MPWQE allocation and free The dma offset of a MPWQE (Multi-Packet WQE) in memory region is fixed for all rounds. Calculate it once on creation time, instead of in runtime. This also obsoletes the wqe argument in the function. In addition, optimize dma_info iterator calculation. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-09-03 06:34:08 +03:00
Tariq Toukan	9bafe2adab	net/mlx5e: Use memset to init skbs_frags array to zeros In RX data-path, use memset() instead of loop assignment to init the whole skbs_frags array. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-09-03 06:34:08 +03:00
Tariq Toukan	b681c481f1	net/mlx5e: Remove unnecessary wqe_sz field from RQ buffer Field is used only locally within the RQ create function. The use of a local variable is sufficient. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-09-03 06:34:08 +03:00
Tariq Toukan	89e89f7a9f	net/mlx5e: Replace multiplication by stride size with a shift In RX data-path, use shift operations instead of a regular multiplication by stride size, as it is a power of two. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-09-03 06:34:08 +03:00
Tariq Toukan	b45d8b50b8	net/mlx5e: Reorganize struct mlx5e_rq Bring fast-path fields together, and combine RX WQE mutual exclusive fields into a union. Page-reuse and XDP are mutually exclusive and cannot be used at the same time. Use a union to combine their footprints. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-09-03 06:34:08 +03:00
David S. Miller	6026e043d0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Three cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 17:42:05 -07:00
Ido Schimmel	241bc859f6	mlxsw: spectrum_router: Set abort trap in all virtual routers When the abort mechanism is invoked a default route directing packets to the CPU is programmed in all the virtual routers currently in use. This can result in packet loss in case a new VRF is configured. Upon abort, program the default route in all virtual routers, whether they are in use or not. The patch is directed at net-next since post-abort fixes aren't critical and packet loss due to a missing default route will be insignificant compared to packet loss caused by the CPU port policer. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 10:01:36 -07:00
Ido Schimmel	d3b6d3774f	mlxsw: spectrum_router: Trap packets hitting anycast routes I relied on the fact that anycast routes use the loopback device as their nexthop device to trap packets hitting them to the CPU. After commit `4832c30d54` ("net: ipv6: put host and anycast routes on device with address") this is no longer the case and such routes are programmed with a forward action (note the 'offload' flag): anycast cafe:: dev enp3s0np7 proto kernel metric 0 offload pref medium This will prevent the router from locally receiving packets destined to the Subnet-Router anycast address. Fix this by specifically programming anycast routes with action trap, which results in the following output: anycast cafe:: dev enp3s0np7 proto kernel metric 0 pref medium Fixes: `4832c30d54` ("net: ipv6: put host and anycast routes on device with address") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 10:01:36 -07:00
Ido Schimmel	25cc72a338	mlxsw: spectrum: Forbid linking to devices that have uppers The mlxsw driver relies on NETDEV_CHANGEUPPER events to configure the device in case a port is enslaved to a master netdev such as bridge or bond. Since the driver ignores events unrelated to its ports and their uppers, it's possible to engineer situations in which the device's data path differs from the kernel's. One example to such a situation is when a port is enslaved to a bond that is already enslaved to a bridge. When the bond was enslaved the driver ignored the event - as the bond wasn't one of its uppers - and therefore a bridge port instance isn't created in the device. Until such configurations are supported forbid them by checking that the upper device doesn't have uppers of its own. Fixes: `0d65fc1304` ("mlxsw: spectrum: Implement LAG port join/leave") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Nogah Frankel <nogahf@mellanox.com> Tested-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-01 09:59:41 -07:00
Arkadi Sharshevsky	0fb5fe3c88	mlxsw: spectrum_dpipe: Add support for controlling IPv6 neighbor counters Add support for controlling IPv6 neighbor counters via dpipe. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-31 14:42:19 -07:00
Arkadi Sharshevsky	1ed5574c6d	mlxsw: spectrum_router: Add support for setting counters on IPv6 neighbors Add support for setting counters on IPv6 neighbors based on dpipe's host6 table counter status. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-31 14:42:19 -07:00
Arkadi Sharshevsky	410774bde1	mlxsw: spectrum_dpipe: Add support for IPv6 host table dump Add support for IPv6 host table dump. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-31 14:42:19 -07:00
Arkadi Sharshevsky	6049e5390c	mlxsw: spectrum_dpipe: Make host entry fill handler more generic Change the host entry filler helper to be applicable for both IPv4/6 addresses. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-31 14:42:19 -07:00
Arkadi Sharshevsky	0250768c6c	mlxsw: spectrum_router: Add IPv6 neighbor access helper Add helper for accessing destination IP in case of IPv6 neighbor. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-31 14:42:19 -07:00
Arkadi Sharshevsky	506f7dd56d	mlxsw: spectrum_dpipe: Add IPv6 host table initial support Add IPv6 host table initial support. The action behavior for both IPv4/6 tables is the same, thus the same action dump op is used. Neighbors with link local address are ignored. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-31 14:42:19 -07:00
Arkadi Sharshevsky	1d1056d80b	mlxsw: spectrum_router: Export IPv6 link local address check helper Neighbors with link local addresses are not offloaded to the host table, yet they are maintained in the driver for adjacency table usage. When dumping the IPv6 host neighbors these link local neighbors should be ignored. This patch exports this helper for dpipe usage. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-31 14:42:19 -07:00
Gal Pressman	7b3722fa9e	net/mlx5e: Support RSS for GRE tunneled packets Introduce a new flow table and indirect TIRs which are used to hash the inner packet headers of GRE tunneled packets. When a GRE tunneled packet is received, the TTC flow table will match the new IPv4/6->GRE rules which will forward it to the inner TTC table. The inner TTC is similar to its counterpart outer TTC table, but matching the inner packet headers instead of the outer ones (and does not include the new IPv4/6->GRE rules). The new rules will not add steering hops since they are added to an already existing flow group which will be matched regardless of this patch. Non GRE traffic will not be affected. The inner flow table will forward the packet to inner indirect TIRs which hash the inner packet and thus result in RSS for the tunneled packets. Testing 8 TCP streams bandwidth over GRE: System: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] Before: 21.3 Gbps (Single RQ) Now : 90.5 Gbps (RSS spread on 8 RQs) Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-31 01:54:15 +03:00
Gal Pressman	2729984149	net/mlx5e: Support TSO and TX checksum offloads for GRE tunnels Add TX offloads support for GRE tunneled packets by reporting the needed netdev features. Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-31 01:54:06 +03:00
Gal Pressman	888fcd9cd2	net/mlx5e: Use IP version matching to classify IP traffic This change adds the ability for flow steering to classify IPv4/6 packets with MPLS tag (Ethertype 0x8847 and 0x8848) as standard IP packets and hit IPv4/6 classification steering rules. Since IP packets with MPLS tag header have MPLS ethertype, they missed the IPv4/6 ethertype rule and ended up hitting the default filter forwarding all the packets to the same single RQ (No RSS). Since our device is able to look past the MPLS tag and identify the next protocol we introduce this solution which replaces ethertype matching by the device's capability to perform IP version identification and matching in order to distinguish between IPv4 and IPv6. Therefore, when driver is performing flow steering configuration on the device it will use IP version matching in IP classified rules instead of ethertype matching which will cause relevant MPLS tagged packets to hit this rule as well. If the device doesn't support IP version matching the driver will fall back to use legacy ethertype matching in the steering as before. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-31 01:52:22 +03:00
Tal Gilboa	1213ad28f9	net/mlx5e: Fix CQ moderation mode not set properly cq_period_mode assignment was mistakenly removed so it was always set to "0", which is EQE based moderation, regardless of the device CAPs and requested value in ethtool. Fixes: `6a9764efb2` ("net/mlx5e: Isolate open_channels from priv->params") Signed-off-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-30 21:20:43 +03:00
Moshe Shemesh	6aace17e64	net/mlx5e: Fix inline header size for small packets Fix inline header size, make sure it is not greater than skb len. This bug affects small packets, for example L2 packets with size < 18. Fixes: `ae76715d15` ("net/mlx5e: Check the minimum inline header mode before xmit") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-30 21:20:43 +03:00
Shahar Klein	191220396d	net/mlx5: E-Switch, Unload the representors in the correct order When changing from switchdev to legacy mode, all the representor port devices (uplink nic and reps) are cleaned up. Part of this cleaning process is removing the neigh entries and the hash table containing them. However, a representor neigh entry might be linked to the uplink port hash table and if the uplink nic is cleaned first the cleaning of the representor will end up in null deref. Fix that by unloading the representors in the opposite order of load. Fixes: `cb67b83292` ("net/mlx5e: Introduce SRIOV VF representors") Signed-off-by: Shahar Klein <shahark@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-30 21:20:43 +03:00
Paul Blakey	08820528c9	net/mlx5e: Properly resolve TC offloaded ipv6 vxlan tunnel source address Currently if vxlan tunnel ipv6 src isn't supplied the driver fails to resolve it as part of the route lookup. The resulting encap header is left with a zeroed out ipv6 src address so the packets are sent with this src ip. Use an appropriate route lookup API that also resolves the source ipv6 address if it's not supplied. Fixes: `ce99f6b97f` ('net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels') Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-30 21:20:43 +03:00
Inbar Karmy	5a8e12678c	net/mlx5e: Don't override user RSS upon set channels Currently, increasing the number of combined channels changes the RSS spread to use the newly created channels. Prevent the RSS spread change in case the user explicitly declared it, to avoid overriding user configuration. Tested: when RSS default: # ethtool -L ens8 combined 4 RSS spread will change and point to 4 channels. # ethtool -X ens8 equal 4 # ethtool -L ens8 combined 6 RSS will not change after increasing the number of the channels. Fixes: `8bf3686204` ('ethtool: ensure channel counts are within bounds during SCHANNELS') Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-30 21:20:43 +03:00
Eran Ben Elisha	0556ce72ab	net/mlx5e: Fix dangling page pointer on DMA mapping error Function mlx5e_dealloc_rx_wqe is using page pointer value as an indication to valid DMA mapping. In case that the mapping failed, we released the page but kept the dangling pointer. Store the page pointer only after the DMA mapping passed to avoid invalid page DMA unmap. Fixes: `bc77b240b3` ("net/mlx5e: Add fragmented memory support for RX multi packet WQE") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-30 21:20:43 +03:00
Huy Nguyen	10a8d00707	net/mlx5: Remove the flag MLX5_INTERFACE_STATE_SHUTDOWN MLX5_INTERFACE_STATE_SHUTDOWN is not used in the code. Fixes: `5fc7197d3a` ("net/mlx5: Add pci shutdown callback") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-30 21:20:43 +03:00
Huy Nguyen	b3cb538849	net/mlx5: Skip mlx5_unload_one if mlx5_load_one fails There is an issue where the firmware fails during mlx5_load_one, the health_care timer detects the issue and schedules a health_care call. Then the mlx5_load_one detects the issue, cleans up and quits. Then the health_care starts and calls mlx5_unload_one to clean up the resources that no longer exist and causes kernel panic. The root cause is that the bit MLX5_INTERFACE_STATE_DOWN is not set after mlx5_load_one fails. The solution is removing the bit MLX5_INTERFACE_STATE_DOWN and quit mlx5_unload_one if the bit MLX5_INTERFACE_STATE_UP is not set. The bit MLX5_INTERFACE_STATE_DOWN is redundant and we can use MLX5_INTERFACE_STATE_UP instead. Fixes: `5fc7197d3a` ("net/mlx5: Add pci shutdown callback") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-30 21:20:43 +03:00
Noa Osherovich	672d0880b7	net/mlx5: Fix arm SRQ command for ISSI version 0 Support for ISSI version 0 was recently broken as the arm_srq_cmd command, which is used only for ISSI version 0, was given the opcode for ISSI version 1 instead of ISSI version 0. Change arm_srq_cmd to use the correct command opcode for ISSI version 0. Fixes: `af1ba291c5` ('{net, IB}/mlx5: Refactor internal SRQ API') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-30 21:20:42 +03:00
Huy Nguyen	9e10bf1d34	net/mlx5e: Fix DCB_CAP_ATTR_DCBX capability for DCBNL getcap. Current code doesn't report DCB_CAP_DCBX_HOST capability when query through getcap. User space lldptool expects capability to have HOST mode set when it wants to configure DCBX CEE mode. In absence of HOST mode capability, lldptool fails to switch to CEE mode. This fix returns DCB_CAP_DCBX_HOST capability when port's DCBX controlled mode is under software control. Fixes: `3a6a931dfb` ("net/mlx5e: Support DCBX CEE API") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-30 21:20:42 +03:00
Huy Nguyen	33c52b6718	net/mlx5e: Check for qos capability in dcbnl_initialize qos capability is the master capability bit that determines if the DCBX is supported for the PCI function. If this bit is off, driver cannot run any dcbx code. Fixes: `e207b7e991` ("net/mlx5e: ConnectX-4 firmware support for DCBX") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-30 21:20:42 +03:00
Moshe Shemesh	be59960395	net/mlx4: Add user mac FW update support Adding support for updating the FW on new port mac, when port mac change is requested by the user. This info is required by the FW as OEM management tools require this info directly from the NIC FW. Check device capability bit to verify the FW supports user mac. If the FW does support it, use set_port command to notify the FW on the new mac. The feature is relevant only to PF port mac. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 14:58:32 -07:00
Tariq Toukan	a434f1fd2c	net/mlx4_core: Fix misplaced brackets of sizeof When changing the sizeof style usage in the patch cited below, one brackets misplacement was introduced. Here we fix it. Fixes: `31975e27a4` ("mlx4: sizeof style usage") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 14:58:32 -07:00
Leon Romanovsky	187782eb58	net/mlx4_core: Make explicit conversion to 64bit value The "lg" variable is declared as int so in all places where this variable is used as a shift operand, the output will be int too. This produces the following smatch warning: drivers/net/ethernet/mellanox/mlx4/fw.c:1532 mlx4_map_cmd() warn: should '1 << lg' be a 64 bit type? Simple declaration of "1" to be "1ULL" will fix the issue. Fixes: `225c7b1fee` ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 14:58:32 -07:00
Eran Ben Elisha	c73c8b1e47	net/mlx4_core: Dynamically allocate structs at mlx4_slave_cap In order to avoid temporary large structs on the stack, allocate them dynamically. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tal Alon <talal@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 14:58:32 -07:00
Artemy Kovalyov	5b3ec3fcb6	net/mlx5: Add XRQ support Add support for the new XRQ (eXtended shared Receive Queue) hardware object. It supports SRQ semantics with addition of extended receive buffers topologies and offloads. Currently supports tag matching topology and rendezvous offload. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yossi Itigin <yosefe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-29 08:30:20 -04:00
Arkadi Sharshevsky	18fed7e15d	mlxsw: spectrum_dpipe: Fix host table dump During the neighbor traversal the neighbors from different families should be ignored. Fixes: c58035a74aba ("mlxsw: spectrum_dpipe: Add support for IPv4 host table dump") Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-28 15:41:15 -07:00
Jiri Pirko	10bfec0a2b	mlxsw: spectrum: compile-in dpipe support only if devlink is enabled Makes no sense to have dpipe compiled in when devlink is not enabled, because the devlink dpipe registration is a no-op function. So don't compile it in. This also fixes missing extern structs errors. Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: `a86f030915` ("mlxsw: spectrum_dpipe: Add support for IPv4 host table dump") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-28 15:41:15 -07:00
David S. Miller	0cf3f4c37d	Merge tag 'mlx5-updates-2017-08-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-08-24 This series includes updates to mlx5 core driver. From Gal and Saeed, three cleanup patches. From Matan, Low level flow steering improvements and optimizations, - Use more efficient data structures for flow steering objects handling. - Add tracepoints to flow steering operations. - Overall these patches improve flow steering rule insertion rate by a factor of seven in large scales (~50K rules or more). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 21:49:56 -07:00
Bhumika Goyal	39a7e58924	net/mlx5e: make mlx5e_profile const Make this const as it is only passed as an argument to the function mlx5e_create_netdev and the corresponding argument is of type const. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 12:33:31 -07:00
Bhumika Goyal	3f2c5fb2d8	net/mlx4_core: make mlx4_profile const Make these const as they are only used in a copy operation. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 12:33:31 -07:00
Arkadi Sharshevsky	a481d71323	mlxsw: spectrum_dpipe: Add support for controlling neighbor counters Add support for controlling neighbor counters via dpipe. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 09:33:16 -07:00
Arkadi Sharshevsky	a86f030915	mlxsw: spectrum_dpipe: Add support for IPv4 host table dump Add support for IPv4 host table dump. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 09:33:16 -07:00
Arkadi Sharshevsky	7cfcbc7591	mlxsw: spectrum_router: Add support for setting counters on neighbors Add support for setting counters on neighbors based on dpipe's host table counter status. This patch also adds the ability for getting the counter value, which will be used by the dpipe host table implementation in the next patches. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 09:33:16 -07:00
Arkadi Sharshevsky	6bba7e20da	mlxsw: reg: Make flow counter set type enum to be shared This is done as a preparation before introducing support for neighbor counters. The flow counter's type enum is used by many registers, yet, until now it was used only by mgpc and thus it was private. This patch updates the namespace for more generic usage. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 09:33:16 -07:00
Arkadi Sharshevsky	6aecb36bc0	mlxsw: spectrum_dpipe: Add IPv4 host table initial support Add IPv4 host table initial support. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 09:33:16 -07:00
Arkadi Sharshevsky	7e57ae9fc5	mlxsw: spectrum_dpipe: Fix label name Change label name for case of erif table init failure. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 09:33:16 -07:00
Arkadi Sharshevsky	f17cc84d1c	mlxsw: spectrum_router: Add helpers for neighbor access This is done as a preparation before introducing the ability to dump the host table via dpipe, and to count the table size. The mlxsw's neighbor representative struct stays private to the router module. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 09:33:16 -07:00
Arkadi Sharshevsky	3580732448	devlink: Move dpipe entry clear function into devlink The entry clear routine can be shared between the drivers, thus it is moved inside devlink. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 09:33:16 -07:00
Arkadi Sharshevsky	ffd3cdccf2	devlink: Add support for dynamic table size Up until now the dpipe table's size was static and known at registration time. The host table does not have constant size and it is resized in dynamic manner. In order to support this behavior the size is changed to be obtained dynamically via an op. This patch also adjusts the current dpipe table for the new API. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 09:33:16 -07:00
Arkadi Sharshevsky	23ca5ec3af	mlxsw: spectrum_dpipe: Fix erif table op name space Fix ERIF's table operations name space. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 09:33:16 -07:00
Matan Barak	4c03e69ab1	net/mlx5: Add tracepoints Add a tracepoint infrastructure for mlx5_core driver. Implemented flow steering tracepoints: 1. Add flow group 2. Remove flow group 3. Add flow table entry 4. Remove flow table entry 5. Add flow table rule 6. Remove flow table rule Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-24 16:02:58 +03:00
Matan Barak	693c6883bb	net/mlx5: Add hash table for flow groups in flow table When adding a flow table entry (fte) to a flow table (ft), we first need to find its flow group (fg). Currently, this is done by traversing a linear list of all flow groups in the flow table. Furthermore, since multiple flow groups which correspond to the same fte mask may exist in the same ft, we can't just stop at the first match. Convert the linear list to rhltable in order to speed things up. The last four patches increase the steering rules update rate by a factor of more than 7 (for insertion of 50K steering rules). Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-24 16:02:58 +03:00
Matan Barak	0d235c3fab	net/mlx5: Add hash table to search FTEs in a flow-group When adding a flow table entry (fte) to a flow group (fg), we first need to check whether this fte exists. In such a case we just merge the destinations (if possible). Currently, this is done by traversing the fte list available in a fg. This could take a lot of time when using large flow groups. Speed this up by using rhashtable, which is much faster. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-24 16:02:58 +03:00
Matan Barak	667cb65ae5	net/mlx5: Don't store reserved part in FTEs and FGs The current code stores fte_match_param in the software representation of FTEs and FGs. fte_match_param contains a large reserved area at the bottom of the struct. Since downstream patches are going to hash this part, we would like to avoid doing so on a reserved part. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-24 16:02:58 +03:00
Matan Barak	8ebabaa02f	net/mlx5: Convert linear search for free index to ida When allocating a flow table entry, we need to allocate a free index in the flow group. Currently, this is done by traversing the existing flow table entries in the flow group, until a free index is found. Replace this with an ida, which allows us to find a free index much faster. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-24 16:02:58 +03:00
Gal Pressman	0980030550	net/mlx5e: Fix wrong code indentation in conditional statement Fix the following checkpatch warning in en_ethtool.c: WARNING: suspect code indent for conditional statements (8, 9) + for (i = 0; i < NUM_PCIE_PERF_STALL_COUNTERS(priv); i++) + strcpy(data + (idx++) * ETH_GSTRING_LEN, Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-24 16:02:58 +03:00
Saeed Mahameed	3c0045837d	net/mlx5: Add a blank line after declarations V2 The blank line should be after u32 val = ... and not after __be32 __iomem *addr = ... Fixes: `ad5b39a95c` ("net/mlx5: Add a blank line after declarations") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reported-by: Joe Perches <joe@perches.com>	2017-08-24 16:02:57 +03:00
Jiri Pirko	0ede6ba2a1	mlxsw: spectrum_flower: Offload goto_chain termination action If action is gact goto_chain, offload it to HW by jumping to another ruleset. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-23 20:44:32 -07:00
Jiri Pirko	dbec8ee95a	mlxsw: spectrum_acl: Provide helper to lookup ruleset We need to lookup ruleset in order to offload goto_chain termination action. This patch adds it. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-23 20:44:32 -07:00
Jiri Pirko	0ade3b6457	mlxsw: spectrum_acl: Allow to get group_id value for a ruleset For goto_chain action we need to know group_id of a ruleset to jump to. Provide infrastructure in order to get it. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-23 20:44:32 -07:00
Jiri Pirko	45b62742df	mlxsw: spectrum: Offload multichain TC rules Reflect chain index coming down from TC core and create a ruleset per chain. Note that only chain 0, being the implicit chain, is bound to the device for processing. The rest of chains have to be "jumped-to" by actions. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-23 20:44:32 -07:00
Nogah Frankel	4eb6a3bdb4	mlxsw: spectrum_switchdev: Fix mrouter flag update Update the value of the mrouter flag in struct mlxsw_sp_bridge_port when it is being changed. Fixes: `c57529e1d5` ("mlxsw: spectrum: Replace vPorts with Port-VLAN") Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-22 14:22:54 -07:00
Romain Perier	18c90df9f2	mlx5: Replace PCI pool old API The PCI pool API is deprecated. This commit replaces the PCI pool old API by the appropriate function with the DMA pool API. Signed-off-by: Romain Perier <romain.perier@collabora.com> Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com> Acked-by: Doug Ledford <dledford@redhat.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-22 13:59:46 -04:00
Romain Perier	b9f761aa78	mlx4: Replace PCI pool old API The PCI pool API is deprecated. This commit replaces the PCI pool old API by the appropriate function with the DMA pool API. Signed-off-by: Romain Perier <romain.perier@collabora.com> Acked-by: Peter Senna Tschudin <peter.senna@collabora.com> Tested-by: Peter Senna Tschudin <peter.senna@collabora.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Doug Ledford <dledford@redhat.com> Tested-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-22 13:59:46 -04:00
David S. Miller	e2a7c34fb2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-08-21 17:06:42 -07:00
Gal Pressman	9da5106c56	net/mlx5e: Use size_t to store byte offset in statistics descriptors The byte offset of counter descriptors should be stored in size_t variable instead of an integer. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:20 +03:00
Gal Pressman	c045deef64	net/mlx5e: Use kernel types instead of uint*_t in ethtool callbacks Fix checkpatch errors: CHECK:PREFER_KERNEL_TYPES: Prefer kernel type 'u32' over 'uint32_t' Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:19 +03:00
Or Gerlitz	1afdb7718f	net/mlx5e: Place constants on the right side of comparisons To fix these checkpatch complaints: WARNING: Comparisons should place the constant on the right side of the test Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:19 +03:00
Or Gerlitz	12148f5ab0	net/mlx5e: Avoid using multiple blank lines To fix these checkpatch complaints: CHECK: Please don't use multiple blank lines Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:19 +03:00
Or Gerlitz	61bf212565	net/mlx5e: Properly indent within conditional statements To fix these checkpatch complaints: WARNING: suspect code indent for conditional statements (8, 24) + if (eth_proto & (MLX5E_PROT_MASK(MLX5E_10GBASE_SR) [...] + return PORT_FIBRE; Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:19 +03:00
Or Gerlitz	ad5b39a95c	net/mlx5: Add a blank line after declarations To fix these checkpatch complaints: WARNING: Missing a blank line after declarations Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:19 +03:00
Or Gerlitz	733d6c5149	net/mlx5: Avoid blank lines after/before open/close brace To fix these checkpatch complaints: CHECK: Blank lines aren't necessary after an open brace '{' CHECK: Blank lines aren't necessary before a close brace '}' Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:19 +03:00
Eran Ben Elisha	efae7f78c4	net/mlx5e: Add outbound PCI buffer overflow counter Add outbound_pci_buffer_overflow to ethtool output for monitoring the number of packets that were dropped due to lack of PCIe buffers on receive path from NIC port toward the host(s). This counter is valid only in case that tx_overflow_buffer_pkt is supported in MCAM enhanced features. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:19 +03:00
Gal Pressman	068aef33be	net/mlx5e: Add RX buffer fullness counters rx_buffer_passed_thres_phy - The number of events where the port RX buffer has passed a fullness threshold. rx_buffer_full_phy - The number of events where the port RX buffer has reached 100% fullness. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:19 +03:00
Gal Pressman	73e90646a2	net/mlx5e: Add PCIe outbound stalls counters outbound_pci_stalled_rd - The percentage of time within the last second that the NIC had outbound non-posted read requests but could not perform the operation due to insufficient non-posted credits. outbound_pci_stalled_wr - The percentage of time within the last second that the NIC had outbound posted writes requests but could not perform the operation due to insufficient posted credits. outbound_pci_stalled_rd_events - The number of events where outbound_pci_stalled_rd was above the threshold. outbound_pci_stalled_wr_events - The number of events where outbound_pci_stalled_wr was above the threshold. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:19 +03:00
Shalom Lagziel	eb234ee9d5	net/mlx5e: IPoIB, Add support for get_link_ksettings in ethtool Add support for "ethtool DEVNAME" over ipoib ports, Display standard port information for IPoIB netdevices using ethtool For example: $ ethtool ib2 > Settings for ib2: Supported ports: [ ] Supported link modes: Not reported Supported pause frame use: No Supports auto-negotiation: No Advertised link modes: Not reported Advertised pause frame use: No Advertised auto-negotiation: No Speed: 100000Mb/s Duplex: Full Port: Other PHYAD: 0 Transceiver: internal Auto-negotiation: off Link detected: yes Signed-off-by: Shalom Lagziel <shaloml@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:19 +03:00
Feras Daoud	1d1c343611	net/mlx5e: IPoIB, Fix driver name retrieved by ethtool Printing enhanced IPoIB device information using "ethtool -i DEVNAME" prints the low-level driver name: mlx5_core. This commit changes the name to mlx5_core [ib_ipoib], to include the ipoib device driver information. Fixes: `076b0936e5` ("net/mlx5e: IPoIB, Add ethtool support") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:19 +03:00
Eran Ben Elisha	63bfd399de	net/mlx5e: Send PAOS command on interface up/down Upon interface up/down, driver will send PAOS (Ports Administrative and Operational Status Register) in order to inform the Firmware on the desired status of the port by the driver. Since now we might change physical link status on mlx5e_open/close, logical VF representor should not use mlx5e_open/close ndos as is, and should call the logical version mlx5e_open/closed_locked. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-20 12:57:19 +03:00
Colin Ian King	1547f538c1	mlx5: ensure 0 is returned when vport is zero Currently, if vport is zero then an uninitialized return status in err is returned. Since the only return status at the end of the function esw_add_uc_addr is zero for the current set of return paths we may as well just return 0 rather than err to fix this issue. Detected by CoverityScan, CID#1452698 ("Uninitialized scalar variable") Fixes: `eeb66cdb68` ("net/mlx5: Separate between E-Switch and MPFS") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-18 16:28:40 -07:00
Huy Nguyen	ca3d89a3eb	net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled enable_4k_uar module parameter was added in patch cited below to address the backward compatibility issue in SRIOV when the VM has system's PAGE_SIZE uar implementation and the Hypervisor has 4k uar implementation. The above compatibility issue does not exist in the non SRIOV case. In this patch, we always enable 4k uar implementation if SRIOV is not enabled on mlx4's supported cards. Fixes: `76e39ccf9c` ("net/mlx4_core: Fix backward compatibility on VFs") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-18 16:15:37 -07:00
Chris Mi	7f3b39dafc	net/sched: Fix the logic error to decide the ingress qdisc The offending commit used a newly added helper function. But the logic is wrong. Without this fix, the affected NICs can't do HW offload. Error -EOPNOTSUPP will be returned directly. Fixes: `a2e8da9378` ("net/sched: use newly added classid identity helpers") Signed-off-by: Chris Mi <chrism@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-18 10:29:04 -07:00
Colin Ian King	ba5c4dac03	net/mlx4: fix spelling mistake: "availible" -> "available" Trivial fix to spelling mistakes in the mlx4 driver Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-16 14:23:45 -07:00
stephen hemminger	31975e27a4	mlx4: sizeof style usage The kernel coding style is to treat sizeof as a function (i.e. with parentheses), not as an operator. Also use kcalloc and kmalloc_array. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-16 11:01:57 -07:00
Ido Schimmel	df9a21f11f	mlxsw: spectrum_router: Use correct config option I made an embarrassing mistake and used 'IPV6' instead of 'CONFIG_IPV6' around the function that updates the kernel about IPv6 neighbours activity. This can be a problem if the kernel has more neighbours than a certain threshold and it starts deleting those that are supposedly inactive. Fixes: `b5f3e0d430` ("mlxsw: spectrum_router: Fix build when IPv6 isn't enabled") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-15 17:05:03 -07:00
Ido Schimmel	fe40079995	ipv6: fib: Provide offload indication using nexthop flags IPv6 routes currently lack nexthop flags as in IPv4. This has several implications. In the forwarding path, it requires us to check the carrier state of the nexthop device and potentially ignore a linkdown route, instead of checking for RTNH_F_LINKDOWN. It also requires capable drivers to use the user facing IPv6-specific route flags to provide offload indication, instead of using the nexthop flags as in IPv4. Add nexthop flags to IPv6 routes in the 40 bytes hole and use it to provide offload indication instead of the RTF_OFFLOAD flag, which is removed while it's still not part of any official kernel release. In the near future we would like to use the field for the RTNH_F_{LINKDOWN,DEAD} flags, but this change is more involved and might not be ready in time for the current cycle. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-15 17:05:03 -07:00
Zhu Yanjun	26d159482a	mlx5: remove unnecessary pci_set_drvdata() The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not necessary to manually clear the device driver data to NULL. Cc: Joe Jin <joe.jin@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-15 16:46:48 -07:00
Zhu Yanjun	e084a8b89c	mlx4: remove unnecessary pci_set_drvdata() The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not necessary to manually clear the device driver data to NULL. Cc: Joe Jin <joe.jin@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-15 16:46:48 -07:00
Arkadi Sharshevsky	e6f3b379c0	mlxsw: spectrum_router: Add support for nexthop group consolidation for IPv6 Due to limited ASIC resources the maximum number of routes is limited by the nexthop resource. In order to improve the routing scale nexthop consolidation should be performed. This patch adds support for IPv6 neighbor consolidation. The hash value is calculated based on the nexthop set, by performing bitwise xor on the ifindexs of the nexthops, in a similar way to IPv4's kernel implementation. In case of collision a full match is performed between the sets which include address and ifindex comparison. Non gateway nexthop groups are not inserted to the hash table due to lack of nexthop device (ifindex). Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-14 22:23:32 -07:00
Arkadi Sharshevsky	ba31d36669	mlxsw: spectrum_router: Prepare nexthop group's hash table for IPv6 This patch does preparation before introducing IPv6 nexthop group consolidation. Currently the nexthop group hash table is used only by IPv4 and uses fixed key size. In order to support the IPv6's variable length key the current table is changed. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-14 22:23:32 -07:00
Ohad Oz	a656d34a6e	Change Kconfig description This patch apply Mellanox network vendor which includes: - Mellanox card devices: ConnectX-4, ConnectX-5 and Connect-IB cards. - Mellanox switch device: SwitchX-2 Switch-IB, Spectrum. Therefore rephrasing help. Signed-off-by: Ohad Oz <ohado@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-14 11:18:16 -07:00
Ohad Oz	8122e08b1d	Allow Mellanox switch devices to be configured if only I2C bus is set Mellanox switches (mlxsw) support I2C systems without PCI. To let users take advantage of this functionality, Kconfig needs to be updated. Signed-off-by: Ohad Oz <ohado@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-14 11:18:16 -07:00
Ido Schimmel	fc922bb0dd	mlxsw: spectrum_router: Use one LPM tree for all virtual routers The number of LPM trees available for lookup is much smaller than the number of virtual routers, which are used to implement VRFs. In addition, an LPM tree can only be used by one protocol - either IPv4 or IPv6. Therefore, in order to increase the number of supported virtual routers to the maximum we need to be able to share LPM trees across virtual routers instead of trying to find an optimized tree for each. Do that by allocating one LPM tree for each protocol, but make sure it will only include prefixes that are actually used, so as to not perform unnecessary lookups. Since changing the structure of a bound tree isn't recommended, whenever a new tree is required, it's first created and then bound to each virtual router, replacing the old one. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-14 11:14:03 -07:00
Ido Schimmel	0adb214ba2	mlxsw: spectrum_router: Pass argument explicitly Instead of relying on the LPM tree to be assigned to the virtual router before binding the two, let's pass it explicitly. This will later allow us to return upon binding error instead of having to perform a rollback of the assignment. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-14 11:14:03 -07:00
Ido Schimmel	cc70267008	mlxsw: spectrum_router: Return void from deletion functions There is no point in returning a value from function whose return value is never checked. Even if the return value was checked, there wouldn't be anything to do about it, as these functions are either called from error or deletion paths. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-14 11:14:03 -07:00
Bhumika Goyal	159fe88efd	mlxsw: make mlxsw_config_profile const Make these structures const as they are only stored in the profile field of a mlxsw_driver structure, which is of type const. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-11 14:57:04 -07:00
Jiri Pirko	a2e8da9378	net: sched: use newly added classid identity helpers Instead of checking handle, which does not have the inner class information and drivers wrongly assume clsact->egress as ingress, use the newly introduced classid identification helpers. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-11 13:47:01 -07:00
Doug Ledford	320438301b	Merge branches '32bit_lid' and 'irq_affinity' into k.o/merge-test Conflicts: drivers/infiniband/hw/mlx5/main.c - Both add new code include/rdma/ib_verbs.h - Both add new code Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-10 14:31:29 -04:00
David S. Miller	3118e6e19d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The UDP offload conflict is dealt with by simply taking what is in net-next where we have removed all of the UFO handling code entirely. The TCP conflict was a case of local variables in a function being removed from both net and net-next. In netvsc we had an assignment right next to where a missing set of u64 stats sync object inits were added. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-09 16:28:45 -07:00
Davide Caratti	e718fe450e	net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets if the NIC fails to validate the checksum on TCP/UDP, and validation of IP checksum is successful, the driver subtracts the pseudo-header checksum from the value obtained by the hardware and sets CHECKSUM_COMPLETE. Don't do that if protocol is IPPROTO_SCTP, otherwise CRC32c validation fails. V2: don't test MLX4_CQE_STATUS_IPV6 if MLX4_CQE_STATUS_IPV4 is set Reported-by: Shuang Li <shuali@redhat.com> Fixes: `f8c6455bb0` ("net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-08 17:59:57 -07:00
Sagi Grimberg	a435393aca	mlx5: move affinity hints assignments to generic code generic api takes care of spreading affinity similar to what mlx5 open coded (and even handles better asymmetric configurations). Ask the generic API to spread affinity for us, and feed him pre_vectors that do not participate in affinity settings (which is an improvement to what we had before). The affinity assignments should match what mlx5 tried to do earlier but now we do not set affinity to async, cmd and pages dedicated vectors. Also, remove mlx5e_get_cpu and introduce mlx5e_get_node (used for allocation purposes) and mlx5_get_vector_affinity (for indirection table construction) as they provide the needed information. Luckily, we have generic helpers to get cpumask and node given a irq vector. mlx5_get_vector_affinity will be used by mlx5_ib in a subsequent patch. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-08 14:55:56 -04:00
Sagi Grimberg	a85e5474f4	mlx5e: don't assume anything on the irq affinity mappings of the device mlx5e currently assumes that the first irq vectors are spread across the device's home node cpus. With the new generic affinity mappings this is no longer the case, hence mlx5e should not rely on this anymore. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-08 14:53:05 -04:00
Sagi Grimberg	78249c4215	mlx5: convert to generic pci_alloc_irq_vectors Now that we have generic code to allocate an array of irq vectors and even correctly spread their affinity, correctly handle cpu hotplug events and more, we're much better off using it. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-08 14:53:05 -04:00
David S. Miller	fde6af4729	mlx5-shared-2017-08-07 This series includes some mlx5 updates for both net-next and rdma trees. From Saeed, Core driver updates to allow selectively building the driver with or without some large driver components, such as - E-Switch (Ethernet SRIOV support). - Multi-Physical Function Switch (MPFs) support. For that we split E-Switch and MPFs functionalities into separate files. From Erez, Delay mlx5_core events when mlx5 interfaces, namely mlx5_ib, registration is taking place and until it completes. From Rabie, Increase the maximum supported flow counters. Merge tag 'mlx5-shared-2017-08-07' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-shared-2017-08-07 This series includes some mlx5 updates for both net-next and rdma trees. From Saeed, Core driver updates to allow selectively building the driver with or without some large driver components, such as - E-Switch (Ethernet SRIOV support). - Multi-Physical Function Switch (MPFs) support. For that we split E-Switch and MPFs functionalities into separate files. From Erez, Delay mlx5_core events when mlx5 interfaces, namely mlx5_ib, registration is taking place and until it completes. From Rabie, Increase the maximum supported flow counters. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 10:42:09 -07:00
Jiri Pirko	de4784ca03	net: sched: get rid of struct tc_to_netdev Get rid of struct tc_to_netdev which is now just unnecessary container and rather pass per-type structures down to drivers directly. Along with that, consolidate the naming of per-type structure variables in cls_*. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:37 -07:00
Jiri Pirko	38cf0426e5	net: sched: change return value of ndo_setup_tc for driver supporting mqprio only Change the return value from -EINVAL to -EOPNOTSUPP. The rest of the drivers have it like that, so be aligned. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:37 -07:00
Jiri Pirko	d7c1c8d2e5	net: sched: move prio into cls_common prio is not cls_flower specific, but it is meaningful for all classifiers. Seems that only mlxsw cares about the value. Obviously, cls offload in other drivers is broken. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:37 -07:00
Jiri Pirko	5fd9fc4e20	net: sched: push cls related args into cls_common structure As ndo_setup_tc is a generic offload op for the whole tc subsystem, it does not really make sense to have cls-specific args. So move them under the cls_common structure which is embedded in all cls structs. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:37 -07:00
Jiri Pirko	9cbf14ede2	mlxsw: spectrum: rename cls arg in matchall processing To sync-up with the naming in the rest of the driver, rename the cls arg. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:36 -07:00
Jiri Pirko	fd33f1dfed	mlxsw: spectrum: push cls_flower and cls_matchall setup_tc processing into separate functions Let mlxsw_sp_setup_tc be a splitter for specific setup_tc types and push out cls_flower and cls_matchall specific codes into separate functions. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:36 -07:00
Jiri Pirko	8c818c27f3	mlx5e_rep: push cls_flower setup_tc processing into a separate function Let mlx5e_rep_setup_tc (former mlx5e_rep_ndo_setup_tc) be a splitter for specific setup_tc types and push out cls_flower specific code into a separate function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:36 -07:00
Jiri Pirko	0cf0f6d3d3	mlx5e: push cls_flower and mqprio setup_tc processing into separate functions Let mlx5e_setup_tc (former mlx5e_ndo_setup_tc) be a splitter for specific setup_tc types and push out cls_flower and mqprio specific codes into separate functions. Also change the return values so they are the same as in the rest of the drivers. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:36 -07:00
Jiri Pirko	3e0e826643	net: sched: make egress_dev flag part of flower offload struct Since this is specific to flower now, make it part of the flower offload struct. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:35 -07:00
Jiri Pirko	ade9b65884	net: sched: rename TC_SETUP_MATCHALL to TC_SETUP_CLSMATCHALL In order to be aligned with the rest of the types, rename TC_SETUP_MATCHALL to TC_SETUP_CLSMATCHALL. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:35 -07:00
Jiri Pirko	2572ac53c4	net: sched: make type an argument for ndo_setup_tc Since the type is always present, push it to be a separate argument to ndo_setup_tc. On the way, name the type enum and use it for arg type. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 09:42:35 -07:00
Rabie Loulou	a8ffcc741a	net/mlx5: Increase the maximum flow counters supported Read new NIC capability field which represents 16 MSBs of the max flow counters number supported (max_flow_counter_31_16). Backward compatibility with older firmware is preserved, the modified driver reads max_flow_counter_31_16 as 0 from the older firmware and uses up to 64K counters. Changed flow counter id from 16 bits to 32 bits. Backward compatibility with older firmware is preserved as we kept the 16 LSBs of the counter id in place and added 16 MSBs from reserved field. Changed the background bulk reading of flow counters to work in chunks of at most 32K counters, to make sure we don't attempt to allocate very large buffers. Signed-off-by: Rabie Loulou <rabiel@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-07 10:47:07 +03:00
Erez Shitrit	97834eba7c	net/mlx5: Delay events till ib registration ends When mlx5_ib registers itself to mlx5_core as an interface, it will call mlx5_add_device which will call mlx5_ib interface add callback, in case the latter successfully returns, only then mlx5_core will add it to the interface list and async events will be forwarded to mlx5_ib. Between mlx5_ib interface add callback and mlx5_core adding the mlx5_ib interface to its devices list, arriving mlx5_core events can be missed by the new mlx5_ib registering interface. In other words: thread 1: mlx5_ib: mlx5_register_interface(dev) thread 1: mlx5_core: mlx5_add_device(dev) thread 1: mlx5_core: ctx = dev->add => (mlx5_ib)->mlx5_ib_add thread 2: mlx5_core_event: *new event arrives, forward to dev_list thread 1: mlx5_core: add_ctx_to_dev_list(ctx) /* previous event was missed by the new interface. */ It is ok to miss events before dev->add (mlx5_ib)->mlx5_ib_add_device but not after. We fix this race by accumulating the events that come between the ib_register_device (inside mlx5_add_device->(dev->add)) till the adding to the list completes and fire them to the new registering interface after that. Fixes: `f1ee87fe55` ("net/mlx5: Organize device list API in one place") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-07 10:47:07 +03:00
Saeed Mahameed	e80541ecab	net/mlx5: Add CONFIG_MLX5_ESWITCH Kconfig Allow to selectively build the driver with or without sriov eswitch, VF representors and TC offloads. Also remove the need of two ndo ops structures (sriov & basic) and keep only one unified ndo ops, compile out VF SRIOV ndos when not needed (MLX5_ESWITCH=n), and for VF netdev calling those ndos will result in returning -EPERM. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Cc: Jes Sorensen <jsorensen@fb.com> Cc: kernel-team@fb.com	2017-08-07 10:47:06 +03:00
Saeed Mahameed	eeb66cdb68	net/mlx5: Separate between E-Switch and MPFS Multi-Physical Function Switch (MPFS) is required when multi-PF configuration is enabled, to allow passing user configured unicast MAC addresses to the requesting PF. Before this patch eswitch.c used to manage the HW MPFS L2 table: E-Switch always (regardless of sriov) enabled vport(0) (NIC PF) vport's contexts update on unicast mac address list changes, to populate the PF's MPFS L2 table accordingly. In a downstream patch we would like to allow compiling the driver without E-Switch functionalities. For that we move MPFS L2 table logic out of eswitch.c into its own file, and provide a Kconfig flag (MLX5_MPFS) to allow compiling out MPFS for those who don't want Multi-PF support. NIC PF netdevice will now directly update MPFS L2 table via the new MPFS API. VF netdevice has no access to MPFS L2 table, so E-Switch will remain responsible for updating its MPFS L2 table on behalf of its VFs. Due to this change we also no longer require enabling vport(0) (PF vport) unicast mac change events when SRIOV is not enabled. This means E-Switch is now activated only on SRIOV activation, and not required otherwise. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Cc: Jes Sorensen <jsorensen@fb.com> Cc: kernel-team@fb.com	2017-08-07 10:47:06 +03:00
Saeed Mahameed	a9f7705ffd	net/mlx5: Unify vport manager capability check Expose MLX5_VPORT_MANAGER macro to check for strict vport manager E-switch and MPFS (Multi Physical Function Switch) abilities. VPORT manager must be a PF with an ethernet link and with FW advertised vport group manager capability Replace older checks with the new macro and use it where needed in eswitch.c and mlx5e netdev eswitch related flows. The same macro will be reused in MPFS separation downstream patch. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-08-07 10:47:06 +03:00
Saeed Mahameed	07c9f1e578	net/mlx5e: NIC netdev init flow cleanup Remove redundant call to unregister vport representor in mlx5e_add error flow. Hide the representor priv and eswitch internal structures from en_main.c as preparation step for downstream patches which would allow building the driver without support for representors and eswitch. Fixes: `6f08a22c5f` ("net/mlx5e: Register/unregister vport representors on interface attach/detach") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>	2017-08-07 10:47:06 +03:00
Saeed Mahameed	706b358348	net/mlx5e: Rearrange netdevice ops structures Since we are going to allow building the driver without eswitch support, it would be possible to compile out the sriov netdevice ops struct such that the basic ops instance will be used for non VF devices too. Add missing udp tunnel ndos into mlx5e_netdev_ops_basic. While here, rearrange some ndos in the sriov ops struct and put vf/eswitch related ndos towards the end of it. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>	2017-08-07 10:47:06 +03:00
Jiri Pirko	3bcc0cec81	net: sched: change names of action number helpers to be aligned with the rest The rest of the helpers are named tcf_exts_*, so change the name of the action number helpers to be aligned. While at it, change to inline functions. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-04 11:21:23 -07:00
Ido Schimmel	852cfeed0e	mlxsw: spectrum_switchdev: Release multicast groups during fini Each multicast group (MID) stores a bitmap of ports to which a packet should be forwarded to in case an MDB entry associated with the MID is hit. Since the initial introduction of IGMP snooping in commit `3a49b4fde2` ("mlxsw: Adding layer 2 multicast support") the driver didn't correctly free these multicast groups upon ungraceful situations such as the removal of the upper bridge device or module removal. The correct way to fix this is to associate each MID with the bridge ports member in it and then drop the reference in case the bridge port is destroyed, but this will result in a lot more code and will be fixed in net-next. For now, upon module removal, traverse the MID list and release each one. Fixes: `3a49b4fde2` ("mlxsw: Adding layer 2 multicast support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-04 11:15:07 -07:00
Ido Schimmel	17b334a876	mlxsw: spectrum_switchdev: Don't warn about valid situations Some operations in the bridge driver such as MDB deletion are performed in an atomic context and thus deferred to a process context by the switchdev infrastructure. Therefore, by the time the operation is performed by the underlying device driver it's possible the bridge port context is already gone. This is especially true for removal flows, but theoretically can also be invoked during addition. Remove the warnings in such situations and return normally. Fixes: `c57529e1d5` ("mlxsw: spectrum: Replace vPorts with Port-VLAN") Fixes: `3922285d96` ("net: bridge: Add support for offloading port attributes") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-04 11:15:07 -07:00
Ido Schimmel	65e65ec137	mlxsw: spectrum_router: Don't ignore IPv6 notifications We now have all the necessary IPv6 infrastructure in place, so stop ignoring these notifications. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:01 -07:00
Ido Schimmel	f36f5ac677	mlxsw: spectrum_router: Abort on source-specific routes Without resorting to ACLs, the device performs route lookup solely based on the destination IP address. In case source-specific routing is needed, an error is returned and the abort mechanism is activated, thus allowing the kernel to take over forwarding decisions. Instead of aborting, we can trap specific destination prefixes where source-specific routes are present, but this will result in a lot more code that is unlikely to ever be used. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:01 -07:00
Ido Schimmel	0a7fd1ac2a	mlxsw: spectrum_router: Add support for route replace In case we got a replace event, then the replaced route must exist. If the route isn't capable of multipath, then replace first matching non-multipath capable route. If the route is capable of multipath and matching multipath capable route is found, then replace it. Otherwise, replace first matching non-multipath capable route. The new route is inserted before the replaced one. In case the replaced route is currently offloaded, then it's overwritten in the device's table by the new route and later deleted, thus not impacting routed traffic. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:01 -07:00
Ido Schimmel	428b851f56	mlxsw: spectrum_router: Add support for IPv6 routes addition / deletion Allow directly connected and remote unicast IPv6 routes to be programmed to the device's tables. As with IPv4, identical routes - sharing the same destination prefix - are ordered in a FIB node according to their table ID and then the metric. While the kernel doesn't share the same trie for the local and main table, this does happen in the device, so ordering according to table ID is needed. Since individual nexthops can be added and deleted in IPv6, each FIB entry stores a linked list of the rt6_info structs it represents. Upon the addition or deletion of a nexthop, a new nexthop group is allocated according to the new configuration and the old one is destroyed. Identical groups aren't currently consolidated, but will be in a follow-up patchset. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:00 -07:00
Ido Schimmel	583419fdf2	mlxsw: spectrum_router: Sanitize IPv6 FIB rules We only allow FIB offload in the presence of default rules or an l3mdev rule. In a similar fashion to IPv4 FIB rules, sanitize IPv6 rules. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:00 -07:00
Ido Schimmel	66a5763ac1	mlxsw: spectrum_router: Demultiplex FIB event based on family The FIB notification block currently only handles IPv4 events, but we want to start handling IPv6 events soon, so lay the groundwork now. Do that by preparing the work item and process it according to the notified address family. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:00 -07:00
Ido Schimmel	64e5e8252d	mlxsw: spectrum_router: Ignore address families other than IPv4 We're about to add IPv6 notifications in the FIB notification chain, but the driver currently doesn't support these, so ignore them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:35:59 -07:00
Ido Schimmel	04b1d4e50e	net: core: Make the FIB notification chain generic The FIB notification chain is currently solely used by IPv4 code. However, we're going to introduce IPv6 FIB offload support, which requires these notifications as well. As explained in commit `c3852ef7f2` ("ipv4: fib: Replay events when registering FIB notifier"), upon registration to the chain, the callee receives a full dump of the FIB tables and rules by traversing all the net namespaces. The integrity of the dump is ensured by a per-namespace sequence counter that is incremented whenever a change to the tables or rules occurs. In order to allow more address families to use the chain, each family is expected to register its fib_notifier_ops in its pernet init. These operations allow the common code to read the family's sequence counter as well as dump its tables and rules in the given net namespace. Additionally, a 'family' parameter is added to sent notifications, so that listeners could distinguish between the different families. Implement the common code that allows listeners to register to the chain and for address families to register their fib_notifier_ops. Subsequent patches will implement these operations in IPv6. In the future, ipmr and ip6mr will be extended to provide these notifications as well. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:35:59 -07:00
Ido Schimmel	77d964e66c	mlxsw: spectrum_router: Refresh offload indication upon group refresh Now that we provide offload indication using the nexthop's flags we must refresh the offload indication whenever the offload state within the group changes. This didn't matter until now, as offload indication was provided using the FIB info flags and multipath routes were marked as offloaded as long as one of the nexthops was offloaded. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 17:00:24 -07:00
Ido Schimmel	1353ee7073	mlxsw: spectrum_router: Don't check state when refreshing offload indication Previous patch removed the reliance on the counter in the FIB info to set the offload indication, so we no longer need to keep an offload state on each FIB entry and can just set or unset the RTNH_F_OFFLOAD flag in each nexthop. This is also necessary because we're going to need to refresh the offload indication whenever the nexthop group associated with the FIB entry is refreshed. Current check would prevent us from marking a newly resolved nexthop as offloaded if the FIB entry is already marked as offloaded. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 17:00:23 -07:00
Ido Schimmel	3984d1a89f	mlxsw: spectrum_router: Provide offload indication using nexthop flags In a similar fashion to previous patch, use the nexthop flags to provide offload indication instead of the FIB info's flags. In case a nexthop in a multipath route can't be offloaded (gateway's MAC can't be resolved, for example), then its offload flag isn't set. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 17:00:23 -07:00
Ido Schimmel	9820355f69	mlxsw: core: Use correct EMAD transaction ID in debug message 'trans->tid' is only assigned later in the function, resulting in a zero transaction ID. Use 'tid' instead. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 16:58:49 -07:00
Jack Morgenstein	bff0c6840c	net/mlx4_core: Fixes missing capability bit in flags2 capability dump The cited commit introduced the following new enum value in file include/linux/mlx4/device.h: QUERY_DEV_CAP_DIAG_RPRT_PER_PORT However, it failed to introduce a corresponding entry in function dump_dev_cap_flags2() for outputting a line in the message log when this capability bit is set. The change here fixes that omission. Fixes: `c7c122ed67` ("net/mlx4: Add diagnostic counters capability bit") Reported-by: Mukesh Kacker <mukesh.kacker@oracle.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:44:09 -07:00
Jack Morgenstein	f9fb9d0b85	net/mlx4_core: Fix namespace misalignment in QinQ VST support commit The cited commit introduced the following new enum value in file include/linux/mlx4/device.h: MLX4_DEV_CAP_FLAG2_SVLAN_BY_QP However the value of MLX4_DEV_CAP_FLAG2_SVLAN_BY_QP needs to stay consistent with the value used in another namespace in function dump_dev_cap_flags2(), which is manually kept in sync. The change here restores that consistency. Fixes: `7c3d21c815` ("net/mlx4_core: Preparation for VF vlan protocol 802.1ad") Reported-by: Mukesh Kacker <mukesh.kacker@oracle.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:44:09 -07:00
Jack Morgenstein	5886259c12	net/mlx4_core: Fix sl_to_vl_change bit offset in flags2 dump The index value in function dump_dev_cap_flags2() for outputting "sl to vl mapping table change event support" needs to be consistent with the value of the enumerated constant MLX4_DEV_CAP_FLAG2_SL_TO_VL_CHANGE_EVENT defined in file include/linux/mlx4/device.h. The change here restores that consistency. Fixes: `fd10ed8e6f` ("IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets") Reported-by: Mukesh Kacker <mukesh.kacker@oracle.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:44:09 -07:00
Inbar Karmy	c994f778bb	net/mlx4_en: Fix wrong indication of Wake-on-LAN (WoL) support Currently when WoL is supported but disabled, ethtool reports: "Supports Wake-on: d". Fix the indication of WoL support, so that the indication remains "g" all the time if the NIC supports WoL. Tested: As expected, when the NIC supports WoL, ethtool reports "Supports Wake-on: g" and "Wake-on: d"; when the NIC doesn't support WoL, ethtool reports "Supports Wake-on: d" and "Wake-on: d". Fixes: `14c07b1358` ("mlx4: Wake on LAN support") Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 10:44:09 -07:00
David S. Miller	29fda25a2d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Two minor conflicts in virtio_net driver (bug fix overlapping addition of a helper) and MAINTAINERS (new driver edit overlapping revamp of PHY entry). Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 10:07:50 -07:00
Petr Machata	213666a356	mlxsw: spectrum_router: Simplify a piece of code Express the same logic more succinctly. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:44:33 -07:00
Petr Machata	56b8a9ed27	mlxsw: spectrum_router: Clarify a piece of code Prefer logical operator that expresses the intent to bitwise one that happens to give the same result. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:44:33 -07:00
Petr Machata	f1b1f273ae	mlxsw: spectrum_router: Simplify a piece of code Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:44:33 -07:00
Petr Machata	83930cd76a	mlxsw: reg.h: Namespace IP2ME registers This renames IP2ME-specific registers reg_ralue_v and reg_ralue_tunnel_ptr to reg_ralue_ip2me_*. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:44:33 -07:00
Petr Machata	78676ad4fb	mlxsw: Update specification of reg_ritr_type The comments really belong to the individual enumerators. The comment at the register should instead reference the enum. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:44:33 -07:00
Petr Machata	8de3c17819	mlxsw: spectrum_router: Fix a typo Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:44:33 -07:00
Petr Machata	806a1c1ab1	mlxsw: reg.h: Fix a typo Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:44:33 -07:00
Petr Machata	4bb51bd64f	mlxsw: spectrum_acl: Fix a typo Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:44:33 -07:00
Paul Blakey	bcec601f30	net/mlx5: Fix mlx5_add_flow_rules call with correct num of dests When adding an ethtool steering rule with action DISCARD we wrongly pass a NULL dest with dest_num 1 to mlx5_add_flow_rules(). What this error seems to have caused is sending VPORT 0 (MLX5_FLOW_DESTINATION_TYPE_VPORT) as the fte dest instead of no dests. We have the fte action correctly set to DROP so it might have been ignored anyway. To reproduce use: # sudo ethtool --config-nfc <dev> flow-type ether \ dst aa:bb:cc:dd:ee:ff action -1 Fixes: `74491de937` ("net/mlx5: Add multi dest support") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:17 +03:00
Eugenia Emantayev	f08c39ed0b	net/mlx5e: Schedule overflow check work to mlx5e workqueue This is done in order to ensure that work will not run after the cleanup. Fixes: `ef9814deaf` ('net/mlx5e: Add HW timestamping (TS) support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:17 +03:00
Eugenia Emantayev	d439c84509	net/mlx5e: Fix wrong delay calculation for overflow check scheduling The overflow_period is calculated in seconds. In order to use it for delayed work scheduling translation to jiffies is needed. Fixes: `ef9814deaf` ('net/mlx5e: Add HW timestamping (TS) support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:17 +03:00
Eugenia Emantayev	cf5033089b	net/mlx5e: Add missing support for PTP_CLK_REQ_PPS request Add the missing option to enable the PTP_CLK_PPS function. In this case pin should be configured as 1PPS IN first and then it will be connected to PPS mechanism. Events will be reported as PTP_CLOCK_PPSUSR events to relevant sysfs. Fixes: `ee7f12205a` ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:17 +03:00
Eugenia Emantayev	4272f9b88d	net/mlx5e: Change 1PPS out scheme In order to fix the drift in 1PPS out need to adjust the next pulse. On each 1PPS out falling edge driver gets the event, then the event handler adjusts the next pulse starting time. Fixes: `ee7f12205a` ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:17 +03:00
Eugenia Emantayev	49c5031ca6	net/mlx5e: Fix broken disable 1PPS flow Need to disable the MTPPS and unsubscribe from the pulse events when user disables the 1PPS functionality. Fixes: `ee7f12205a` ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:17 +03:00
Eugenia Emantayev	fa3676885e	net/mlx5e: Add field select to MTPPS register In order to mark relevant fields while setting the MTPPS register add field select. Otherwise it can cause a misconfiguration in firmware. Fixes: `ee7f12205a` ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:17 +03:00
Ilan Tayari	0242f4a0bb	net/mlx5e: Fix outer_header_zero() check size outer_header_zero() routine checks if the outer_headers match of a flow-table entry are all zero. This function uses the size of whole fte_match_param, instead of just the outer_headers member, causing failure to detect all-zeros if any other members of the fte_match_param are non-zero. Use the correct size for zero check. Fixes: `6dc6071cfc` ("net/mlx5e: Add ethtool flow steering support") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:16 +03:00
Alex Vesker	58569ef8f6	net/mlx5e: IPoIB, Modify add/remove underlay QPN flows On interface remove, the clean-up was done incorrectly causing an error in the log: "SET_FLOW_TABLE_ROOT(0x92f) op_mod(0x0) failed...syndrome (0x7e9f14)" This was caused by the following flow: -ndo_uninit: Move QP state to RST (this disconnects the QP from FT), the QP cannot be attached to any FT unless it is in RTS. -mlx5_rdma_netdev_free: cleanup_rx: Destroy FT cleanup_tx: Destroy QP and remove QPN from FT This caused a problem when destroying current FT we tried to re-attach the QP to the next FT which is not needed. The correct flow is: -mlx5_rdma_netdev_free: cleanup_rx: remove QPN from FT & Destroy FT cleanup_tx: Destroy QP Fixes: `508541146a` ("net/mlx5: Use underlay QPN from the root name space") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:16 +03:00
Moshe Shemesh	219c81f7d1	net/mlx5: Fix command bad flow on command entry allocation failure When the driver fails to allocate an entry to send a command to FW, it must notify the calling function and release the memory allocated for this command. Fixes: `e126ba97db` ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:16 +03:00
Moshe Shemesh	061870800e	net/mlx5: Fix command completion after timeout access invalid structure Completion on timeout should not free the driver command entry structure as it will need to access it again once real completion event from FW will occur. Fixes: `73dd3a4839` ('net/mlx5: Avoid using pending command interface slots') Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:16 +03:00
Aviv Heller	dc798b4cc0	net/mlx5: Consider tx_enabled in all modes on remap The tx_enabled lag event field is used to determine whether a slave is active. Current logic uses this value only if the mode is active-backup. However, LACP mode, although considered a load balancing mode, can mark a slave as inactive in certain situations (e.g., LACP timeout). This fix takes the tx_enabled value into account when remapping, with no respect to the LAG mode (this should not affect the behavior in XOR mode, since in this mode both slaves are marked as active). Fixes: `7907f23adc` (net/mlx5: Implement RoCE LAG feature) Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:16 +03:00
Eran Ben Elisha	079adf0539	net/mlx5: Clean SRIOV eswitch resources upon VF creation failure Upon sriov enable, eswitch is always enabled. Currently, if enable hca failed over all VFs, we would skip eswitch disable as part of sriov disable, which will lead to a resource leak. Fix it by disabling eswitch if it was enabled (use indication from eswitch mode). Fixes: `6b6adee3da` ('net/mlx5: SRIOV core code refactoring') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-27 16:40:16 +03:00
Zhu Yanjun	f575a02ee7	mlx4_en: remove unnecessary error check The function mlx4_en_get_profile always returns zero. So it is not necessary to check its return value. CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-24 17:26:32 -07:00
Ido Schimmel	b5f3e0d430	mlxsw: spectrum_router: Fix build when IPv6 isn't enabled When IPv6 isn't enabled the following error is generated: ERROR: "nd_tbl" [drivers/net/ethernet/mellanox/mlxsw/mlxsw_spectrum.ko] undefined! Fix it by replacing 'arp_tbl' and 'nd_tbl' with 'tbl->family' wherever possible and reference 'nd_tbl' only when IPV6 is enabled. Fixes: `d5eb89cf68` ("mlxsw: spectrum_router: Reflect IPv6 neighbours to the device") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-24 17:15:17 -07:00
Zhu Yanjun	f3eebe8819	mlx4_en: remove unnecessary returned value The function mlx4_en_arm_cq always returns zero. So change the return type of the function mlx4_en_arm_cq to void. CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-24 16:29:55 -07:00
Ido Schimmel	4a3c67a6e7	mlxsw: spectrum_router: Don't batch neighbour deletion Current firmware supported by the driver doesn't support batch deletion of IPv6 neighbours on a given router interface (RIF). Until a new version that supports this functionality is made available, delete neighbours one by one. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-24 16:16:20 -07:00
Ido Schimmel	1819ae3dfe	mlxsw: spectrum_router: Don't offload routes next in list Each FIB node holds a linked list of routes sharing the same prefix and length. In the case of IPv4 it's ordered according to table ID, metric and TOS and only the first route in the list is actually programmed to the device. In case a gatewayed route is added somewhere in the list, then after its nexthop group will be refreshed and become valid (due to the resolution of its gateway), it'll mistakenly overwrite the existing entry. Example: 192.168.200.0/24 dev enp3s0np3 scope link metric 1000 offload 192.168.200.0/24 via 192.168.100.1 dev enp3s0np3 metric 1000 offload Both routes are marked as offloaded despite the fact only the first one should actually be present in the device's table. When refreshing the nexthop group, don't write the route to the device's table unless it's the first in its node. Fixes: `9aecce1c7d` ("mlxsw: spectrum_router: Correctly handle identical routes") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-24 14:14:48 -07:00
Moshe Shemesh	f330187016	(IB, net)/mlx4: Add resource utilization support Adding visibility of resource usage of QPs, CQs and counters used by virtual functions. This feature will be used to give the PF administrator more data while debugging VF status. Usage info was added to ALLOC_RES command, to notify the PF if the resource which is being reserved or allocated for the VF will be used by kernel driver or by user verbs. Updated reservation and allocation functions of QP, CQ and counter with additional usage parameter. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:41:35 -04:00
Yishai Hadas	4ce749bd94	net/mlx5: Report enhanced capabilities for IPoIB Report 'ipoib_enhanced_offloads' capabilities from the core layer, it will be used in the next patch from this series. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:40:46 -04:00
Maor Gottlieb	246ac9814c	net/mlx5: Introduce general notification event When the delay drop timeout expires, the firmware raises a general notification event of DELAY_DROP_TIMEOUT subtype. In addition, the feature is disabled, so the driver has to reactivate the timeout. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:35:15 -04:00
Maor Gottlieb	c1e0bfc131	net/mlx5: Introduce set delay drop command Add support to SET_DELAY_DROP command. This command will be used in downstream patches for delay packet drop. The timeout value should be indicated by delay_drop_timeout field. Packet processing will be delayed till timeout value passed or until more WQEs are posted. Setting this value to 0 disables the feature. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:34:28 -04:00
Bodong Wang	7ecf6d8ff1	IB/mlx5: Restore IB guid/policy for virtual functions When a user sets port_guid, node_guid or policy of an IB virtual function, save this information in "struct mlx5_vf_context". This information will be restored later when pci_resume is called. To make sure this works, one can use aer-inject to generate PCI errors on mlx5 devices and verify if relevant fields are restored after PCI resume. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:34:28 -04:00
Huy Nguyen	2c43c5a036	net/mlx5e: Enable local loopback in loopback selftest Before running the ethtool's loopback selftest, we need to make sure that the local loopback is enabled. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:29:18 -04:00
Huy Nguyen	c85023e153	IB/mlx5: Add raw ethernet local loopback support Currently, unicast/multicast loopback raw ethernet (non-RDMA) packets are sent back to the vport. A unicast loopback packet is the packet with destination MAC address the same as the source MAC address. For multicast, the destination MAC address is in the vport's multicast filter list. Moreover, the local loopback is not needed if there is one or none user space context. After this patch, the raw ethernet unicast and multicast local loopback are disabled by default. When there is more than one user space context, the local loopback is enabled. Note that when local loopback is disabled, raw ethernet packets are not looped back to the vport and are forwarded to the next routing level (eswitch, or multihost switch, or out to the wire depending on the configuration). Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:29:18 -04:00
Huy Nguyen	bded747bb4	net/mlx5: Add raw ethernet local loopback firmware command Add support for raw ethernet local loopback firmware command. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:26:16 -04:00
David S. Miller	7a68ada6ec	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-07-21 03:38:43 +01:00
Linus Torvalds	96080f6977	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) BPF verifier signed/unsigned value tracking fix, from Daniel Borkmann, Edward Cree, and Josef Bacik. 2) Fix memory allocation length when setting up calls to ->ndo_set_mac_address, from Cong Wang. 3) Add a new cxgb4 device ID, from Ganesh Goudar. 4) Fix FIB refcount handling, we have to set it's initial value before the configure callback (which can bump it). From David Ahern. 5) Fix double-free in qcom/emac driver, from Timur Tabi. 6) A bunch of gcc-7 string format overflow warning fixes from Arnd Bergmann. 7) Fix link level headroom tests in ip_do_fragment(), from Vasily Averin. 8) Fix chunk walking in SCTP when iterating over error and parameter headers. From Alexander Potapenko. 9) TCP BBR congestion control fixes from Neal Cardwell. 10) Fix SKB fragment handling in bcmgenet driver, from Doug Berger. 11) BPF_CGROUP_RUN_PROG_SOCK_OPS needs to check for null __sk, from Cong Wang. 12) xmit_recursion in ppp driver needs to be per-device not per-cpu, from Gao Feng. 13) Cannot release skb->dst in UDP if IP options processing needs it. From Paolo Abeni. 14) Some netdev ioctl ifr_name[] NULL termination fixes. From Alexander Levin and myself. 15) Revert some rtnetlink notification changes that are causing regressions, from David Ahern. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits) net: bonding: Fix transmit load balancing in balance-alb mode rds: Make sure updates to cp_send_gen can be observed net: ethernet: ti: cpsw: Push the request_irq function to the end of probe ipv4: initialize fib_trie prior to register_netdev_notifier call. rtnetlink: allocate more memory for dev_set_mac_address() net: dsa: b53: Add missing ARL entries for BCM53125 bpf: more tests for mixed signed and unsigned bounds checks bpf: add test for mixed signed and unsigned bounds checks bpf: fix up test cases with mixed signed/unsigned bounds bpf: allow to specify log level and reduce it for test_verifier bpf: fix mixed signed/unsigned derived min/max value bounds ipv6: avoid overflow of offset in ip6_find_1stfragopt net: tehuti: don't process data if it has not been copied from userspace Revert "rtnetlink: Do not generate notifications for CHANGEADDR event" net: dsa: mv88e6xxx: Enable CMODE config support for 6390X dt-binding: ptp: Add SoC compatibility strings for dte ptp clock NET: dwmac: Make dwmac reset unconditional net: Zero terminate ifr_name in dev_ifname(). wireless: wext: terminate ifr name coming from userspace netfilter: fix netfilter_net_init() return ...	2017-07-20 16:33:39 -07:00
Ido Schimmel	7dcc18adad	mlxsw: spectrum_router: Update prefix count for IPv6 The number of possible prefix lengths for IPv6 is 129 and not 128. Fixes following warning from UBSAN when /128 routes are offloaded: UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:2510:27 index 128 is out of range for type 'long unsigned int [128]' Fixes: `5e9c16cc83` ("mlxsw: spectrum_router: Implement private fib") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:34 -07:00
Ido Schimmel	80c238f91b	mlxsw: spectrum_router: Rename functions to add / delete a FIB entry These functions aren't specific to IPv4 and can be re-used for IPv6. Drop the '4' designation from their name. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:34 -07:00
Ido Schimmel	9efbee6fea	mlxsw: spectrum_router: Drop unnecessary parameter Functions that take as argument a FIB entry don't need to take FIB node as well, as it can be extracted from the entry. Remove unnecessary FIB node parameter. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:34 -07:00
Ido Schimmel	0e6ea2a4ea	mlxsw: spectrum_router: Mark IPv4 specific function accordingly The functions to create and destroy a nexthop group are IPv4 specific and should be renamed accordingly, so that they won't be confused with the IPv6 specific functions in follow-up patches. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	4f1c7f1f2e	mlxsw: spectrum_router: Create IPv4 specific entry struct Some of the parameters stored in the FIB entry structure are specific to IPv4 and therefore better placed in an IPv4 specific structure. Create an IPv4 specific structure that encapsulates the common FIB entry structure and contains IPv4 specific parameters. In a follow-up patchset an IPv6 specific structure will be introduced. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	bc65a8a4f4	mlxsw: spectrum_router: Set abort trap for IPv6 When we fail to insert a route we invoke the abort mechanism which flushes all the tables and inserts a default route in each, so that all packets incoming to the router will be trapped to the CPU. Upon abort, add an IPv6 default route to the IPv6 tables. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	9dbf4d76d0	mlxsw: spectrum_router: Allow IPv6 routes to be programmed Take advantage of previous patch and allow the RALUE register to be called with IPv6 routes. In order to re-use as much code as possible between IPv4 and IPv6, only the lowest-level function that actually does the register packing is demuxed based on the passed protocol. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	62547f407f	mlxsw: reg: Update RALUE register with IPv6 support Update the register so that IPv6 LPM entries could be programmed to the device's table. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	a3d9bc506d	mlxsw: spectrum_router: Extend virtual routers with IPv6 support A Virtual Router (VR) is an entity which corresponds to a VRF and performs FIB lookup in an LPM tree according to the {VR, IP Proto} -> Tree binding. Extend the virtual router data structure towards IPv6 FIB offload. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	731ea1ca42	mlxsw: spectrum_router: Make FIB node retrieval family agnostic A FIB node is an entity which stores routes sharing the same prefix and length. The data structure itself is already family agnostic, but we make some of its operations agnostic as well and thus re-use them for IPv6 offload. Instead of passing an IPv4-specific structure to fib4_node_get(), pass general routing parameters and rename the function accordingly. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	160e22aa26	mlxsw: spectrum_router: Don't create FIB node during lookup When looking up a FIB entry we shouldn't create the FIB node where it's supposed to be linked in case the node doesn't already exist. Instead, lookup the node and fail if it doesn't exist. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	58adf2c480	mlxsw: spectrum_router: Don't assume neighbour type Thankfully, the neighbour subsystem is agnostic to the upper protocol and used by both IPv4 and IPv6. By removing assumptions regarding the neighbour type we can thus re-use much of the neighbour-related code for both IPv4 and IPv6. For each nexthop, store its gateway IP and for nexthop group store the neighbour table used by its nexthops. Use this information throughout the code and remove assumption about the neighbour type. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Arkadi Sharshevsky	a6c9b5d199	mlxsw: spectrum_router: Set activity interval according to both neighbour tables The neighbours' activity is currently dumped according to the ARP table's DELAY_PROBE time, but with the introduction of IPv6 offload we should set the interval according to the minimum between the ARP and ndisc tables. Signed-off-by: Arkadi Sharshvesky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Arkadi Sharshevsky	60f040ca11	mlxsw: spectrum_router: Periodically dump active IPv6 neighbours In addition to IPv4, periodically dump IPv6 neighbours and update the kernel about them. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Arkadi Sharshevsky	72e8ebe1b3	mlxsw: reg: Update RAUHTD register with IPv6 support Update the register so that the active IPv6 neighbours could be dumped from the device's neighbour table. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Arkadi Sharshevsky	d5eb89cf68	mlxsw: spectrum_router: Reflect IPv6 neighbours to the device As with IPv4, listen to NEIGH_UPDATE events from the ndisc table and program relevant neighbours to the device's neighbour table. Note that neighbours with a link-local IP address aren't programmed, as packets with a link-local destination IP are trapped after LPM lookup and never reach the neighbour table. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:32 -07:00
Arkadi Sharshevsky	6929e50736	mlxsw: reg: Update RAUHT register with IPv6 support Update the register, so the IPv6 neighbours could be programmed to the device's neighbour table. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:32 -07:00
Arkadi Sharshevsky	5ea1237f94	mlxsw: spectrum_router: Configure RIFs based on IPv6 addresses When a netdev is configured with an IP address a router interface (RIF) should be configured for it in the device. Allow configuration of RIFs based on IPv6 address notifications as well as IPv4. Note that the RIF exists as long as an IP address is configured on the netdev, regardless of the address family. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:32 -07:00
Ido Schimmel	0d284818af	mlxsw: spectrum_router: Flood unregistered multicast packets to router Up until now we only flooded broadcast packets to the router when an L3 interface was configured on top of a bridge. However, IPv6 Neighbour Discovery packets are trapped to the CPU inside the router and these can be sent with a multicast address. Flood unregistered multicast packets to the router port, so that relevant packets could be trapped there. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:32 -07:00
Arkadi Sharshevsky	8d54814e52	mlxsw: spectrum: Add support for IPv6 traps Before we can start using IPv6, we need to trap certain control packets to the CPU. Among others, these include Neighbour Discovery, DHCP and neighbour misses. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:32 -07:00
Arkadi Sharshevsky	e717e011ff	mlxsw: reg: Enable IPv6 on router interfaces Enable IPv6 and IPv6 forwarding on router interfaces (RIFs), so that they will be able to receive and forward IPv6 traffic. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:32 -07:00
Arkadi Sharshevsky	e29237e7bb	mlxsw: spectrum_router: Enable IPv6 router Before we add IPv6 constructs like traps and router interfaces, we first need to enable IPv6 routing in the device. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:32 -07:00
Leon Romanovsky	8900b894e7	{net, IB}/mlx4: Remove gfp flags argument The caller to the driver marks GFP_NOIO allocations with help of memalloc_noio-* calls now. This makes redundant to pass down to the driver gfp flags, which can be GFP_KERNEL only. The patch removes the gfp flags argument and updates all driver paths. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-17 21:21:24 -04:00
Arkadi Sharshevsky	9df552ef3e	mlxsw: spectrum: Improve IPv6 unregistered multicast flooding Up until now IPv6 unregistered multicast traffic would be flooded like broadcast, even when MLD snooping was enabled on the bridge. This was intentional as MLD packet traps were missing, preventing the bridge driver from programming MDB entries to the device. Previous patch added these traps, so we can now finally flood IPv6 unregistered multicast packets to specific ports via the multicast table instead of flooding them to all ports via the broadcast table. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:19:39 -07:00
Arkadi Sharshevsky	588823f97d	mlxsw: spectrum: Add support for IPv6 MLDv1/2 traps Add support for IPv6 MLDv1/2 packet trapping. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:19:39 -07:00
Ido Schimmel	7607dd35fc	mlxsw: spectrum: Trap IPv4 packets with Router Alert option In case local sockets have the IP_ROUTER_ALERT socket option set, then they expect to get packets with the Router Alert option. Trap such packets, so that the kernel could inspect them and potentially send them to interested sockets. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:19:39 -07:00
Ido Schimmel	0fcc484748	mlxsw: spectrum: Mark packets trapped in router In commit `1c6c6d221e` ("mlxsw: spectrum: Mirror certain packets to CPU") we marked packets that were mirrored to the CPU, so that they won't be flooded again by the bridge driver. However, certain packets are trapped in the device's router block, after passing through the bridge block where they were potentially flooded. Mark all packets coming from L3 traps, so that they won't be potentially flooded again by the bridge driver. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:19:39 -07:00
Or Gerlitz	87996f91f7	mlxsw: spectrum_flower: Add support for ip tos Support offloading rules that match on ip tos. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:18:23 -07:00
Or Gerlitz	abac7b011d	mlxsw: spectrum: Add tos to the ipv4 acl block Add ecn and dscp fields to the ipv4 acl block. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:18:23 -07:00
Or Gerlitz	80d0fe4710	mlxsw: acl: Add ip tos acl element Define new element for ip tos (ecn, dscp) and place it into scratch area. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:18:23 -07:00
Or Gerlitz	fcbca8217d	mlxsw: spectrum_flower: Add support for ip ttl Support offloading rules that match on ip ttl. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:18:23 -07:00
Or Gerlitz	046759a6cf	mlxsw: spectrum: Add ttl to the ipv4 acl block Add ttl field to the ipv4 acl block. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:18:23 -07:00
Or Gerlitz	5f57e09091	mlxsw: acl: Add ip ttl acl element Define new element for ip ttl and place it into scratch area. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:18:23 -07:00
Zhu Yanjun	e36fef66f4	mlx4_en: remove unnecessary returned value check The function __mlx4_zone_remove_one_entry always returns zero. So it is not necessary to check it. Cc: Joe Jin <joe.jin@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-15 14:29:49 -07:00
Ido Schimmel	6f497930af	mlxsw: spectrum_switchdev: Check status of memory allocation We can't rely on kzalloc() always succeeding, so check its return value. Suppresses the following smatch error: mlxsw_sp_switchdev_event() error: potential null dereference 'switchdev_work->fdb_info.addr'. (kzalloc returns null) Fixes: `af06137892` ("mlxsw: spectrum_switchdev: Add support for learning FDB through notification") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-12 08:15:52 -07:00
Ido Schimmel	a9265b804d	mlxsw: spectrum_switchdev: Remove unused variable Commit `10e23eb299` ("mlxsw: spectrum: Remove support for bypass bridge port attributes/vlan set") removed statements that used 'bridge_vlan', but didn't remove the variable itself resulting in the following warning with W=1: warning: variable ‘bridge_vlan’ set but not used [-Wunused-but-set-variable] Remove the variable and suppress the warning. Fixes: `10e23eb299` ("mlxsw: spectrum: Remove support for bypass bridge port attributes/vlan set") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-12 08:15:52 -07:00
Ido Schimmel	7387dbbcdb	mlxsw: spectrum_router: Fix use-after-free in route replace While working on IPv6 route replace I realized we can have a use-after-free in IPv4 in case the replaced route is offloaded and the only one using its FIB info. The problem is that fib_table_insert() drops the reference on the FIB info of the replaced routes which is eventually freed via call_rcu(). Since the driver doesn't hold a reference on this FIB info it can cause a use-after-free when it tries to clear the RTNH_F_OFFLOAD flag stored in fi->fib_flags. After running the following commands in a loop for enough time with a KASAN enabled kernel I finally got the below trace. $ ip route add 192.168.50.0/24 via 192.168.200.1 dev enp3s0np3 $ ip route replace 192.168.50.0/24 dev enp3s0np5 $ ip route del 192.168.50.0/24 dev enp3s0np5 BUG: KASAN: use-after-free in mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum] Read of size 4 at addr ffff8803717d9820 by task kworker/u4:2/55 [...] ? mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum] ? mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum] ? mlxsw_sp_router_neighs_update_work+0x1cd0/0x1ce0 [mlxsw_spectrum] ? mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum] __asan_load4+0x61/0x80 mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum] mlxsw_sp_fib_entry_offload_refresh+0xb6/0x370 [mlxsw_spectrum] mlxsw_sp_router_fib_event_work+0xd1c/0x2780 [mlxsw_spectrum] [...] Freed by task 5131: save_stack_trace+0x16/0x20 save_stack+0x46/0xd0 kasan_slab_free+0x70/0xc0 kfree+0x144/0x570 free_fib_info_rcu+0x2e7/0x410 rcu_process_callbacks+0x4f8/0xe30 __do_softirq+0x1d3/0x9e2 Fix this by taking a reference on the FIB info when creating the nexthop group it represents and drop it when the group is destroyed. Fixes: `599cf8f95f` ("mlxsw: spectrum_router: Add support for route replace") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-12 08:15:52 -07:00
Ido Schimmel	a4e75b76b2	mlxsw: spectrum_router: Add missing rollback With this patch the error path of mlxsw_sp_nexthop_init() is symmetric with mlxsw_sp_nexthop_fini(). Noticed during code review. Fixes: `a8c9701427` ("mlxsw: spectrum_router: Refactor nexthop init routine") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-12 08:15:51 -07:00
Arnd Bergmann	de92cd6cf4	net/mlx5: IPSec, fix 64-bit division correctly The new IPSec offload code introduced a build error: drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.o: In function `mlx5e_ipsec_build_inverse_table': ipsec_rxtx.c:(.text+0x556): undefined reference Another patch was added on top to fix the build error, but that introduced a new bug, as we now use the remainder of the division rather than the result. This makes it use the correct helper function instead. Fixes: `5dfd87b67c` ("net/mlx5: IPSec, Fix 64-bit division on 32-bit builds") Fixes: `2ac9cfe782` ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-10 19:34:00 +01:00
Huy Nguyen	d968f0f2e4	net/mlx5e: Initialize CEE's getpermhwaddr address buffer to 0xff Latest change in open-lldp code uses bytes 6-11 of perm_addr buffer as the Ethernet source address for the host TLV packet. Since our driver does not fill these bytes, they stay at zero and the open-lldp code ends up sending the TLV packet with zero source address and the switch drops this packet. The fix is to initialize these bytes to 0xff. The open-lldp code considers 0xff:ff:ff:ff:ff:ff as the invalid address and falls back to use the host's mac address as the Ethernet source address. Fixes: `3a6a931dfb` ("net/mlx5e: Support DCBX CEE API") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-06 15:13:20 +03:00
Ilan Tayari	fb000f7817	net/mlx5: Add Makefiles for subdirectories Currently it is not possible to build just one .o file inside a subdirectory, because the subdirectories lack a Makefile. Add a Makefile to the mlx5 subdirectories. Fixes: `e29341fb3a` ("net/mlx5: FPGA, Add basic support for Innova") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-06 15:13:20 +03:00
Ilan Tayari	111a676367	net/mlx5: Build wq.o even if MLX5_CORE_EN is not selected Both the ethernet and FPGA portions of MLX5 now require the wq functions, and we get a link error when CONFIG_MLX5_CORE_EN is disabled: drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.o: In function `mlx5_fpga_conn_create_cq': conn.c:(.text+0x10b3): undefined reference to `mlx5_cqwq_create' conn.c:(.text+0x10c6): undefined reference to `mlx5_cqwq_get_size' conn.c:(.text+0x12bc): undefined reference to `mlx5_cqwq_destroy' Build wq.o even if MLX5_CORE_EN is not selected. Fixes: `537a505741` ("net/mlx5: FPGA, Add high-speed connection routines") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-06 15:13:20 +03:00
Ilan Tayari	2a41d15b79	net/mlx5: FPGA, Fix datatype mismatch Fix warnings when building with -Wall: drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c:313:36: warning: cast to restricted __be32 drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c:314:37: warning: cast to restricted __be32 Fixes: `bebb23e6cb` ("net/mlx5: Accel, Add IPSec acceleration interface") Reported-by: Or Gerlitz <gerlitz.or@gmail.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-06 15:13:20 +03:00
Ilan Tayari	c8af01692e	net/mlx5: FPGA, make mlx5_fpga_device_brb static Fix warning when building with -Wall: drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c:105:5: warning: symbol 'mlx5_fpga_device_brb' was not declared. Should it be static? Fixes: `c43051d72a` ("net/mlx5: FPGA, Add SBU bypass and reset flows") Reported-by: Or Gerlitz <gerlitz.or@gmail.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-06 15:13:20 +03:00
Ilan Tayari	5dfd87b67c	net/mlx5: IPSec, Fix 64-bit division on 32-bit builds Fix warnings when building 386 kernel: >> ERROR: "__udivdi3" [drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko] undefined! Fixes: `2ac9cfe782` ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-06 15:13:19 +03:00
Ilan Tayari	aa07b63384	net/mlx5: Add missing include in lib/gid.c Fix warnings when building with -Wall: drivers/net/ethernet/mellanox/mlx5/core/lib/gid.c:38:6: warning: symbol 'mlx5_init_reserved_gids' was not declared. Should it be static? drivers/net/ethernet/mellanox/mlx5/core/lib/gid.c:47:6: warning: symbol 'mlx5_cleanup_reserved_gids' was not declared. Should it be static? drivers/net/ethernet/mellanox/mlx5/core/lib/gid.c:55:5: warning: symbol 'mlx5_core_reserve_gids' was not declared. Should it be static? drivers/net/ethernet/mellanox/mlx5/core/lib/gid.c:79:6: warning: symbol 'mlx5_core_unreserve_gids' was not declared. Should it be static? drivers/net/ethernet/mellanox/mlx5/core/lib/gid.c:92:5: warning: symbol 'mlx5_core_reserved_gid_alloc' was not declared. Should it be static? drivers/net/ethernet/mellanox/mlx5/core/lib/gid.c:109:6: warning: symbol 'mlx5_core_reserved_gid_free' was not declared. Should it be static? Fixes: `52ec462eca` ("net/mlx5: Add reserved-gids support") Reported-by: Or Gerlitz <gerlitz.or@gmail.com> Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-07-06 15:13:19 +03:00
David S. Miller	3a3f7d130e	Merge https://git.kernel.org/pub/scm/linux/kernel/git/davem/net Some overlapping changes in the mlx5 driver. A merge conflict resolution posted by Stephen Rothwell was used as a guide. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-03 03:42:10 -07:00
Zhu Yanjun	3b68067bd2	mlx4_en: make mlx4_log_num_mgm_entry_size static The variable mlx4_log_num_mgm_entry_size is only called in main.c. CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-03 02:41:26 -07:00
Or Gerlitz	c1c1d86bde	net/mlxfw: Properly handle dependancy with non-loadable mlx5 If mlx5 is set to be built-in and mlxfw as a module, we get a link error: drivers/built-in.o: In function `mlx5_firmware_flash': (.text+0x5aed72): undefined reference to `mlxfw_firmware_flash' Since we don't want to mandate selecting mlxfw for mlx5 users, we use the IS_REACHABLE macro to make sure that a stub is exposed to the caller. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Jakub Kicinski <kubakici@wp.pl> Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-03 02:32:25 -07:00
Stephen Rothwell	6992c6c5dd	net/mlx5: fix memcpy limit? Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-03 01:57:27 -07:00
Colin Ian King	4120dab095	net/mlx5: fix spelling mistake: "Allodating" -> "Allocating" Trivial fix to spelling mistake in mlx5_core_dbg debug message Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-01 14:36:43 -07:00
David S. Miller	ea23b42739	mlx5-fixes-2017-06-28 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJZU3mtAAoJEEg/ir3gV/o+4k0IAKj5XCn3cviZlXMJRMHBvamt yWrMI90XgjoPhGPx3K9mf+bMhHOGiZR0Q2DFDJZa5U64DDBVPNvag7fy74GYgj1D Cet1zohkQ2xdb/R3jfML8tG2IVfvETWo3cgJGFtGUBlOULvpwinSK4A+8oUUGszc K1vAY0j3+Ncfjk+CZJ8hWqaIk1dyYtjtyn0ACOUOftqBa6+UZY7LbLTTOI7hOZoX 3M35W7ntgGoBScONlxpDUXNUewia4ADTiQPWwHdT9+xNlwz1fzmCHlYi5pY+z9TC PKbbe1O4l1nsMftwqJVQNHrFnq+x/X69J5vlgobWkk0dQCRQWE9qanG8BfXPykY= =DUG8 -----END PGP SIGNATURE----- Merge tag 'mlx5-fixes-2017-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2017-06-28 This series contains some fixes for the mlx5 core and netdev driver. Please pull and let me know if there's any problem. For -stable: ("net/mlx5e: Fix TX carrier errors report in get stats ndo") Kernels >= v4.7 ("net/mlx5: Cancel delayed recovery work when unloading the driver") Kernels >= v4.10 * When applied to net-next this will introduce a contextual conflict, it should be easy to resolve, (a spin_lock was changed to spin_lock_irqsave in net-next), if you need any help with this please let me know. ("net/mlx5: Fix driver load error flow when firmware is stuck") Kernels >= v4.4* * This patch fixes: `6c780a0267` ("net/mlx5: Wait for FW readiness before initializing command interface") which was submitted two weeks ago and queued up for v4.4. Sorry about the mess, but other than the above, this series doesn't introduce any conflict with the current mlx5 IPSec offload series. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-01 14:11:48 -07:00
David S. Miller	b079115937	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net A set of overlapping changes in macvlan and the rocker driver, nothing serious. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-30 12:43:08 -04:00
Inbar Karmy	ec327f7a43	net/mlx4_en: Do not allocate redundant TX queues when TC is disabled Currently the number of TX queues that are allocated doesn't depend on the number of TCs, the module always loads with max num of UP per channel. In order to prevent the allocation of unnecessary memory, the module will load with minimum number of UPs per channel, and the user will be able to control the number of TX queues per channel by changing the number of TC to 8 using the tc command. The variable num_up will hold the information about the current number of UPs. Due to the change, needed to remove the lines that set the value of UP to be different than zero in the func "mlx4_en_select_queue", since now the num of TX queues that are allocated is only one per channel in default. In order not to force the UP to be zero in case of only one TC, added a condition before forcing it in the func "mlx4_en_fill_qp_context". Tested: After the module is loaded with minimum number of UP per channel, to increase num of TCs to 8, use: tc qdisc add dev ens8 root mqprio num_tc 8 In order to decrease the number of TCs to minimum number of UP per channel, use: tc qdisc del dev ens8 root Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: Tarick Bedeir <tarick@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-29 15:56:15 -04:00
Inbar Karmy	f21ad61424	net/mlx4_en: Add dynamic variable to hold the number of user priorities (UP) Until this patch, the number of UPs was hard coded for eight. Replace this with a variable in struct "mlx4_en_port_profile". Currently, the variable will hold the maximum number of UP, as before. The patch creates an infrastructure to add an option for dynamic change of the actual number of TCs. Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: Tarick Bedeir <tarick@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-29 15:56:15 -04:00
Ido Schimmel	6b27c8adf2	mlxsw: spectrum_router: Fix NULL pointer dereference In case a VLAN device is enslaved to a bridge we shouldn't create a router interface (RIF) for it when it's configured with an IP address. This is already handled by the driver for other types of netdevs, such as physical ports and LAG devices. If this IP address is then removed and the interface is subsequently unlinked from the bridge, a NULL pointer dereference can happen, as the original 802.1d FID was replaced with an rFID which was then deleted. To reproduce: $ ip link set dev enp3s0np9 up $ ip link add name enp3s0np9.111 link enp3s0np9 type vlan id 111 $ ip link set dev enp3s0np9.111 up $ ip link add name br0 type bridge $ ip link set dev br0 up $ ip link set enp3s0np9.111 master br0 $ ip address add dev enp3s0np9.111 192.168.0.1/24 $ ip address del dev enp3s0np9.111 192.168.0.1/24 $ ip link set dev enp3s0np9.111 nomaster Fixes: `99724c18fc` ("mlxsw: spectrum: Introduce support for router interfaces") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Petr Machata <petrm@mellanox.com> Tested-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-29 12:59:48 -04:00
David S. Miller	5185ad616b	mlx5-updates-2017-06-27 (Innova IPsec offload support) This patchset adds support for Innova IPSec network interface card. About Innova device: -------------------- Innova is a network card with a ConnectX chip and an FPGA chip as a bump-on-the-wire. Internal +----------+ Link +-----------------+ \| +--------------+ FPGA \| +------+ \| ConnectX \| \| Shell +--+ QSFP \| \| +--------------+ +-------+ \| \| Port \| +----------+ I2C \| \| SBU \| \| +------+ \| +-------+ \| +--+----------+---+ \| \| +--+--+ +---+---+ \| DDR \| \| Flash \| +-----+ +-------+ The FPGA synthesized logic is loaded from dedicated flash storage and has access to its own dedicated DDR RAM. The ConnectX chip firmware programs the FPGA by accessing its configuration space over either the slow internal I2C link or the high-speed internal link. The FPGA logic is divided into a "Shell" and a "Sandbox Unit" (SBU). mlx5_core driver (with CONFIG_MLX5_FPGA) handles all shell functionality, while other components may handle the various SBU functionalities. The driver opens high-speed reliable communication channels with the shell and the SBU over the internal link. These channels may be used for high-bandwidth configuration or for SBU-specific out-of-band data paths. About Innova IPSec device: -------------------------- Innova IPSec is a network card that allows offloading IPSec cryptography operations from the host CPU to the NIC. It is an Innova card with an IPSec SBU. The hardware keeps the database of IPSec Security Associations (SADB) in the FPGA's DDR memory. 
Internal +----------+ Link +-----------------+ \| +--------------+ FPGA \| +------+ \| ConnectX \| \| Shell +--+ QSFP \| \| +--------------+ +-------+ \| \| Port \| +----------+ Internal I2C \| \| IPSec \| \| +------+ \| \| SBU \| \| \| +-------+ \| +--+----------+---+ \| \| +--+--+ +---+---+ \| DDR \| \| \| \| \| \| Flash \| \|SADB \| \| \| +-----+ +-------+ Modes and ciphers: Currently the following modes and ciphers are supported: IPv4 and IPv6 ESP tunnel and transport modes AES 128 and 256 bit encryption, with GCM authentication (RFC4106) IV is generated using seqiv, in sync with Linux's geniv. More modes and ciphers may be added later. Notes: In the future similar functionality will be included in a single-chip NIC. About the driver: ----------------- Patches 1-4 prepare some existing driver code for the new feature: * Add support for reserved GIDs in the hardware GID table * Allow multiple modules to enable hardware RoCE support independently Patches 5-6 define structs and helper functions for QP work-queues. Patches 7-11 add various FPGA-related features required for Innova IPSec. Patch 12 adds abstraction layer for Mellanox IPSec-offload capable devices. Patches 13-16 add IPSec offload support to the mlx5 netdevice. This driver services the new IPSec offload API introduced in commit `d77e38e612` ("xfrm: Add an IPsec hardware offloading API") Configuration Path: If Innova IPSec device is detected, the mlx5e netdevice gets the new NETIF_F_HW_ESP feature and the xdo callbacks, indicating ESP offload capabilities, and also the matching TX checksum and GSO features. The driver configures offloaded Security Associations (SAs) by sending an ADD_SA or DEL_SA message to the IPSec SBU, which updates the SADB in DDR. These messages and their responses are sent over a high-speed channel. Counters for ethtool are retrieved by the driver from the SBU. Data path: On receive path, the SBU decrypts ESP packets which match the offloaded SADB, but keeps them encapsulated. 
The SBU injects metadata (Mellanox owned ethertype) indicating that crypto-offload has taken place, the SA with which it was done, and the authentication result. The ConnectX chip performs RX checksum offload on the packet, and RSS using the ESP SPI value. The driver detects the special ethertype, and attaches a struct secpath to the RX SKB, including flags to indicate that crypto offload took place, the authentication result, and which xfrm_state was used for decryption, in the olen and ovec members. The RX SKB may have useful CHECKSUM_COMPLETE. A separate patchset will add support for that in the xfrm stack. On transmit path, the stack encapsulates the packet but does not encrypt it, and indicates in the SKB's secpath that crypto offload is to be performed and the SA to use to do so. The driver avoids performing crypto-offload for ESP fragments, and packets with IP options, as the SBU cannot currently do that. For eligible packets, the driver prepends a special ethertype with metadata instructing the hardware to perform crypto offload. The stack builds regular (non-GSO) SKBs so that they contain a placeholder for the ESP trailer. The driver trims it off, because the SBU automatically appends the trailer for offloaded packets. The ConnectX chip performs TX checksum offload on inner UDP or TCP packets, and GSO for TCP packets (duplicating the prepended metadata). The segmented packets then undergo encryption in the SBU before going on the wire. Performance: We measure single stream of TCP on Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz Using AES-NI with ESP GSO we get constant 4.1 Gbps. Using crypto offload we get constant 18 Gbps. Note that these numbers require CHECKSUM_COMPLETE support in XFRM, which we submit separately. 
- Ilan Tayari -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJZUmf1AAoJEEg/ir3gV/o+ukIIALp/5+E1W0cC9xvY1X9dTETW cKsHvDJ7G1CxUy18W8Mf9z+WOqC6hGCqS+yicOb+umfIqkTcLHDb2irlqprYLC+F oYl1HqgHTaiAYByqL90qiyPcFbfsaNIqA9KOsED2qdZ1yxjoYBiJnSDZDAdO/0lN Lt1czNswFc5ovnEUGn8bkjLZZH2pJoJWEI4g4hN9cq33BLLq8A795F/ZjwCJTQ1X qXdKcEmktBrgZiSiTVFxxpQVhO/uB0HmzaZzrY1k1P5e6yhHEr422mcOcF9KcSL4 aeyRYHjoIh51vPMbScPjvfbO/PwooU3LWLlxLVNLG0MmkSaGyJeUXg/wHsGI910= =JN0A -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2017-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-06-27 (Innova IPsec offload support) This patchset adds support for Innova IPSec network interface card. About Innova device: -------------------- Innova is a network card with a ConnectX chip and an FPGA chip as a bump-on-the-wire. Internal +----------+ Link +-----------------+ \| +--------------+ FPGA \| +------+ \| ConnectX \| \| Shell +--+ QSFP \| \| +--------------+ +-------+ \| \| Port \| +----------+ I2C \| \| SBU \| \| +------+ \| +-------+ \| +--+----------+---+ \| \| +--+--+ +---+---+ \| DDR \| \| Flash \| +-----+ +-------+ The FPGA synthesized logic is loaded from dedicated flash storage and has access to its own dedicated DDR RAM. The ConnectX chip firmware programs the FPGA by accessing its configuration space over either the slow internal I2C link or the high-speed internal link. The FPGA logic is divided into a "Shell" and a "Sandbox Unit" (SBU). mlx5_core driver (with CONFIG_MLX5_FPGA) handles all shell functionality, while other components may handle the various SBU functionalities. The driver opens high-speed reliable communication channels with the shell and the SBU over the internal link. These channels may be used for high-bandwidth configuration or for SBU-specific out-of-band data paths. 
About Innova IPSec device: -------------------------- Innova IPSec is a network card that allows offloading IPSec cryptography operations from the host CPU to the NIC. It is an Innova card with an IPSec SBU. The hardware keeps the database of IPSec Security Associations (SADB) in the FPGA's DDR memory. Internal +----------+ Link +-----------------+ \| +--------------+ FPGA \| +------+ \| ConnectX \| \| Shell +--+ QSFP \| \| +--------------+ +-------+ \| \| Port \| +----------+ Internal I2C \| \| IPSec \| \| +------+ \| \| SBU \| \| \| +-------+ \| +--+----------+---+ \| \| +--+--+ +---+---+ \| DDR \| \| \| \| \| \| Flash \| \|SADB \| \| \| +-----+ +-------+ Modes and ciphers: Currently the following modes and ciphers are supported: IPv4 and IPv6 ESP tunnel and transport modes AES 128 and 256 bit encryption, with GCM authentication (RFC4106) IV is generated using seqiv, in sync with Linux's geniv. More modes and ciphers may be added later. Notes: In the future similar functionality will be included in a single-chip NIC. About the driver: ----------------- Patches 1-4 prepare some existing driver code for the new feature: * Add support for reserved GIDs in the hardware GID table * Allow multiple modules to enable hardware RoCE support independently Patches 5-6 define structs and helper functions for QP work-queues. Patches 7-11 add various FPGA-related features required for Innova IPSec. Patch 12 adds abstraction layer for Mellanox IPSec-offload capable devices. Patches 13-16 add IPSec offload support to the mlx5 netdevice. This driver services the new IPSec offload API introduced in commit `d77e38e612` ("xfrm: Add an IPsec hardware offloading API") Configuration Path: If Innova IPSec device is detected, the mlx5e netdevice gets the new NETIF_F_HW_ESP feature and the xdo callbacks, indicating ESP offload capabilities, and also the matching TX checksum and GSO features. 
The driver configures offloaded Security Associations (SAs) by sending an ADD_SA or DEL_SA message to the IPSec SBU, which updates the SADB in DDR. These messages and their responses are sent over a high-speed channel. Counters for ethtool are retrieved by the driver from the SBU. Data path: On receive path, the SBU decrypts ESP packets which match the offloaded SADB, but keeps them encapsulated. The SBU injects metadata (Mellanox owned ethertype) indicating that crypto-offload has taken place, the SA with which it was done, and the authentication result. The ConnectX chip performs RX checksum offload on the packet, and RSS using the ESP SPI value. The driver detects the special ethertype, and attaches a struct secpath to the RX SKB, including flags to indicate that crypto offload took place, the authentication result, and which xfrm_state was used for decryption, in the olen and ovec members. The RX SKB may have useful CHECKSUM_COMPLETE. A separate patchset will add support for that in the xfrm stack. On transmit path, the stack encapsulates the packet but does not encrypt it, and indicates in the SKB's secpath that crypto offload is to be performed and the SA to use to do so. The driver avoids performing crypto-offload for ESP fragments, and packets with IP options, as the SBU cannot currently do that. For eligible packets, the driver prepends a special ethertype with metadata instructing the hardware to perform crypto offload. The stack builds regular (non-GSO) SKBs so that they contain a placeholder for the ESP trailer. The driver trims it off, because the SBU automatically appends the trailer for offloaded packets. The ConnectX chip performs TX checksum offload on inner UDP or TCP packets, and GSO for TCP packets (duplicating the prepended metadata). The segmented packets then undergo encryption in the SBU before going on the wire. 
Performance: We measure single stream of TCP on Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz Using AES-NI with ESP GSO we get constant 4.1 Gbps. Using crypto offload we get constant 18 Gbps. Note that these numbers require CHECKSUM_COMPLETE support in XFRM, which we submit separately. - Ilan Tayari ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-29 12:30:16 -04:00
Colin Ian King	46ccf725bf	net/mlx4: fix spelling mistake: "enforcment" -> "enforcement" Trivial fix to spelling mistake in mlx4_dbg debug message Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-29 12:25:01 -04:00
Ilan Tayari	164f16f702	net/mlx5e: IPSec, Add IPSec ethtool stats Add Innova IPSec SBU counters to the ethtool -S stats. Add IPSec offload error counters to the ethtool -S stats. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-06-27 16:36:48 +03:00
Ilan Tayari	2ac9cfe782	net/mlx5e: IPSec, Add Innova IPSec offload TX data path In the TX data path, prepend a special metadata ethertype which instructs the hardware to perform cryptography. In addition, fill Software-Parser segment in TX descriptor so that the hardware may parse the ESP protocol, and perform TX checksum offload on the inner payload. Support GSO, by providing the inverse of gso_size in the metadata. This allows the FPGA to update the ESP header (seqno and seqiv) on the resulting packets, by calculating the packet number within the GSO back from the TCP sequence number. Note that for GSO SKBs, the stack does not include an ESP trailer, unlike the non-GSO case. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-06-27 16:36:48 +03:00
Ilan Tayari	899a59d301	net/mlx5e: IPSec, Add Innova IPSec offload RX data path In RX data path, the hardware prepends a special metadata ethertype which indicates that the packet underwent decryption, and the result of the authentication check. Communicate this to the stack in skb->sp. Make wqe_size large enough to account for the injected metadata. Support only Linked-list RQ type. IPSec offload RX packets may have useful CHECKSUM_COMPLETE information, which the stack may not be able to use yet. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-06-27 16:36:47 +03:00
Ilan Tayari	547eede070	net/mlx5e: IPSec, Innova IPSec offload infrastructure Add Innova IPSec ESP crypto offload configuration paths. Detect Innova IPSec device and set the NETIF_F_HW_ESP flag. Configure Security Associations using the API introduced in a previous patch. Add Software-parser hardware descriptor layout Software-Parser (swp) is a hardware feature in ConnectX which allows the host software to specify protocol header offsets in the TX path, thus overriding the hardware parser. This is useful for protocols that the ASIC may not be able to parse on its own. Note that due to inline metadata, XDP is not supported in Innova IPSec. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-06-27 16:36:47 +03:00
Ilan Tayari	bebb23e6cb	net/mlx5: Accel, Add IPSec acceleration interface Add routines for manipulating the hardware IPSec SA database (SADB). In Innova IPSec, a Security Association (SA) is added or deleted via a command message over the SBU connection. The HW then sends a response message over the same connection. Add implementation for Innova IPSec (FPGA-based) hardware. These routines will be used by the IPSec offload support in a later patch However they may also be used by others such as RDMA and RoCE IPSec. mlx5/accel is a middle acceleration layer to allow mlx5e and other ULPs to work directly with mlx5_core rather than Innova FPGA or other mlx5 acceleration providers. In this patchset we add Innova IPSec support and mlx5/accel delegates IPSec offloads to Innova routines. In the future, when IPSec/TLS or any other acceleration gets integrated into ConnectX chip, mlx5/accel layer will provide the integrated acceleration, rather than the Innova one. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-06-27 16:36:47 +03:00
Ilan Tayari	a9956d35d1	net/mlx5: FPGA, Add SBU infrastructure Add interface to initialize and interact with Innova FPGA SBU connections. A client driver may use these functions to set up a high-speed DMA connection with its SBU hardware logic, and send/receive messages over this connection. A later patch in this patchset will make use of these functions for Innova IPSec offload in mlx5 Ethernet driver. Add commands to retrieve Innova FPGA SBU capabilities, and to read/write Innova FPGA configuration space registers and memory, over internal I2C. At high level, the FPGA configuration space is divided such: 0x00000000 - 0x007fffff is reserved for the SBU 0x00800000 - 0xffffffff is reserved for the Shell 0x400000000 - ... is DDR memory A later patchset will add support for accessing FPGA CrSpace and memory over a high-speed connection. This is the reason for the ACCESS_TYPE enumeration, which currently only supports I2C. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-06-27 16:36:47 +03:00
Ilan Tayari	c43051d72a	net/mlx5: FPGA, Add SBU bypass and reset flows The Innova FPGA includes shell hardware and Sandbox-Unit (SBU) hardware. The shell hardware is handled by mlx5_core itself, while the SBU is handled by a client driver. Reset the SBU to a well-known initial state when initializing a new device, and set the FPGA to bypass mode when uninitializing a device. This allows the client driver to assume that its device has been reset when a new device is detected. During SBU reset, the FPGA is put into SBU-bypass mode. In this mode packets do not pass through the SBU, so it cannot affect the network data stream at all. A factory-image does not have an SBU, so skip these flows. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-06-27 16:36:47 +03:00
Ilan Tayari	537a505741	net/mlx5: FPGA, Add high-speed connection routines An FPGA high-speed connection has two endpoints, an FPGA QP and a ConnectX QP. Add library routines to create and connect the endpoints of an FPGA high-speed connection. These routines allow creating and interacting with both types of connections: Shell and Sandbox Unit (SBU). Shell connection provides an interface to the FPGA's address space, which includes the configuration space and the DDR. Use of the shell connection will be introduced in a later patchset. SBU connection provides a command and/or data interface to the application-specific logic within the FPGA. Use of the SBU connection will be introduced in a later patch in this patchset. Some struct definitions are added to a new header file sdk.h, which will be extended in later patches in the patchset. This header file will contain the in-kernel FPGA client driver API. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-06-27 16:36:47 +03:00
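
The address-space split described in `a9956d35d1` ("net/mlx5: FPGA, Add SBU infrastructure") can be illustrated with a small standalone C sketch. It is not driver code: the enum, the macro names, and `fpga_classify_addr()` are hypothetical and exist only for this example. Only the three range boundaries come from the commit message. That message does not say what lies between 0x100000000 and 0x3ffffffff, so the sketch reports that gap as unspecified rather than guessing.

```c
#include <stdint.h>

/* Hypothetical region labels for this sketch; not taken from the mlx5 driver. */
enum fpga_region {
    FPGA_REGION_SBU,         /* 0x00000000 - 0x007fffff */
    FPGA_REGION_SHELL,       /* 0x00800000 - 0xffffffff */
    FPGA_REGION_DDR,         /* 0x400000000 and above   */
    FPGA_REGION_UNSPECIFIED, /* not described in the commit message */
};

/* Boundaries as quoted in the commit message. */
#define FPGA_SBU_END   UINT64_C(0x007fffff)
#define FPGA_SHELL_END UINT64_C(0xffffffff)
#define FPGA_DDR_BASE  UINT64_C(0x400000000)

/* Map a 64-bit FPGA address to the region that owns it. */
static enum fpga_region fpga_classify_addr(uint64_t addr)
{
    if (addr <= FPGA_SBU_END)
        return FPGA_REGION_SBU;
    if (addr <= FPGA_SHELL_END)
        return FPGA_REGION_SHELL;
    if (addr >= FPGA_DDR_BASE)
        return FPGA_REGION_DDR;
    /* Gap between the shell range and the DDR base. */
    return FPGA_REGION_UNSPECIFIED;
}
```

A caller would pass the target address of a configuration-space or memory access and branch on the returned region. The address has to be 64 bits wide because the DDR base does not fit in 32 bits.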

3176 Commits