linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Or Gerlitz	c415f704c8	net/mlx5: E-Switch, Correctly deal with inline mode on ConnectX-5 On ConnectX5 the wqe inline mode is "none" and hence the FW reports MLX5_CAP_INLINE_MODE_NOT_REQUIRED. Fix our devlink callbacks to deal with that on get and set. Also fix the tc flow parsing code not to fail anything when inline isn't required. Fixes: `bffaa91658` ('net/mlx5: E-Switch, Add control for inline mode') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-22 21:52:37 +03:00
Mohamad Haj Yahia	55378a238e	net/mlx5: Fix driver load bad flow when having fw initializing timeout If FW is stuck in initializing state we will skip the driver load, but current error handling flow doesn't clean previously allocated command interface resources. Fixes: `e3297246c2` ('net/mlx5_core: Wait for FW readiness on startup') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-22 21:52:37 +03:00
Stephen Hemminger	8bf3198a5e	mlx5: fix warning about missing prototype Fix sparse warning about missing prototypes. The rx/tx code path defines functions with prototypes in ipoib.h. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-22 20:26:42 +03:00
Stephen Hemminger	a7082ef066	mlx5: hide unused functions Fix sparse warnings in recent ipoib support. The RDMA functions are not used yet, hide behind #ifdef. Based on comment, they will eventually be local so make static. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-22 20:26:41 +03:00
Roi Dayan	7768d1971d	net/mlx5: E-Switch, Add control for encapsulation Implement the devlink e-switch encapsulation control set and get callbacks. Apply the value set by the user on the switchdev offloads mode when creating the fast FDB table where offloaded rules will be set. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-22 20:26:40 +03:00
Or Gerlitz	1967ce6ea5	net/mlx5: E-Switch, Refactor fast path FDB table creation in switchdev mode Refactor the creation of the fast path FDB table that holds the offloaded rules in SRIOV switchdev mode into it's own function. This will be used in the next patch to be able and re-create the table under different settings without going through legacy mode. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-22 20:26:38 +03:00
Dan Carpenter	6905e5a5c8	net/mlx5e: IPoIB, Fix error handling in mlx5_rdma_netdev_alloc() The labels were out of order, so it either could result in an Oops or a leak. Fixes: `48935bbb7a` ("net/mlx5e: IPoIB, Add netdevice profile skeleton") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-20 16:26:10 -04:00
Jiri Pirko	9d41accc1e	mlxsw: spectrum: Add FID miss trap When there is no FID set for a specific packet, the HW will drop it. However, by default these packets are useful to be delivered to CPU as it can inspect them and program HW accordingly. So add this trap. This would only ever happen when port is enslaved to an OVS master. Otherwise, packets would be dropped during VLAN / STP filtering, before FID classification. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-20 15:32:31 -04:00
Jiri Pirko	2b94e58df5	mlxsw: spectrum: Allow ports to work under OVS master >From now on, a port can become a slave of OVS master. All vlans are enabled, STP state is set to "forwarding". It is up to the OVS userspace daemon to setup the flows either in kernel or in HW using TC flower offload. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-20 15:32:31 -04:00
Jiri Pirko	93cd081305	mlxsw: spectrum: Teach mlxsw_sp_port_vlan_set to accept any vlan range So far, mlxsw_sp_port_vlan_set range is limited by MLXSW_REG_SPVM_REC_MAX_COUNT. In spectrum_switchdev code this is wrapped up by a helper function which actually does multiple calls to FW for bigger ranges. Move the code into mlxsw_sp_port_vlan_set and use it always. That allows caller not to care about the range. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-20 15:32:30 -04:00
Jiri Pirko	cedbb8b259	mlxsw: spectrum_flower: Set dummy FID before forward action HW requires the FID to be valid in order for the forward action to work. So regardless of the current FID validity, just set the dummy FID which would do the trick. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-20 15:32:30 -04:00
Jiri Pirko	202d6f423c	mlxsw: spectrum: Add dummy FID initialization For forwarding using ACL action, HW needs a valid FID to be setup. It does not actually use it, so it can be any valid FID. So create a dummy FID only for this purpose. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-20 15:32:30 -04:00
Jiri Pirko	ac44dd43d8	mlxsw: spectrum: Implement action to set FID Implement part of multipurpose Virtual Router and Forwarding Domain Action that takes care of setting up FID. We need to use it to be able to forward packets using ACL action when no FID is associated on RX. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-20 15:32:30 -04:00
Jiri Pirko	b51df799f6	mlxsw: spectrum: Fix indent in mlxsw_sp_netdevice_port_upper_event Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-20 15:32:29 -04:00
Greg Thelen	8dc7d11f1d	net/mlx4: suppress 'may be used uninitialized' warning gcc 4.8.4 complains that mlx4_SW2HW_MPT_wrapper() uses an uninitialized 'mpt' variable: drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_SW2HW_MPT_wrapper': drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:2802:12: warning: 'mpt' may be used uninitialized in this function [-Wmaybe-uninitialized] mpt->mtt = mtt; I think this warning is a false complaint. mpt is only used when mr_res_start_move_to() return zero, and in all such cases it initializes mpt. But apparently gcc cannot see that. Initialize mpt to avoid the warning. Signed-off-by: Greg Thelen <gthelen@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-20 13:19:21 -04:00
Saeed Mahameed	955bc48081	net/mlx5e: E-switch vport manager is valid for ethernet only Currently the driver support only ethernet eswitch, and we want to protect downstream IPoIB netdev from trying to access it in IB link. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:32 -04:00
Saeed Mahameed	9d6bd752c6	net/mlx5e: IPoIB, RX handler Implement IPoIB RX SKB handler. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:32 -04:00
Saeed Mahameed	20fd0c193f	net/mlx5e: RX handlers per netdev profile In order to have different RX handler per profile, fix and refactor the current code to take the rx handler directly from the netdevice profile rather than computing it on runtime as it was done with the switchdev mode representor rx handler. This will also remove the current wrong assumption in mlx5e_alloc_rq code that mlx5e_priv->ppriv is of the type vport_rep. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:31 -04:00
Saeed Mahameed	258545449b	net/mlx5e: IPoIB, Xmit flow Implement mlx5e's IPoIB SKB transmit using the helper functions provided by mlx5e ethernet tx flow, the only difference in the code between mlx5e_xmit and mlx5i_xmit is that IPoIB has some extra fields to fill (UD datagram segment) in the TX descriptor (WQE) and it doesn't need to have any vlan handling. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:31 -04:00
Saeed Mahameed	77bdf8950b	net/mlx5e: Xmit flow break down Break current mlx5e xmit flow into smaller blocks (helper functions) in order to reuse them for IPoIB SKB transmission. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:31 -04:00
Saeed Mahameed	ec8fd927b7	net/mlx5e: IPoIB, Underlay QP Create IPoIB underlay QP needed by the IPoIB netdevice profile for RSS and TX HW context to perform on IPoIB traffic. Reset the underlay QP on dev_uninit ndo to stop IPoIB traffic going through this QP when the ULP IPoIB decides to cleanup. Implement attach/detach mcast RDMA netdev callbacks for later RDMA netdev use. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:31 -04:00
Saeed Mahameed	603f4a4521	net/mlx5e: IPoIB, Basic netdev ndos open/close Implement open/close of IPoIB netdevice ndos using mlx5e's channels API to manage data path resources (RQs/SQs/CQs). Set IPoIB netdev address on dev_init ndo. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:30 -04:00
Saeed Mahameed	5426a0b274	net/mlx5e: IPoIB, TX TIS creation Modify mlx5e tis creation function to accept underlay qp number, which will be needed by IPoIB. Implement mlx5i (IPoIB) tx init/cleanup netdevice profile flows to create one TIS with the IPoIB underlay qp, for IPoIB TX SQs. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:30 -04:00
Saeed Mahameed	bc81b9d326	net/mlx5e: IPoIB, RSS flow steering tables Like the mlx5e ethernet mode, on IPoIB mode we need to create RX steering tables, but IPoIB do not require MAC and VLAN steering tables so the only tables we create in here are: 1. TTC Table (Traffic Type Classifier table for RSS steering) 2. ARFS Table (for accelerated RFS support) Creation of those tables is identical to mlx5e ethernet mode, hence the use of mlx5e_create_ttc_table and mlx5e_arfs_create_tables. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:30 -04:00
Saeed Mahameed	8f493ffd88	net/mlx5e: IPoIB, RX steering RSS RQTs and TIRs Implement IPoIB RX RSS (RQTs and TIRs) HW objects creation, All we do here is simply reuse the mlx5e implementation to create direct and indirect (RSS) steering HW objects. For that we just expose mlx5e_{create,destroy}_{direct,indirect}_{rqt,tir} functions into en.h and call them from ipoib.c in init/cleanup_rx IPoIB netdevice profile callbacks. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:30 -04:00
Saeed Mahameed	48935bbb7a	net/mlx5e: IPoIB, Add netdevice profile skeleton Create mlx5e IPoIB netdevice profile skeleton in the new ipoib.c file with empty implementation. Downstream patches will provide the full mlx5 rdma netdevice acceleration support for IPoIB into this new file, by using the mlx5e netdevice profile and new mlx5_channels APIs and infrastructures. Same as already done in mlx5e NIC netdevice and switchdev mode VF representors. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:30 -04:00
Saeed Mahameed	2c3b5beec4	net/mlx5e: More generic netdev management API In preparation for mlx5e RDMA net_device support, here we generalize mlx5e_attach/detach in a way that those functions will be agnostic to link type. For that we move ethernet specific NIC net device logic out of those functions into {nic,rep}_{enable/disable} mlx5e NIC and representor profiles callbacks. Also some of the logic was moved only to NIC profile since it is not right to have this logic for representor net device (e.g. set port MTU). Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:29 -04:00
Erez Shitrit	ffdb8827ec	net/mlx5: Enable flow-steering for IB link Get the relevant capabilities if supports ipoib_enhanced_offloads and init the flow steering table accordingly. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:29 -04:00
Erez Shitrit	b3ba51498b	net/mlx5: Refactor create flow table method to accept underlay QP IB flow tables need the underlay qp to perform flow steering. Here we change the API of the flow tables creation to accept the underlay QP number as a parameter in order to support IB (IPoIB) flow steering. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-17 11:08:29 -04:00
Christoph Hellwig	3680b1f655	mlxsw: convert to pci_alloc_irq_vectors Trivial conversion as only one vector is supported, but at least we lose the useless msix_entry member in the per-device structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-11 11:16:03 -04:00
Saeed Mahameed	457fcd8a11	net/mlx5e: Set default RX moderation parameters on driver load RX moderation default parameters shouldn't be set in mlx5e_build_rx_cq_param since it would reset the values every time on netdev open/close. Instead, it should be set in mlx5e_set_rx_cq_mode_params which is called on driver load only. Fixes: `6a9764efb2` ("net/mlx5e: Isolate open_channels from priv->params") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-07 01:21:28 +03:00
Eran Ben Elisha	95b6c6a519	net/mlx5e: Reuse alloc cq code for all CQs allocation Reuse the code for mlx5e_alloc_cq and mlx5e_alloc_drop_cq, as they have a similar flow. Prior to this patch, the CQEs in the "drop CQ" were not initialized, fixed it with the shared flow of alloc CQ. This is not a critical bug as the RQ connected to this CQ never moved to RTS, but still better to have this right. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-07 01:21:27 +03:00
Inbar Karmy	84e11edb71	net/mlx5e: Show board id in ethtool driver information Add the board id (PSID) to the firmware-version field in the ethtool -i (driver information). The PSID is shown in parentheses, next to the fw-version. $ ethtool -i ens6 firmware-version: 12.14.1101 (MT_2190110032) Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-07 01:21:26 +03:00
Eran Ben Elisha	6543b78ea1	net/mlx5e: Change FW sub_minor display to 4 zeros padding FW version should be reported as X.Y.ZZZZ, add leading zeroes to sub minor in order to fix it. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-07 01:21:25 +03:00
Guy Ergas	f6d96a2092	net/mlx5e: Make mlx5e_modify_rqs_vsd a static function Make mlx5e_modify_rqs_vsd a static function and remove from en.h in order to reduce redundant exposure of functions. Signed-off-by: Guy Ergas <guye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-07 01:21:00 +03:00
Guy Ergas	102722fc68	net/mlx5e: Add support for RXFCS feature flag Add support for rx-fcs flag from ethtool. In case this flag is set, update all RQs to scatter the FCS data into the packet. Signed-off-by: Guy Ergas <guye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-07 01:20:59 +03:00
Majd Dibbiny	d0dd989f97	net/mlx5: Update the list of the PCI supported devices Rename the ConnectX-5 PCIe 4.0 to be ConnectX-5 Ex. Also add the upcoming ConnectX-6 and it's VF IDs to the list. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-04-07 01:20:58 +03:00
Eric Dumazet	75d04aa368	mlx4: trust shinfo->gso_segs mlx4 is the only driver in the tree making a point to recompute shinfo->gso_segs. Lets remove superfluous code. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-06 13:27:49 -07:00
Tobias Regnery	053ee0a703	net/mlx5e: fix build error without CONFIG_SYSFS Commit `9008ae0748` ("net/mlx5e: Minimize mlx5e_{open/close}_locked") copied the calls to netif_set_real_num_{tx,rx}_queues from mlx5e_open_locked to mlx5e_activate_priv_channels and wraps them in an if condition to test for netdev->real_num_{tx,rx}_queues. But netdev->real_num_rx_queues is conditionally compiled in if CONFIG_SYSFS is set. Without CONFIG_SYSFS the build fails: drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_activate_priv_channels': drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2515:12: error: 'struct net_device' has no member named 'real_num_rx_queues'; did you mean 'real_num_tx_queues'? Fix this by unconditionally call netif_set_real_num{tx,rx}_queues like before commit `9008ae0748`. Fixes: `9008ae0748` ("net/mlx5e: Minimize mlx5e_{open/close}_locked") Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-06 12:55:36 -07:00
David S. Miller	6f14f443d3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Mostly simple cases of overlapping changes (adding code nearby, a function whose name changes, for example). Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-06 08:24:51 -07:00
Andrew Morton	e270e96686	drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c: fix build with gcc-4.4.4 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c: In function 'mlx5e_set_rxfh': drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1067: error: unknown field 'rss' specified in initializer drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1067: warning: missing braces around initializer drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1067: warning: (near initialization for 'rrp.<anonymous>') drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1068: error: unknown field 'rss' specified in initializer drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1069: warning: excess elements in struct initializer drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1069: warning: (near initialization for 'rrp') gcc-4.4.4 has issues with anonymous union initializers. Work around this. Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-02 19:32:57 -07:00
Andrew Morton	956327913c	drivers/net/ethernet/mellanox/mlx5/core/en_main.c: fix build with gcc-4.4.4 drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_redirect_rqts': drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2210: error: unknown field 'rqn' specified in initializer drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2211: warning: missing braces around initializer drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2211: warning: (near initialization for 'direct_rrp.<anonymous>') drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_redirect_rqts_to_channels': drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: error: unknown field 'rss' specified in initializer drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: warning: missing braces around initializer drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: warning: (near initialization for 'rrp.<anonymous>') drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: warning: initialization makes integer from pointer without a cast drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2228: error: unknown field 'rss' specified in initializer drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2229: warning: excess elements in struct initializer drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2229: warning: (near initialization for 'rrp') drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_redirect_rqts_to_drop': drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2238: error: unknown field 'rqn' specified in initializer drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2239: warning: missing braces around initializer drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2239: warning: (near initialization for 'drop_rrp.<anonymous>') gcc-4.4.4 has issues with anonymous union initializers. Work around this. Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-02 19:32:57 -07:00
David S. Miller	6c2257062e	mlx5e-pedit 2017-03-28 Or Gerlitz says: This series adds support for offloading modifications of packet headers using ConnectX-5 HW header re-write as an action applied during packet steering. The offloaded SW mechanism is TC's pedit action. The offloading is supported for E-Switch steering of VF traffic in the SRIOV switchdev mode and for NIC (non eswitch) RX. One use-case for this offload on virtual networks, is when the hypervisor implements flow based router such as Open-Stack's DVR, where L2 headers of guest packets re-written with routers' MAC addresses and the IP TTL is decremented. Another use case (which can be applied in parallel with routing) is stateless NAT where guest L3/L4 headers are re-written. The series is built as follows: the 1st six patches are preperations which don't yet add new functionality, patches 7-8 add the FW APIs (data-structures and commands) for header re-write, and patch nine allows offloading driver to access pedit keys. The 10th patch is somehow the core of the series, where we translate from the pedit way to represent set of header modification elements to the FW API for that same matter. Once a set of HW modification is established, we register it with the FW and get a modify header ID. When this ID is used with an action during packet steering, the HW applies the header modification on the packet. Patches 11 and 12 implement the above logic as an offload for pedit action for the NIC and E-Switch use-cases. I'd like to thanks Elijah Shakkour <elijahs@mellanox.com> for implementing and helping me testing this functionality on HW simulator, before it could be done with FW. - Or. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJY2lwEAAoJEEg/ir3gV/o+rFIH+wdwGawEjoDhpihLqJHoRtwo Wvy88Lczj++Pfzt9E0kgwgmOdnj7j+GVOh6ALjneE3PDBJEFWG/GWY5aRYonlhhf zibafMTYf+8Dmm9qHW/C4OvhQowSrkG1RDucM2eyjXJfnAShZCh7dV4CDD7paxhu N2rlDdSEl0Im4aPCNHzyrdGg06Fy3A0DQkDvVLIQhKV0cLPIoC0U/i+ymVtsCUY/ sSEEuSohvwdD5Ga5ZZdKicCo61lIRSi2rX5v4sK0exhAO3S8xyrKnwbiN7nVAQqg eVZ/ekbBiksD8MRMKctt/zGxd0X4PDaQ8J9XyF9CL6pRC5VipsDy+P/GEhj/x8U= =l2Qo -----END PGP SIGNATURE----- Merge tag 'mlx5e-pedit' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Or Gerlitz says: ==================== mlx5e-pedit 2017-03-28 This series adds support for offloading modifications of packet headers using ConnectX-5 HW header re-write as an action applied during packet steering. The offloaded SW mechanism is TC's pedit action. The offloading is supported for E-Switch steering of VF traffic in the SRIOV switchdev mode and for NIC (non eswitch) RX. One use-case for this offload on virtual networks, is when the hypervisor implements flow based router such as Open-Stack's DVR, where L2 headers of guest packets re-written with routers' MAC addresses and the IP TTL is decremented. Another use case (which can be applied in parallel with routing) is stateless NAT where guest L3/L4 headers are re-written. The series is built as follows: the 1st six patches are preperations which don't yet add new functionality, patches 7-8 add the FW APIs (data-structures and commands) for header re-write, and patch nine allows offloading driver to access pedit keys. The 10th patch is somehow the core of the series, where we translate from the pedit way to represent set of header modification elements to the FW API for that same matter. Once a set of HW modification is established, we register it with the FW and get a modify header ID. When this ID is used with an action during packet steering, the HW applies the header modification on the packet. Patches 11 and 12 implement the above logic as an offload for pedit action for the NIC and E-Switch use-cases. I'd like to thanks Elijah Shakkour <elijahs@mellanox.com> for implementing and helping me testing this functionality on HW simulator, before it could be done with FW. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-29 11:24:21 -07:00
Talat Batheesh	e497ec680c	net/mlx5: Avoid dereferencing uninitialized pointer In NETDEV_CHANGEUPPER event the upper_info field is valid only when linking is true. Otherwise it should be ignored. Fixes: `7907f23adc` (net/mlx5: Implement RoCE LAG feature) Signed-off-by: Talat Batheesh <talatb@mellanox.com> Reviewed-by: Aviv Heller <avivh@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-28 18:07:15 -07:00
Arkadi Sharshevsky	2ba5999f00	mlxsw: spectrum: Add Support for erif table entries access Implement dpipe's table ops for erif table which provide: 1. Getting the entries in the table with the associate values. - match on "mlxsw_meta:erif_index" - action on "mlxsw_meta:forwared_out" 2. Synchronize the hardware in case of enabling/disabling counters which mean removing erif counters from all interfaces. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-28 17:11:55 -07:00
Arkadi Sharshevsky	fd1b9d4192	mlxsw: spectrum_router: Add rif helper functions Add rif helper function to access the rif index and rif devices ifindex. This functions will be used by dpipe in order to dump the rif table. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-28 17:11:55 -07:00
Arkadi Sharshevsky	e0c0afd8aa	mlxsw: spectrum: Support for counters on router interfaces Add support for counter allocation on router interfaces. The allocation depends on the counter state of relevant table. In case the counting is disabled or no counters left the counter index will be set as invalid. Also a counter pool for router allocation is added. Signed-off-by: Arakdi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-28 17:11:55 -07:00
Arkadi Sharshevsky	ba73e97a63	mlxsw: reg: Add Router Interface Counter Register The RICNT register retrieves per port performance counter. It will be used to query the router interfaces statistics. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-28 17:11:55 -07:00
Arkadi Sharshevsky	d54b70feb6	mlxsw: spectrum: Add definition for egress rif table Add definition for egress router interface table. This table describes the final part in the routing pipeline. This table matches the egress interface index (rif index, which is set by the previous stages and determine the out port) and makes the decision of forwarding the packet towards the L2 logic or dropping it. The metadata header is added to represent this internal information. The rif index field is mapped logically to netdevice ifindex. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-28 17:11:54 -07:00
Arkadi Sharshevsky	230ead0141	mlxsw: spectrum: Add placeholder for dpipe Add placeholder for dpipe. Support for specific tables and headers will be introduced in following patches. The headers are shared between all mlxsw_sp instances. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-28 17:11:54 -07:00
Arkadi Sharshevsky	0f630fcbe5	mlxsw: reg: Add counter fields to RITR register Update RITR for counter support. This allows adding counters for ASIC's router ports. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-28 17:11:54 -07:00
Or Gerlitz	d7e75a325c	net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) actions This includes calling the parsing code that translates from pedit speak to the HW API, allocation (deallocation) of a modify header context and setting the modify header id associated with this context to the FTE of that flow. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-03-28 15:34:10 +03:00
Or Gerlitz	2f4fe4cab0	net/mlx5e: Add offloading of NIC TC pedit (header re-write) actions This includes calling the parsing code that translates from pedit speak to the HW API, allocation (deallocation) of a modify header context and setting the modify header id associated with this context to the FTE of that flow. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-03-28 15:34:08 +03:00
Or Gerlitz	d79b6df6b1	net/mlx5e: Add parsing of TC pedit actions to HW format Parse/translate a set of TC pedit actions to be formed in the HW API format. User-space provides set of keys where each one of them is made of: command (add or set), header-type, byte offset within that header along with a 32 bit mask and value. The mask dictates what bits in the 32 bit word that starts on the offset we should be dealing with, but under negative polarity (unset bits are to be modified). We do a 1st pass over the set of keys while using the header-type and offset to fill the masks and the values into a data-structure containting all the supported network headers. We then do a 2nd pass over the set of fields to re-write supported by the HW, where for each such candidate field, we use the masks filled on the 1st pass to realize if we should offloading re-write it. In case offloading is required, we fill a HW descriptor with the following: (1) the header field to modify (2) the bit offset within the field from where to modify (set command only) (3) the value to set/add (4) the length in bits 1...32 to modify (set command only) Note that it's possible for a given pedit mask to dictate modifying the same header field multiple times or to modify multiple header fields. Currently such combinations are not supported for offloading, hence, for set commands, the offset within the field is always zero, and the length to modify is the field size. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Amir Vadai <amir@vadai.me> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-03-28 15:34:07 +03:00
Or Gerlitz	2de24fedb9	net/mlx5: Introduce alloc/dealloc modify header context commands Implement the low-level commands to support packet header re-write. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-03-28 15:34:05 +03:00
Or Gerlitz	2a69cb9ff7	net/mlx5: Introduce modify header structures, commands and steering action definitions Add the definitions related to creation/deletion of a modify header context and the modify header steering action which are used for HW packet header modify (re-write) as part of steering. Add as well the modify header id into two intermediate structs and set it to the FTE. Note that as the push/pop vlan steering actions are emulated by the ewitch management code, we're not breaking any compatibility while changing their values to make room for the modify header action which is not emulated and whose value is part of the FW API. The new bit values for the emulated actions are at the end of the possible range. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-03-28 15:34:04 +03:00
Or Gerlitz	a750276f81	net/mlx5: Reorder few command cases to reflect their natural order Move the commands related to scheduling elements and vport qos to a suitable location (according to the MLX5_CMD_OP enum values) in the command string and internal error helpers. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-03-28 15:34:03 +03:00
Or Gerlitz	e753b2b50d	net/mlx5: Add helper to initialize a flow steering actions struct instance There are bunch of places in the code where the intermediate struct that keeps the elements related to flow actions is initialized with the same default values. Put that into a small DECLARE type helper. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-03-28 15:34:01 +03:00
Or Gerlitz	aa0cbbae5d	net/mlx5e: Properly deal with resource cleanup when adding TC flow fails The code for adding tc fdb flows leaves things half set when it fails in the middle. Currently we are not leaking things (e.g eswitch vlan reference, encap reference and HW resources) since the main code to add flower rules does a cleanup by calling mlx5e_tc_del_flow(). This cleanup further works just b/c we're checking there if the HW rule for the flow we are attempting to delete is valid before touching it, and since under the current possible combinations of supported actions it's okay to go and blidnly deref or delete all the action related resources (encap, vlan). Instead, do things properly, namely make sure that if add flow fails we clean all what was allocated or referenced. Now, the flow delete code can blindly deref/deallocate both the rule and the actions related resources and when more action combinations are introduced (such as the upcoming header re-write) we are fine with clear and robust code. While here, align all of nic/fdb parse actions/add flow functions to get mlx5e_tc_flow struct param and pick the attributes or whatever else needed from there. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-03-28 15:34:00 +03:00
Or Gerlitz	17091853fc	net/mlx5e: Add intermediate struct for TC flow parsing attributes Add intermediate structure to store attributes parsed from TC filter matching/actions parts which are soon to be configured into the HW. Currently put there the flow matching spec after being parsed. More content to be added in down-stream patch. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-03-28 15:33:59 +03:00
Or Gerlitz	3bc4b7bfa0	net/mlx5e: Add NIC attributes for offloaded TC flows Add structure that contains the attributes related to offloaded NIC flows. Currently it has the actions and flow tag. While here, do xmas tree cleanup of the TC configure function. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-03-28 15:33:58 +03:00
Or Gerlitz	ecf5bb796b	net/mlx5e: Add prefix for e-switch offloaded TC flow attributes Add esw_ prefix to the flow attributes attached to offloaded e-switch TC flows. This is a pre-step to add attributes to offloaded NIC TC flows. Also, save one pointer space by using gcc's zero size array, this would be beneficial for environments where 100Ks (or Ms) of flows are offloaded. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-03-28 15:33:57 +03:00
Saeed Mahameed	2e20a15120	net/mlx5e: Fail safe mtu and lro setting Use the new fail-safe channels switch mechanism to set new netdev mtu and lro settings. MTU and lro settings demand some HW configuration changes after new channels are created and ready for action. In order to unify switch channels routine for LRO and MTU changes, and maybe future configuration features, we now pass to it a modify HW function pointer to be invoked directly after old channels are de-activated and before new channels are activated. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:24 +03:00
Saeed Mahameed	6f9485af40	net/mlx5e: Fail safe tc setup Use the new fail-safe channels switch mechanism to set up new tc parameters. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:23 +03:00
Saeed Mahameed	be7e87f92b	net/mlx5e: Fail safe cqe compressing/moderation mode setting Use the new fail-safe channels switch mechanism to set new CQE compressing and CQE moderation mode settings. We also move RX CQE compression modify function out of en_rx file to a more appropriate place. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:22 +03:00
Saeed Mahameed	546f18ed3f	net/mlx5e: Fail safe ethtool settings Use the new fail-safe channels switch mechanism to set new ethtool settings: - ring parameters - coalesce parameters - tx copy break parameters Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:21 +03:00
Saeed Mahameed	55c2503dae	net/mlx5e: Introduce switch channels A fail safe helper functions that allows switching to new channels on the fly, In simple words: make_new_config(new_params) { new_channels = open_channels(new_params); if (!new_channels) return "Failed, but current channels are still active :)" switch_channels(new_channels); return "SUCCESS"; } Demonstrate mlx5e_switch_priv_channels usage in set channels ethtool callback and make it fail-safe using the new switch channels mechanism. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:20 +03:00
Saeed Mahameed	9008ae0748	net/mlx5e: Minimize mlx5e_{open/close}_locked mlx5e_redirect_rqts_to_{channels,drop} and mlx5e_{add,del}_sqs_fwd_rules and Set real num tx/rx queues belong to mlx5e_{activate,deactivate}_priv_channels, for that we move those functions and minimize mlx5e_open/close flows. This will be needed in downstream patches to replace old channels with new ones without the need to call mlx5e_close/open. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:19 +03:00
Saeed Mahameed	a43b25daef	net/mlx5e: CQ and RQ don't need priv pointer Remove mlx5e_priv pointer from CQ and RQ structs, it was needed only to access mdev pointer from priv pointer. Instead we now pass mdev where needed. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:18 +03:00
Saeed Mahameed	6a9764efb2	net/mlx5e: Isolate open_channels from priv->params In order to have a clean separation between channels resources creation flows and current active mlx5e netdev parameters, make sure each resource creation function do not access priv->params, and only works with on a new fresh set of parameters. For this we add "new" mlx5e_params field to mlx5e_channels structure and use it down the road to mlx5e_open_{cq,rq,sq} and so on. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:18 +03:00
Saeed Mahameed	acc6c5953a	net/mlx5e: Split open/close channels to stages As a foundation for safe config flow, a simple clear API such as (Open then Activate) where the "Open" handles the heavy unsafe creation operation and the "activate" will be fast and fail safe, to enable the newly created channels. For this we split the RQs/TXQ SQs and channels open/close flows to open => activate, deactivate => close. This will simplify the ability to have fail safe configuration changes in downstream patches as follows: make_new_config(new_params) { old_channels = current_active_channels; new_channels = create_channels(new_params); if (!new_channels) return "Failed, but current channels still active :)" deactivate_channels(old_channels); /* Can't fail / activate_channels(new_channels); / Can't fail */ close_channels(old_channels); current_active_channels = new_channels; return "SUCCESS"; } Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:17 +03:00
Saeed Mahameed	b676f65389	net/mlx5e: Refactor refresh TIRs Rename mlx5e_refresh_tirs_self_loopback to mlx5e_refresh_tirs, as it will be used in downstream (Safe config flow) patches, and make it fail safe on mlx5e_open. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:16 +03:00
Saeed Mahameed	a5f97fee74	net/mlx5e: Redirect RQT refactoring RQ Tables are always created once (on netdev creation) pointing to drop RQ and at that stage, RQ tables (indirection tables) are always directed to drop RQ. We don't need to use mlx5e_fill_{direct,indir}_rqt_rqns to fill the drop RQ in create RQT procedure. Instead of having separate flows to redirect direct and indirect RQ Tables to the current active channels Receive Queues (RQs), we unify the two flows by introducing mlx5e_redirect_rqt function and redirect_rqt_param struct. Combined, they provide one generic logic to fill the RQ table RQ numbers regardless of the RQ table purpose (direct/indirect). Demonstrated the usage with mlx5e_redirect_rqts_to_channels which will be called on mlx5e_open and with mlx5e_redirect_rqts_to_drop which will be called on mlx5e_close. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:15 +03:00
Saeed Mahameed	ff9c852f91	net/mlx5e: Introduce mlx5e_channels Have a dedicated "channels" handler that will serve as channels (RQs/SQs/etc..) holder to help with separating channels/parameters operations, for the downstream fail-safe configuration flow, where we will create a new instance of mlx5e_channels with the new requested parameters and switch to the new channels on the fly. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:14 +03:00
Saeed Mahameed	be4891af77	net/mlx5e: Set netdev->rx_cpu_rmap on netdev creation To simplify mlx5e_open_locked flow we set netdev->rx_cpu_rmap on netdev creation rather on netdev open, it is redundant to set it every time on mlx5e_open_locked. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:13 +03:00
Saeed Mahameed	7f859ecfa8	net/mlx5e: Set SQ max rate on mlx5e_open_txqsq rather on open_channel Instead of iterating over the channel SQs to set their max rate, do it on SQ creation per TXQ SQ. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-03-27 15:08:12 +03:00
Arkadi Sharshevsky	1312444374	mlxsw: spectrum_kvdl: Cosmetic kvdl allocator API change Currently the return allocated index and err value are multiplexed. This patch changes the API to decouple the ret value from the allocated index. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-25 19:56:15 -07:00
Saeed Mahameed	3139104861	net/mlx5e: Different SQ types Different SQ types (tx, xdp, ico) are growing apart, we separate them and remove unwanted parts in each one of them, to simplify data path and utilize data cache. Remove DB union from SQ structures since it is not needed anymore as we now have different SQ data type for each SQ. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:46 -07:00
Saeed Mahameed	33ad971186	net/mlx5e: Generalize SQ create/modify/destroy functions In the next patches we will introduce different SQ types, and we would want to reuse those functions, in this patch we make them agnostic to SQ type (txq, xdp, ico). Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:46 -07:00
Saeed Mahameed	3b77235b94	net/mlx5e: Proper names for SQ/RQ/CQ functions Rename mlx5e_{create,destroy}_{sq,rq,cq} to mlx5e_{alloc,free}_{sq,rq,cq}. Rename mlx5e_{enable,disable}_{sq,rq,cq} to mlx5e_{create,destroy}_{sq,rq,cq}. mlx5e_{enable,disable}_{sq,rq,cq} used to actually create/destroy the SQ in FW, so we rename them to align the functions names with FW semantics. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:46 -07:00
Saeed Mahameed	864b2d7153	net/mlx5e: Generalize tx helper functions for different SQ types In the next patches we will introduce different SQ types, for that we here generalize some TX helper functions to work with more basic SQ parameters, in order to re-use them for the different SQ types. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:46 -07:00
Saeed Mahameed	2239185ccd	net/mlx5e: Optimize XDP frame xmit XDP SQ has a fixed size WQE (MLX5E_XDP_TX_WQEBBS = 1) and only posts one kind of WQE (MLX5_OPCODE_SEND), Also we initialize SQ descriptors static fields once on open_xdpsq, rather than every time on critical path. Optimize the code in light of those facts and add a prefetch of the TX descriptor first thing in the xdp xmit function. Performance improvement: System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz Test case Before Now improvement --------------------------------------------------------------- XDP TX (1 core) 13Mpps 13.7Mpps 5% Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:46 -07:00
Saeed Mahameed	39e12351a3	net/mlx5e: Poll XDP TX CQ before RX CQ Handle XDP TX completions before handling RX packets, to make sure more free space is available for XDP TX packets a moment before handling RX packets. Performance improvement: System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz Test case Before Now improvement --------------------------------------------------------------- XDP Drop (1 core) 16.9Mpps 16.9Mpps No change XDP TX (1 core) 12Mpps 13Mpps 8% Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:46 -07:00
Saeed Mahameed	31871f87bb	net/mlx5e: Move XDP SQ instance into RQ To save many rq->channel->sq dereferences in fast-path. And rename it to xdpsq. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:45 -07:00
Saeed Mahameed	eba2db2bd2	net/mlx5e: Move mlx5e_rq struct declaration Move struct mlx5e_rq and friends to appear after mlx5e_sq declaration in en.h. We will need this for next patch to move the mlx5e_sq instance into mlx5e_rq struct for XDP SQs. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:45 -07:00
Saeed Mahameed	1c4bf94045	net/mlx5e: Move XDP completion functions to rx file XDP code belongs to RX path, move mlx5e_poll_xdp_tx_cq and mlx5e_free_xdp_tx_descs to en_rx.c. Rename them to mlx5e_poll_xdpsq_cq and mlx5e_free_xdpsq_descs. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:45 -07:00
Saeed Mahameed	aff2615763	net/mlx5e: Single bfreg (UAR) for all mlx5e SQs and netdevs One is sufficient since Blue Flame is not supported anymore. This will also come in handy for switchdev mode to save resources, since VF representors will use same single UAR as well for their own SQs. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:45 -07:00
Saeed Mahameed	6982ab6097	net/mlx5e: Xmit, no write combining mlx5e netdev Blue Flame (write combining) support demands a lot of overhead for a little latency gain for some special cases, this overhead is hurting the common case. Here we remove xmit Blue Flame support by creating all bfregs with no write combining for all SQs, and we remove a lot of BF logic and conditions from xmit data path. Simplify mlx5e_tx_notify_hw (doorbell function) by removing BF related code and by removing one memory barrier needed for WC mapped SQ doorbell buffers, which no longer exist. Performance improvement: System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz Test case Before Now improvement --------------------------------------------------------------- TX packets (24 threads) 50Mpps 54Mpps 8% Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:45 -07:00
Saeed Mahameed	80fe326ab8	net/mlx5e: Use dma_rmb rather than rmb in CQE fetch routine Use dma_rmb in mlx5e_get_cqe rather than aggressive rmb (at least on some architectures), this should help improve the performance on such CPU archs where dma_rmb is optimized. Performance improvement: System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz Test case Baseline Now improvement --------------------------------------------------------------- TX packets (24 threads) 45Mpps 50Mpps 11% TC stack Drop (1 core) 3.45Mpps 3.6Mpps 5% XDP Drop (1 core) 14Mpps 16.9Mpps 20% XDP TX (1 core) 10.4Mpps 12Mpps 15% Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 19:11:44 -07:00
Ido Schimmel	18281f2dab	mlxsw: spectrum: Query cell size from firmware As explained in the previous patch, the cell size may change in future devices, so query it from the firmware instead of hard coding it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:29 -07:00
Ido Schimmel	f417f04da5	mlxsw: spectrum: Refactor port buffer configuration The sizes and thresholds of the priority group (PG) buffers are configured in cells, which represent a specific amount of bytes. The cell size can vary in different devices, so it's better to query it from the firmware than hard coding it. Refactor the code dealing with this value into different functions, so that it will be easier to make the conversion in the next patch. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:29 -07:00
Ido Schimmel	d3daae1b08	mlxsw: spectrum_buffers: Query shared buffer size from firmware Instead of hard coding the size of the shared buffer in the driver, query it from the firmware, as it may change in future devices. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:28 -07:00
Ido Schimmel	5ec2ee7dd2	mlxsw: Query maximum number of ports from firmware We currently hard code the maximum number of ports in the driver, but this may change in future devices, so query it from the firmware instead. Fallback to a maximum of 64 ports in case this number can't be queried. This should only happen in SwitchX-2 for which this number is correct. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:28 -07:00
Ido Schimmel	8494ab06e0	mlxsw: spectrum_router: Query number of LPM trees from firmware Instead of hard coding the number of LPM trees in the driver, query it from the firmware, as it may change in future devices. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:28 -07:00
Ido Schimmel	9a32562bec	mlxsw: Remove debugfs interface We don't use it during development and we can't extend it either, so remove it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-23 21:29:32 -07:00
David S. Miller	16ae1f2236	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/broadcom/genet/bcmmii.c drivers/net/hyperv/netvsc.c kernel/bpf/hashtab.c Almost entirely overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-23 16:41:27 -07:00
Gal Pressman	8ab7e2ae15	net/mlx5e: Count LRO packets correctly RX packets statistics ('rx_packets' counter) used to count LRO packets as one, even though it contains multiple segments. This patch will increment the counter by the number of segments, and align the driver with the behavior of other drivers in the stack. Note that no information is lost in this patch due to 'rx_lro_packets' counter existence. Before, ethtool showed: $ ethtool -S ens6 \| egrep "rx_packets\|rx_lro_packets" rx_packets: 435277 rx_lro_packets: 35847 rx_packets_phy: 1935066 Now, we will see the more logical statistics: $ ethtool -S ens6 \| egrep "rx_packets\|rx_lro_packets" rx_packets: 1935066 rx_lro_packets: 35847 rx_packets_phy: 1935066 Fixes: `e586b3b0ba` ("net/mlx5: Ethernet Datapath files") Signed-off-by: Gal Pressman <galp@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:13 -07:00
Gal Pressman	d3a4e4da54	net/mlx5e: Count GSO packets correctly TX packets statistics ('tx_packets' counter) used to count GSO packets as one, even though it contains multiple segments. This patch will increment the counter by the number of segments, and align the driver with the behavior of other drivers in the stack. Note that no information is lost in this patch due to 'tx_tso_packets' counter existence. Before, ethtool showed: $ ethtool -S ens6 \| egrep "tx_packets\|tx_tso_packets" tx_packets: 61340 tx_tso_packets: 60954 tx_packets_phy: 2451115 Now, we will see the more logical statistics: $ ethtool -S ens6 \| egrep "tx_packets\|tx_tso_packets" tx_packets: 2451115 tx_tso_packets: 60954 tx_packets_phy: 2451115 Fixes: `e586b3b0ba` ("net/mlx5: Ethernet Datapath files") Signed-off-by: Gal Pressman <galp@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:13 -07:00
Maor Gottlieb	5f40b4ed97	net/mlx5: Increase number of max QPs in default profile With ConnectX-4 sharing SRQs from the same space as QPs, we hit a limit preventing some applications to allocate needed QPs amount. Double the size to 256K. Fixes: `e126ba97db` ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:13 -07:00
Paul Blakey	1ad9a00ae0	net/mlx5e: Avoid supporting udp tunnel port ndo for VF reps This was added to allow the TC offloading code to identify offloading encap/decap vxlan rules. The VF reps are effectively related to the same mlx5 PCI device as the PF. Since the kernel invokes the (say) delete ndo for each netdev, the FW erred on multiple vxlan dst port deletes when the port was deleted from the system. We fix that by keeping the registration to be carried out only by the PF. Since the PF serves as the uplink device, the VF reps will look up a port there and realize if they are ok to offload that. Tested: <SETUP VFS> <SETUP switchdev mode to have representors> ip link add vxlan1 type vxlan id 44 dev ens5f0 dstport 9999 ip link set vxlan1 up ip link del dev vxlan1 Fixes: `4a25730eb2` ('net/mlx5e: Add ndo_udp_tunnel_add to VF representors') Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:13 -07:00
Or Gerlitz	09c91ddf2c	net/mlx5e: Use the proper UAPI values when offloading TC vlan actions Currently we use the non UAPI values and we miss erring on the modify action which is not supported, fix that. Fixes: `8b32580df1` ('net/mlx5e: Add TC vlan action for SRIOV offloads') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:13 -07:00
Roi Dayan	375f51e2b5	net/mlx5: E-Switch, Don't allow changing inline mode when flows are configured Changing the eswitch inline mode can potentially cause already configured flows not to match the policy. E.g. set policy L4, add some L4 rules, set policy to L2 --> bad! Hence we disallow it. Keep track of how many offloaded rules are now set and refuse inline mode changes if this isn't zero. Fixes: `bffaa91658` ("net/mlx5: E-Switch, Add control for inline mode") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:12 -07:00
Or Gerlitz	d85cdccbb3	net/mlx5e: Change the TC offload rule add/del code path to be per NIC or E-Switch Refactor the code to deal with add/del TC rules to have handler per NIC/E-switch offloading use case, and push the latter into the e-switch code. This provides better separation and is to be used in down-stream patch for applying a fix. Fixes: `bffaa91658` ("net/mlx5: E-Switch, Add control for inline mode") Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:12 -07:00
Or Gerlitz	1f30a86c58	net/mlx5: Add missing entries for set/query rate limit commands The switch cases for the rate limit set and query commands were missing, which could get us wrong under fw error or driver reset flow, fix that. Fixes: `1466cc5b23` ('net/mlx5: Rate limit tables support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:12 -07:00
Colin Ian King	c7cd4c9bf8	mlxsw: spectrum: fix swapped order of arguments packets and bytes The arguments packets and bytes to call mlxsw_sp_acl_rule_get_stats are in the wrong order. Fix this by swapping them. Detected by CoverityScan, CID#1419705 ("Arguments in wrong order") Fixes: `7c1b8eb175` ("mlxsw: spectrum: Add support for TC flower offload statistics") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 10:59:12 -07:00
Or Gerlitz	abbdf4bd7d	mlxsw: spectrum: Align the matchall default case returned value Align the default case for matchall offload with what's there for flower. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-21 17:16:10 -07:00
Arkadi Sharshevsky	bf95233e20	mlxsw: spectrum: Cosmetic naming change Currently the struct representing router interface "mlxsw_sp_rif" is reffered as "r" in various places in the driver. Furthermore it contains a member which specify the index which is called "rif". This patch change "r" to "rif" and "rif" to "rif_index". Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-21 17:16:10 -07:00
Jack Morgenstein	4cbe4dac82	net/mlx4_core: Avoid delays during VF driver device shutdown Some Hypervisors detach VFs from VMs by instantly causing an FLR event to be generated for a VF. In the mlx4 case, this will cause that VF's comm channel to be disabled before the VM has an opportunity to invoke the VF device's "shutdown" method. For such Hypervisors, there is a race condition between the VF's shutdown method and its internal-error detection/reset thread. The internal-error detection/reset thread (which runs every 5 seconds) also detects a disabled comm channel. If the internal-error detection/reset flow wins the race, we still get delays (while that flow tries repeatedly to detect comm-channel recovery). The cited commit fixed the command timeout problem when the internal-error detection/reset flow loses the race. This commit avoids the unneeded delays when the internal-error detection/reset flow wins. Fixes: `d585df1c5c` ("net/mlx4_core: Avoid command timeouts during VF driver device shutdown") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reported-by: Simon Xiao <sixiao@microsoft.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 20:14:51 -07:00
Ido Schimmel	c7f6e6658b	mlxsw: spectrum_router: Don't abort on l3mdev rules Now that port netdevs can be enslaved to a VRF master we need to make sure the device's routing tables won't be flushed upon the insertion of a l3mdev rule. Note that we assume the notified l3mdev rule is a simple rule as used by the VRF master. We don't check for the presence of other selectors such as 'iif' and 'oif'. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 10:18:35 -07:00
Ido Schimmel	3d70e458be	mlxsw: spectrum_router: Add support for VRFs on top of bridges In a similar fashion to the previous patch, allow bridges and VLAN devices on top of bridges to be enslaved to a VRF master device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 10:18:35 -07:00
Ido Schimmel	7179eb5acd	mlxsw: spectrum_router: Add support for VRFs Allow port netdevs, LAG and VLAN devices stacked on top of these to be enslaved to a VRF master device. Upon enslavement, create a router interface (RIF) for the enslaved netdev and associate it with a virtual router (VR) based on the VRF's table ID. If a RIF already exists for the netdev (f.e., due to the existence of an IP address), then it's deleted and a new one is created with the appropriate VR binding. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 10:18:35 -07:00
Ido Schimmel	9db032bb1e	mlxsw: spectrum_router: Don't destroy RIF if L3 slave We usually destroy the netdev's router interface (RIF) when the last IP address is removed from it. However, we shouldn't do that if it's enslaved to an L3 master device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 10:18:34 -07:00
Ido Schimmel	57837885e3	mlxsw: spectrum_router: Associate RIFs with correct VR When a router interface (RIF) is created due to a netdev being enslaved to a VRF master, then it should be associated with the appropriate virtual router (VR) and not the default one. If netdev is a VRF slave, lookup the VR based on the VRF's table ID. Otherwise default to the MAIN table. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 10:18:34 -07:00
Ido Schimmel	5d7bfd1419	ipv4: fib_rules: Dump FIB rules when registering FIB notifier In commit `c3852ef7f2` ("ipv4: fib: Replay events when registering FIB notifier") we dumped the FIB tables and replayed the events to the passed notification block. However, we merely sent a RULE_ADD notification in case custom rules were in use. As explained in previous patches, this approach won't work anymore. Instead, we should notify the caller about all the FIB rules and let it act accordingly. Upon registration to the FIB notification chain, replay a RULE_ADD notification for each programmed FIB rule, custom or not. The integrity of the dump is ensured by the mechanism introduced in the above mentioned commit. Prevent regressions by making sure current listeners correctly sanitize the notified rules. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 10:18:34 -07:00
Amritha Nambiar	56f36acd21	mqprio: Modify mqprio to pass user parameters via ndo_setup_tc. The configurable priority to traffic class mapping and the user specified queue ranges are used to configure the traffic class, overriding the hardware defaults when the 'hw' option is set to 0. However, when the 'hw' option is non-zero, the hardware QOS defaults are used. This patch makes it so that we can pass the data the user provided to ndo_setup_tc. This allows us to pull in the queue configuration if the user requested it as well as any additional hardware offload type requested by using a value other than 1 for the hw value. Finally it also provides a means for the device driver to return the level supported for the offload type via the qopt->hw value. Previously we were just always assuming the value to be 1, in the future values beyond just 1 may be supported. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-15 15:20:27 -07:00
David S. Miller	101c431492	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/broadcom/genet/bcmgenet.c net/core/sock.c Conflicts were overlapping changes in bcmgenet and the lockdep handling of sockets. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-15 11:59:10 -07:00
Jiri Pirko	e9093b1183	mlxsw: reg: Fix SPVMLR max record count The num_rec field is 8 bit, so the maximal count number is 255. This fixes vlans learning not being enabled for wider ranges than 255. Fixes: `a4feea74cd` ("mlxsw: reg: Add Switch Port VLAN MAC Learning register definition") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-14 11:35:11 -07:00
Jiri Pirko	f004ec065b	mlxsw: reg: Fix SPVM max record count The num_rec field is 8 bit, so the maximal count number is 255. This fixes vlans not being enabled for wider ranges than 255. Fixes: `b2e345f9a4` ("mlxsw: reg: Add Switch Port VID and Switch Port VLAN Membership registers definitions") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-14 11:35:11 -07:00
Arkadi Sharshevsky	7c1b8eb175	mlxsw: spectrum: Add support for TC flower offload statistics Add support for TC flower offload statistics including number of packets, bytes and last use timestamp. Currently the statistics are gathered on a per-rule basis. Signed-off-by: Arkadi Sharshvesky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-12 23:50:15 -07:00
Arkadi Sharshevsky	4817072950	mlxsw: spectrum: Add support for counters on TCAM entries Add support for packets and byte statistics on TCAM entries. The counters are allocated from the generic flow counters pool. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-12 23:50:14 -07:00
Arkadi Sharshevsky	938ab60860	mlxsw: spectrum: Add support for Policing and Counting action block Add support for Policing and Counting action block. This action block will be used to bind counter to TCAM entries. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-12 23:50:14 -07:00
Arkadi Sharshevsky	446a154187	mlxsw: spectrum: Add periodic ACL rule activity update Introduce periodic task for dumping the activity status for the ACL rule TCAM entries. This is done in order to emulate last use statistics. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.comi> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-12 23:50:14 -07:00
Arkadi Sharshevsky	096e914f69	mlxsw: spectrum: Add support for direct rule access Currently the ACL rules can be accessed only by hashing. In order to dump the activity the rules are also placed in a list. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-12 23:50:14 -07:00
Arkadi Sharshevsky	7fd056c2ef	mlxsw: spectrum_acl_tcam: Add support for retrieving TCAM entry activity Add support for retrieving TCAM entry activity. In order to support ACL rule activity corresponding TCAM entry should be queried. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-12 23:50:14 -07:00
Arkadi Sharshevsky	1abcbcc292	mlxsw: spectrum: Add support for generic flow counter allocation Add support for allocating generic flow counter. Generic flow counter can count packets or packets and bytes and can be assigned to different hardware processes. First use will be for counting packets and bytes of ACL rules, and will be introduced in the following patches. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-12 23:50:13 -07:00
Arkadi Sharshevsky	5766532abc	mlxsw: reg: Add Monitoring General Purpose Counter Set register The MGPC register retrieves generic flow counter value. It will be used to query ACL counters. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-12 23:50:13 -07:00
Arkadi Sharshevsky	ff7b0d2720	mlxsw: spectrum: Add support for counter allocator Add implementation for counter allocator. The ASIC has special memory pool for various counting purposes. Counter memory is distributed between equal size banks. The static sub-pool configuration should specify the following parameters for each sub-pool: - Number of required banks. - Maximum entry size. Each module can add dedicated sub-pool or use existing one. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-12 23:50:13 -07:00
Eugenia Emantayev	ea29bd304d	net/mlx5e: Fix loopback selftest Change packet type handler to ETH_P_IP instead of ETH_P_ALL since we are already expecting an IP packet. Also, using ETH_P_ALL will cause the loopback test packet type handler to be called on all outgoing packets, especially our own self loopback test SKB, which will be validated on xmit as well, and we don't want that. Tested with: ethtool -t ethX validated that the loopback test passes. Fixes: `0952da791c` ('net/mlx5e: Add support for loopback selftest') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 10:03:26 -08:00
Or Gerlitz	65ba8fb7d5	net/mlx5e: Avoid wrong identification of rules on deletion When deleting offloaded TC flows, we must correctly identify E-switch rules. The current check could get us wrong w.r.t to rules set on the PF. Since it's possible to set NIC rules on the PF, switch to SRIOV offloads mode and then attempt to delete a NIC rule. To solve that, we add a flags field to offloaded rules, set it on creation time and use that over the code where currently needed. Fixes: `8b32580df1` ('net/mlx5e: Add TC vlan action for SRIOV offloads') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 10:03:26 -08:00
Huy Nguyen	33e21c5952	net/mlx5e: remove IEEE/CEE mode check when setting DCBX mode Currently, the function setdcbx fails if the request dcbx mode is either IEEE or CEE. We remove the IEEE/CEE mode check because we support both IEEE and CEE interfaces. Fixes: `3a6a931dfb` ("net/mlx5e: Support DCBX CEE API") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 10:03:26 -08:00
Daniel Jurgens	5d47f6c89d	net/mlx5: Don't save PCI state when PCI error is detected When a PCI error is detected the PCI state could be corrupt, don't save it in that flow. Save the state after initialization. After restoring the PCI state during slot reset save it again, restoring the state destroys the previously saved state info. Fixes: `05ac2c0b74` ('net/mlx5: Fix race between PCI error handlers and health work') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 10:03:26 -08:00
Paul Blakey	af36370569	net/mlx5: Fix create autogroup prev initializer The autogroups list is a list of non overlapping group boundaries sorted by their start index. If the autogroups list wasn't empty and an empty group slot was found at the start of the list, the new group was added to the end of the list instead of the beginning, as the prev initializer was incorrect. When this was repeated, it caused multiple groups to have overlapping boundaries. Fixed that by correctly initializing the prev pointer to the start of the list. Fixes: `eccec8da3b` ('net/mlx5: Keep autogroups list ordered') Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 10:03:26 -08:00
Ido Schimmel	b5d90e6d6b	mlxsw: spectrum_router: Make abort mechanism VR-aware When the abort mechanism is invoked it binds the first virtual router (VR) to an LPM tree and inserts a default route to direct packets to the CPU. With VRFs, we can have router interfaces (RIFs) bound to multiple VRs, so we need to make sure packets are trapped from all VRs and not just the first one. Upon abort invocation, bind all active VRs to the same LPM tree and insert a default route in each. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	6913229eea	mlxsw: spectrum_router: Explicitly Associate RIFs with VRs Up until now we implicitly associated all the router interfaces (RIFs) with the first virtual router (VR). This must be changed in order to enable VRF offload. Otherwise, a packet received via a VRF slave would do a FIB lookup in the same table used by other VRFs. Instead, bind the RIF to a VR according to the table where FIB lookup should be performed for packets received via the RIF. Currently, we only care about the MAIN and LOCAL tables (which we squash together). Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	76610ebbde	mlxsw: spectrum_router: Refactor virtual router handling A virtual router (VR) is an entity within the device to which routing tables and interfaces can be bound to. It can be used to implement VRFs. In the initial implementation we associated the VR with a specific protocol (e.g., IPv4) and an LPM tree. However, this isn't really accurate, as the same VR can be used for both IPv4 and IPv6 traffic, by binding a different LPM tree to a {VR, Proto} pair. This patch aims to restructure the VR code according to the above logic, so that VRs are more accurately represented by the driver's data structures. The main motivation behind this change is to prepare the driver for VRF offload. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	382dbb4014	mlxsw: spectrum_router: Simplify LPM tree allocation When looking for a new LPM tree we should always consider all the unused trees. It doesn't matter if the new tree is required due to changes in currently used prefixes inside an existing routing table or because a route was inserted into an empty table. Both cases are functionally identical and therefore should be treated the same. When looking for a new LPM tree, consider all unused trees and don't reserve trees for specific cases. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	4724ba561a	mlxsw: spectrum_router: Place RIF related code with router code The inetaddr notification block is currently implemented in the main driver file, but this isn't really appropriate, as it mainly creates and destroys router interfaces (RIFs) which belong with the rest of the router code. This will become even more apparent later on when we'll need to bind these RIFs to virtual routers according to the VRF's table. Structure the driver better and prevent unnecessary function exports by moving the RIF related code with the rest of the router code. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	97989ee0f5	mlxsw: spectrum_router: Allow more route types to be programmed Allow 'unreachable', 'blackhole' and 'prohibit' route types to be programmed into the device by sending any packet hitting them to the CPU. This is needed so that users will be able to program a default route into the VRF's table, thereby preventing lookup from leaking to other tables. Audit the code paths to make sure we don't rely on the presence of a nexthop netdev, as it doesn't exist for above mentioned route types. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	f4a761d203	mlxsw: spectrum: Destroy RIFs based on last removed address We only use the RIF reference count to determine when the last IP address was removed, but instead we can just test 'in_dev->ifa_list'. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	186962ebb7	mlxsw: spectrum: Associate PVID vPort with appropriate netdev When a VLAN device is configured on top of a LAG device (f.e., bond0.10), a vPort is created on top of each of the LAG's slaves and its 'dev' pointer is set to the VLAN device. This is in contrast to the implicit PVID vPort (representing 'bond0'), whose 'dev' pointer keeps pointing to the port netdev itself (f.e., 'sw1p1'). Make both cases consistent by setting their 'dev' pointer to the actual netdev they represent. Either the LAG device itself (in the case of the PVID vPort) or the VLAN device on top of it. This will later allow us to more easily understand for which netdev we should create the router interface (RIF) upon enslavement to a VRF master. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	1f88061ee4	mlxsw: spectrum: Don't assume upper device's type When an upper device is configured on top of a vPort we make sure it's a bridge master during PRECHANGEUPPER and fail otherwise. Therefore, when CHANGEUPPER is later received we don't bother checking the upper's type. Make the code more extendable in preparation for VRF uppers, by checking the upper's type. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	b414970e21	mlxsw: spectrum: Sanitize bridge's upper devices We're going to allow bridges stacked on top of port netdevs to be enslaved to a VRF, but for now, only VLAN uppers of the VLAN-aware bridge are supported. Sanitize any other bridge upper. This is consistent with the way we sanitize port netdevs' uppers. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Petr Machata	9caab08a76	mlxsw: spectrum: Add support for flower matches on VLAN ID, PCP Introduce MLXSW_AFK_ELEMENT_VID, PCP and declare them in afk_element infos that contain them. Use the elements when VLAD ID or priority are used in the flow. Also add MLXSW_AFK_ELEMENT_VID, PCP to mlxsw_sp_acl_tcam_pattern_ipv4. Both items are included in mlxsw_sp_afk_element_info_l2_dmac, resp. _smac, and both MLXSW_AFK_ELEMENT_SMAC and _DMAC are already in the pattern. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 18:35:35 -08:00
Petr Machata	a150201a70	mlxsw: spectrum: Add support for vlan modify TC action Add VLAN action offloading. Invoke it from Spectrum flower handler for "vlan modify" actions. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 18:35:35 -08:00
Eric Dumazet	68b8df4644	mlx4: remove duplicate code in mlx4_en_process_rx_cq() We should keep one way to build skbs, regardless of GRO being on or off. Note that I made sure to defer as much as possible the point we need to pull data from the frame, so that future prefetch() we might add are more effective. These skb attributes derive from the CQE or ring : ip_summed, csum hash vlan offload hwtstamps queue_mapping As a bonus, this patch removes mlx4 dependency on eth_get_headlen() which is very often broken enough to give us headaches. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Eric Dumazet	6969cf0fdb	mlx4: make validate_loopback() more generic Testing a boolean in fast path is not worth duplicating the code allocating packets, when GRO is on or off. If this proves to be a problem, we might later use a jump label. Next patch will remove this duplicated code and ease code review. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Eric Dumazet	02e6fd3e55	mlx4: factorize page_address() calls We need to compute the frame virtual address at different points. Do it once. Following patch will use the new va address for validate_loopback() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Eric Dumazet	9e8c0395a7	mlx4: do not access rx_desc from mlx4_en_process_rx_cq() Instead of fetching dma address from rx_desc->data[0].addr, prefer using frags[0].dma + frags[0].page_offset to avoid a potential cache line miss. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Eric Dumazet	7d7bfc6a3f	mlx4: add rx_alloc_pages counter in ethtool -S This new counter tracks number of pages that we allocated for one port. lpaa24:~# ethtool -S eth0 \| egrep 'rx_alloc_pages\|rx_packets' rx_packets: 306755183 rx_alloc_pages: 932897 Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Eric Dumazet	34db548bfb	mlx4: add page recycling in receive path Same technique than some Intel drivers, for arches where PAGE_SIZE = 4096 In most cases, pages are reused because they were consumed before we could loop around the RX ring. This brings back performance, and is even better, a single TCP flow reaches 30Gbit on my hosts. v2: added full memset() in mlx4_en_free_frag(), as Tariq found it was needed if we switch to large MTU, as priv->log_rx_info can dynamically be changed. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Eric Dumazet	b5a54d9a31	mlx4: use order-0 pages for RX Use of order-3 pages is problematic in some cases. This patch might add three kinds of regression : 1) a CPU performance regression, but we will add later page recycling and performance should be back. 2) TCP receiver could grow its receive window slightly slower, because skb->len/skb->truesize ratio will decrease. This is mostly ok, we prefer being conservative to not risk OOM, and eventually tune TCP better in the future. This is consistent with other drivers using 2048 per ethernet frame. 3) Because we allocate one page per RX slot, we consume more memory for the ring buffers. XDP already had this constraint anyway. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Eric Dumazet	60c7f5ae54	mlx4: removal of frag_sizes[] We will soon use order-0 pages, and frag truesize will more precisely match real sizes. In the new model, we prefer to use <= 2048 bytes fragments, so that we can use page-recycle technique on PAGE_SIZE=4096 arches. We will still pack as much frames as possible on arches with big pages, like PowerPC. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Eric Dumazet	acd7628de0	mlx4: reduce rx ring page_cache size We only need to store the page and dma address. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Eric Dumazet	d85f6c14e9	mlx4: rx_headroom is a per port attribute No need to duplicate it per RX queue / frags. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Eric Dumazet	aaca121dd6	mlx4: get rid of frag_prefix_size Using per frag storage for frag_prefix_size is really silly. mlx4_en_complete_rx_desc() has all needed info already. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Eric Dumazet	159ddfd2ca	mlx4: remove order field from mlx4_en_frag_info This is really a port attribute, no need to duplicate it per RX queue and per frag. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Eric Dumazet	69ba943151	mlx4: dma_dir is a mlx4_en_priv attribute No need to duplicate it for all queues and frags. num_frags & log_rx_info become u8 to save space. u8 accesses are a bit faster than u16 anyway. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 09:54:46 -08:00
Ido Schimmel	61793af6ab	mlxsw: pci: Remove unused bit The overrun ignore bit isn't supported by the device's firmware and was recently removed from the programmer's reference manual (PRM). Remove it from the driver as well. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-08 23:17:38 -08:00
Jiri Pirko	1182e53639	mlxsw: spectrum: Fix helper function and port variable names Commit `dd82364c3a` ("mlxsw: Flip to the new dev walk API") did some small changes in mlxsw code, but it did not respect the naming conventions. So fix this now. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-08 23:17:38 -08:00
Jiri Pirko	713c43b315	mlxsw: spectrum_flower: Remove bogus warns in mlxsw_sp_flower_destroy This warnings may be hit even in case they should not - in case user puts a TC-flower rule which failed to be offloaded. So just remove them. Reported-by: Petr Machata <petrm@mellanox.com> Reported-by: Ido Schimmel <idosch@mellanox.com> Fixes: commit `7aa0f5aa90` ("mlxsw: spectrum: Implement TC flower offload") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-08 23:15:58 -08:00
Arnd Bergmann	0253f2681f	net/mlx5e: add IPV6 dependency The ethernet support now calls directly into the ipv6 core code, which fails if IPV6 is a loadable module but mlx5 is built-in: drivers/net/ethernet/mellanox/mlx5/core/en_tc.o: In function `mlx5e_create_encap_header_ipv6': en_tc.c:(.text.mlx5e_create_encap_header_ipv6+0x110): undefined reference to `ip6_route_output_flags' This adds a dependency to ensure that MLX5_CORE_EN can only be built if we are able link the kernel successfully. The downside is that the ethernet option can be hidden. Alternatively we could make MLX5_CORE depend on "IPV6 \|\| !IPV6", which would force MLX5_CORE to be a module when IPV6 is, including in configurations where we don't use the ethernet support at all. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-07 12:25:24 -08:00
Ido Schimmel	f7df4923fa	mlxsw: spectrum_router: Avoid potential packets loss When the structure of the LPM tree changes (f.e., due to the addition of a new prefix), we unbind the old tree and then bind the new one. This may result in temporary packet loss. Instead, overwrite the old binding with the new one. Fixes: `6b75c4807d` ("mlxsw: spectrum_router: Add virtual router management") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-01 09:50:58 -08:00
Eric Dumazet	47d3a07528	net/mlx4_en: fix overflow in mlx4_en_init_timestamp() The cited commit makes a great job of finding optimal shift/multiplier values assuming a 10 seconds wrap around, but forgot to change the overflow_period computation. It overflows in cyclecounter_cyc2ns(), and the final result is 804 ms, which is silly. Lets simply use 5 seconds, no need to recompute this, given how it is supposed to work. Later, we will use a timer instead of a work queue, since the new RX allocation schem will no longer need mlx4_en_recover_from_oom() and the service_task firing every 250 ms. Fixes: `31c128b66e` ("net/mlx4_en: Choose time-stamping shift value according to HW frequency") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-26 15:39:43 -05:00
Linus Torvalds	190c3ee06a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Some 'const'ing in qlogic networking drivers, from Bhumika Goyal. 2) Fix scheduling while atomic in l2tp network namespace exit by deferring the work to the workqueue. From Ridge Kennedy. 3) Fix use after free in dccp timewait handling, from Andrey Ryabinin. 4) mlx5e CQE compression engine not initialized properly, from Tariq Toukan. 5) Some UAPI header fixes from Dmitry V. Levin. 6) Don't overwrite module parameter value in mlx4 driver, from Majd Dibbiny. 7) Fix divide by zero in xt_hashlimit netfilter module, from Alban Browaeys. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits) bpf: Fix bpf_xdp_event_output net/mlx4_en: Use __skb_fill_page_desc() net/mlx4_core: Use cq quota in SRIOV when creating completion EQs net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs net/mlx4: Spoofcheck and zero MAC can't coexist net/mlx4: Change ENOTSUPP to EOPNOTSUPP uapi: fix linux/rds.h userspace compilation errors uapi: fix linux/seg6.h and linux/seg6_iptunnel.h userspace compilation errors lib: Remove string from parman config selection forcedeth: Remove return from a void function bpf: fix spelling mistake: "proccessed" -> "processed" uapi: fix linux/llc.h userspace compilation error uapi: fix linux/ip6_tunnel.h userspace compilation errors net/mlx5e: Fix wrong CQE decompression net/mlx5e: Update MPWQE stride size when modifying CQE compress state net/mlx5e: Fix broken CQE compression initialization net/mlx5e: Do not reduce LRO WQE size when not using build_skb net/mlx5e: Register/unregister vport representors on interface attach/detach net/mlx5e: s390 system compilation fix tcp: account for ts offset only if tsecr not zero ...	2017-02-23 11:36:06 -08:00
Linus Torvalds	af17fe7a63	Mellanox specific updates for 4.11 merge window Because the Mellanox code required being based on a net-next tree, I keept it separate from the remainder of the RDMA stack submission that is based on 4.10-rc3. This branch contains: - Various mlx4 and mlx5 fixes and minor changes - Support for adding a tag match rule to flow specs - Support for cvlan offload operation for raw ethernet QPs - A change to the core IB code to recognize raw eth capabilities and enumerate them (touches non-Mellanox code) - Implicit On-Demand Paging memory registration support -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYrx+WAAoJELgmozMOVy/du70P/1kpW2xY9Le04c3K7na2XOYl AUVIDrW/8Go63tpOaM7jBT3k4GlwVFr3IOmBpS24KbW/THxjhyUeP5L5+z2x+go+ jkQOgtPWWEHr5zP3MzsNyB8fDx1YQOnJwEXxybQRW/cbw4CLjnhP+ezd6FdV/3Yy pPEqDVlAErzvNweG+n2r1pjcUbR8uneC3inyMLnyzUBz4CHKmC8fgD3/qJIM+DNb gtFT5xHFIXKCigWdQ/EwsTDcHub43V8OXlI5sO7loG6vToOUATMkjI4oOUNhDmYS X7XLN3yRK9QHEfb5kutXIZEWzTGh7LiFtUYGaNNYqqzDfSiMRc9NC5kTOfplEXDV Uo+AGb6Fh1zYIOzNk7o+tazIv3LaLv6+Fcm+9bbe0VUIqasaylsePqaTwMuIzx/I xP5nitmd5lbYo8WdlasVdG6mH1DlJEUbU30v4DpmTpxCP6jGpog7lexyGyF3TgzS NhnG0IiIClWh3WQ2/GdsFK/obIdFkpLeASli1hwD81vzPfly9zc2YpgqydZI3WCr q6hTXYnANcP6+eciCpQPO7giRdXdiKey08Uoq/2jxb7Qbm4daG6UwopjvH9/lm1F m6UDaDvzNYm+Rx+bL/+KSx9JO9+fJB1L51yCmvLGpWi6yJI4ZTfanHNMBsCua46N Kev/DSpIAzX1WOBkte+a =rspQ -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull Mellanox rdma updates from Doug Ledford: "Mellanox specific updates for 4.11 merge window Because the Mellanox code required being based on a net-next tree, I keept it separate from the remainder of the RDMA stack submission that is based on 4.10-rc3. This branch contains: - Various mlx4 and mlx5 fixes and minor changes - Support for adding a tag match rule to flow specs - Support for cvlan offload operation for raw ethernet QPs - A change to the core IB code to recognize raw eth capabilities and enumerate them (touches non-Mellanox code) - Implicit On-Demand Paging memory registration support" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (40 commits) IB/mlx5: Fix configuration of port capabilities IB/mlx4: Take source GID by index from HW GID table IB/mlx5: Fix blue flame buffer size calculation IB/mlx4: Remove unused variable from function declaration IB: Query ports via the core instead of direct into the driver IB: Add protocol for USNIC IB/mlx4: Support raw packet protocol IB/mlx5: Support raw packet protocol IB/core: Add raw packet protocol IB/mlx5: Add implicit MR support IB/mlx5: Expose MR cache for mlx5_ib IB/mlx5: Add null_mkey access IB/umem: Indicate that process is being terminated IB/umem: Update on demand page (ODP) support IB/core: Add implicit MR flag IB/mlx5: Support creation of a WQ with scatter FCS offload IB/mlx5: Enable QP creation with cvlan offload IB/mlx5: Enable WQ creation and modification with cvlan offload IB/mlx5: Expose vlan offloads capabilities IB/uverbs: Enable QP creation with cvlan offload ...	2017-02-23 11:27:49 -08:00
Eric Dumazet	7f0137e2ef	net/mlx4_en: Use __skb_fill_page_desc() Or we might miss the fact that a page was allocated from memory reserves. Fixes: `dceeab0e52` ("mlx4: support __GFP_MEMALLOC for rx") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:57:57 -05:00
Jack Morgenstein	6ed63d845e	net/mlx4_core: Use cq quota in SRIOV when creating completion EQs When creating EQs to handle CQ completion events for the PF or for VFs, we create enough EQE entries to handle completions for the max number of CQs that can use that EQ. When SRIOV is activated, the max number of CQs a VF (or the PF) can obtain is its CQ quota (determined by the Hypervisor resource tracker). Therefore, when creating an EQ, the number of EQE entries that the VF should request for that EQ is the CQ quota value (and not the total number of CQs available in the FW). Under SRIOV, the PF, also must use its CQ quota, because the resource tracker also controls how many CQs the PF can obtain. Using the FW total CQs instead of the CQ quota when creating EQs resulted wasting MTT entries, due to allocating more EQEs than were needed. Fixes: `5a0d0a6161` ("mlx4: Structures and init/teardown for VF resource quotas") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reported-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:57:57 -05:00
Majd Dibbiny	95f1ba9a24	net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs In the VF driver, module parameter mlx4_log_num_mgm_entry_size was mistakenly overwritten -- and in a manner which overrode the device-managed flow steering option encoded in the parameter. log_num_mgm_entry_size is a global module parameter which affects all ConnectX-3 PFs installed on that host. If a VF changes log_num_mgm_entry_size, this will affect all PFs which are probed subsequent to the change (by disabling DMFS for those PFs). Fixes: `3c439b5586` ("mlx4_core: Allow choosing flow steering mode") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:57:57 -05:00
Eugenia Emantayev	745d8ae462	net/mlx4: Spoofcheck and zero MAC can't coexist Spoofcheck can't be enabled if VF MAC is zero. Vice versa, can't zero MAC if spoofcheck is on. Fixes: `8f7ba3ca12` ('net/mlx4: Add set VF mac address support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:57:56 -05:00
Or Gerlitz	423b3aecf2	net/mlx4: Change ENOTSUPP to EOPNOTSUPP As ENOTSUPP is specific to NFS, change the return error value to EOPNOTSUPP in various places in the mlx4 driver. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Suggested-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:57:56 -05:00
Tariq Toukan	36154be40a	net/mlx5e: Fix wrong CQE decompression In cqe compression with striding RQ, the decompression of the CQE field wqe_counter was done with a wrong wraparound value. This caused handling cqes with a wrong pointer to wqe (rx descriptor) and creating SKBs with wrong data, pointing to wrong (and already consumed) strides/pages. The meaning of the CQE field wqe_counter in striding RQ holds the stride index instead of the WQE index. Hence, when decompressing a CQE, wqe_counter should have wrapped-around the number of strides in a single multi-packet WQE. We dropped this wrap-around mask at all in CQE decompression of striding RQ. It is not needed as in such cases the CQE compression session would break because of different value of wqe_id field, starting a new compression session. Tested: ethtool -K ethxx lro off/on ethtool --set-priv-flags ethxx rx_cqe_compress on super_netperf 16 {ipv4,ipv6} -t TCP_STREAM -m 50 -D verified no csum errors and no page refcount issues. Fixes: `7219ab34f1` ("net/mlx5e: CQE compression") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reported-by: Tom Herbert <tom@herbertland.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:43:10 -05:00
Saeed Mahameed	6dc4b54e77	net/mlx5e: Update MPWQE stride size when modifying CQE compress state When the admin enables/disables cqe compression, updating mpwqe stride size is required: CQE compress ON ==> stride size = 256B CQE compress OFF ==> stride size = 64B This is already done on driver load via mlx5e_set_rq_type_params, all we need is just to call it on arbitrary admin changes of cqe compression state via priv flags or when changing timestamping state (as it is mutually exclusive with cqe compression). This bug introduces no functional damage, it only makes cqe compression occur less often, since in ConnectX4-LX CQE compression is performed only on packets smaller than stride size. Tested: ethtool --set-priv-flags ethxx rx_cqe_compress on pktgen with 64 < pkt size < 256 and netperf TCP_STREAM (IPv4/IPv6) verify `ethtool -S ethxx \| grep compress` are advancing more often (rapidly) Fixes: `7219ab34f1` ("net/mlx5e: CQE compression") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:43:10 -05:00
Tariq Toukan	b0d4660b4c	net/mlx5e: Fix broken CQE compression initialization Some of RQ type parameters are derived from CQE compression state flag, CQE compression flag was initialized only after RQ type parameters setup. This leads to load RQ with stride size smaller than what we want for when CQE compression is on. This bug introduces no functional damage, it only makes CQE compression occur less often, since in ConnectX4-LX CQE compression is performed only on packets smaller than stride size. Fix this by marking default status of CQE compression in PFLAG prior to calling mlx5e_set_rq_priv_params(), as it inits some fields based on it. Tested: load driver on systems where rx CQE compress will be on (MH) pktgen with 64 < pkt size < 256 and netperf TCP_STREAM (IPv4/IPv6) verify `ethtool -S ethxx \| grep compress` are advancing more often (rapidly) Fixes: `2fc4bfb725` ("net/mlx5e: Dynamic RQ type infrastructure") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:43:09 -05:00
Tariq Toukan	4078e637c1	net/mlx5e: Do not reduce LRO WQE size when not using build_skb When rq_type is Striding RQ, no room of SKB_RESERVE is needed as SKB allocation is not done via build_skb. Fixes: `e4b8550807` ("net/mlx5e: Slightly reduce hardware LRO size") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:43:09 -05:00
Saeed Mahameed	6f08a22c5f	net/mlx5e: Register/unregister vport representors on interface attach/detach Currently vport representors are added only on driver load and removed on driver unload. Apparently we forgot to handle them when we added the seamless reset flow feature. This caused to leave the representors netdevs alive and active with open HW resources on pci shutdown and on error reset flows. To overcome this we move their handling to interface attach/detach, so they would be cleaned up on shutdown and recreated on reset flows. Fixes: `26e59d8077` ("net/mlx5e: Implement mlx5e interface attach/detach callbacks") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:43:09 -05:00
Mohamad Haj Yahia	18bcf742fb	net/mlx5e: s390 system compilation fix Add necessary headers include for s390 arch compilation. Fixes: `e586b3b0ba` ("net/mlx5: Ethernet Datapath files") Fixes: `d605d6686d` ("net/mlx5e: Add support for ethtool self..") Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:43:09 -05:00
Eric Dumazet	3608b13ccc	mlx4: reduce OOM risk on arches with large pages Since mlx4 NIC are used on PowerPC with 64K pages, we need to adapt MLX4_EN_ALLOC_PREFER_ORDER definition. Otherwise, a fragment sitting in an out of order TCP queue can hold 0.5 Mbytes and it is a serious OOM risk. Fixes: `51151a16a6` ("mlx4: allow order-0 memory allocations in RX path") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-20 10:31:50 -05:00
Eric Dumazet	f5a5772337	mlx4: fix potential divide by 0 in mlx4_en_auto_moderation() 1) In the case where rate == priv->pkt_rate_low == priv->pkt_rate_high, mlx4_en_auto_moderation() does a divide by zero. 2) We want to properly change the moderation parameters if rx_frames was changed (like in ethtool -C eth0 rx-frames 16) Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-19 18:15:23 -05:00
Eric Dumazet	01f0f42534	mlx4: do not fire tasklet unless necessary All rx and rx netdev interrupts are handled by respectively by mlx4_en_rx_irq() and mlx4_en_tx_irq() which simply schedule a NAPI. But mlx4_eq_int() also fires a tasklet to service all items that were queued via mlx4_add_cq_to_tasklet(), but this handler was not called unless user cqe was handled. This is very confusing, as "mpstat -I SCPU ..." show huge number of tasklet invocations. This patch saves this overhead, by carefully firing the tasklet directly from mlx4_add_cq_to_tasklet(), removing four atomic operations per IRQ. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 10:55:22 -05:00
David S. Miller	3f64116a83	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-02-16 19:34:01 -05:00
Jiri Pirko	0c921a894c	mlxsw: acl: Use PBS type for forward action Current behaviour of "mirred redirect" action (forward) offload is a bit odd. For matched packets the action forwards them to the desired destination, but it also lets the packet duplicates to go the original way down (bridge, router, etc). That is more like "mirred mirror". Fix this by using PBS type which behaves exactly like "mirred redirect". Note that PBS does not support loopback mode. Fixes: `4cda7d8d70` ("mlxsw: core: Introduce flexible actions support") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-15 13:23:20 -05:00
Eric Dumazet	99f5711e7c	mlx4: do not use rwlock in fast path Using a reader-writer lock in fast path is silly, when we can instead use RCU or a seqlock. For mlx4 hwstamp clock, a seqlock is the way to go, removing two atomic operations and false sharing. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-15 12:06:18 -05:00
Nogah Frankel	1ca6270b8d	mlxsw: spectrum: Change ipv6 unregistered mc table Point back the unregister IPv6 mc table to the bc table. It is done since IPv6 mcast snooping is not supported for Spectrum yet. Reported-by: Jiri Pirko <jiri@mellanox.com> Fixes: `71c365bdc4` ("mlxsw: spectrum: Separate bc and mc floods") Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Tested-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-14 13:15:46 -05:00
Or Gerlitz	fed06ee89b	net/mlx5e: Disable preemption when doing TC statistics upcall When called by HW offloading drivers, the TC action (e.g net/sched/act_mirred.c) code uses this_cpu logic, e.g _bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets) per the kernel documention, preemption should be disabled, add that. Before the fix, when running with CONFIG_PREEMPT set, we get a BUG: using smp_processor_id() in preemptible [00000000] code: tc/3793 asserion from the TC action (mirred) stats_update callback. Fixes: `aad7e08d39` ('net/mlx5e: Hardware offloaded flower filter statistics support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-14 11:56:01 -05:00
Maor Gottlieb	e04a018377	net/mlx5: Consolidate flow rules regardless their flow tag Flow rules with same match criteria and value should be mapped to the same flow table entry regardless the flow tag identifier. Flow tag is part of flow table entry context and not of the destination, therefore we should return error when we try to add destination to flow table entry with different flow tag. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 10:21:01 -05:00
Jiri Pirko	dc371700d4	spectrum: flower: Treat ETH_P_ALL as a special case and translate for HW HW does not understand ETH_P_ALL. So treat this special case differently and translate to 0/0 key/mask. That will allow HW to match all ethertypes. Fixes: `7aa0f5aa90` ("mlxsw: spectrum: Implement TC flower offload") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 13:42:50 -05:00
Nogah Frankel	90e0f0c1b4	mlxsw: spectrum: Update mc_disabled flag by switchdev attr Add a function to update mc_disabled from switchdev attr SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:41 -05:00
Nogah Frankel	1e5d94327d	mlxsw: spectrum: Extend port_orig_get for bridge devices The function mlxsw_sp_port_orig_get returns the vport from the physical port if needed, based on the original device. This patch addresses the case where the original device is a bridge. If it is vlan unaware bridge, it returns the matching vport. If it is vlan aware bridge, there is no matching vport, and it returns the original port. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:41 -05:00
Nogah Frankel	8ecd4591e7	mlxsw: spectrum: Add an option to flood mc by mc_router_port The decision whether to flood a multicast packet to a port dependent on three flags: mc_disabled, mc_router_port, mc_flood. If mc_disabled is on, the port will be flooded according to mc_flood, otherwise, according to mc_router_port. To accomplish that, add those flags into the mlxsw_sp_port struct and update the mc flood table accordingly. Update mc_router_port by switchdev attribute SWITCHDEV_ATTR_ID_PORT_MC_ROUTER_PORT. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:40 -05:00
Nogah Frankel	71c365bdc4	mlxsw: spectrum: Separate bc and mc floods Break the bm (broadcast-multicast) into two tables, one for broadcast (and link local multicast that behaves like bc) and one for unknown multicasts. Add a bool into mlxsw_sp_port named mc_flood that reflect the value this port should have in the mc flood table (currently, always 1); Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:40 -05:00
Nogah Frankel	63fe813c60	mlxsw: spectrum: Change max vfid A user that wants many bridges will use 1.Q bridge which are scalable. One can have as many 1.Q bridges as vfids. This patch sets their number to 1k, which is a reasonably large number. This change is done here because the next patches will add a new flood table, and without it, it will increase the overall size of the flood tables dramatically. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:40 -05:00
Nogah Frankel	69be01f374	mlxsw: spectrum: Make port flood update more generic Currently, there is a per port flood update function only for the UC table. Make the function more generic by changing the table type to be an input. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:39 -05:00
Nogah Frankel	eaa7df3c5a	mlxsw: spectrum: Break flood set func to be per table Currently, the flood set function can't operate on only one table, but sets both uc_flood and mb_flood together. This patch creates a function that sets the flood state per table. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:39 -05:00
Ido Schimmel	599cf8f95f	mlxsw: spectrum_router: Add support for route replace Upon the reception of an ENTRY_REPLACE notification, resolve the FIB node corresponding to the prefix and length and insert the new route before the first matching entry. Since the notification also signals the deletion of the replaced route, delete it from the driver's cache. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:32:14 -05:00
Ido Schimmel	4283bce5f8	mlxsw: spectrum_router: Add support for route append When a new route is appended, it's placed after existing routes sharing the same parameters (prefix, length, table ID, TOS and priority). While the device supports only one route with the same prefix and length in a single table, it's important to correctly place the appended route in the driver's cache, as when a route is deleted the next one is programmed into the device. Following the reception of an ENTRY_APPEND notification, resolve the FIB node corresponding to the prefix and length and correctly place the new entry in its entry list. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:32:13 -05:00
Ido Schimmel	9aecce1c7d	mlxsw: spectrum_router: Correctly handle identical routes In the device, routes are indexed in a routing table based on the prefix and its length. This is in contrast to the kernel's FIB where several FIB aliases can exist with these parameters being identical. In such cases, the routes will be sorted by table ID (LOCAL first, then MAIN), TOS and finally priority (metric). During lookup, these routes will be evaluated in order. In case the packet's TOS field is non-zero and a FIB alias with a matching TOS is found, then it's selected. Otherwise, the lookup defaults to the route with TOS 0 (if it exists). However, if the requested scope is narrower than the one found, then the lookup continues. To best reflect the kernel's datapath we should take the above into account. Given a prefix and its length, the reflected route will always be the first one in the FIB alias list. However, if the route has a non-zero TOS then its action will be converted to trap instead of forward, since we currently don't support TOS-based routing. If this turns out to be a real issue, we can add support for that using policy-based switching. The route's scope can be effectively ignored as any packet being routed by the device would've been looked-up using the widest scope (UNIVERSE). To achieve that we need to do two changes. Firstly, we need to create another struct (FIB node) that will hold the list of FIB entries sharing the same prefix and length. This struct will be hashed using these two parameters. Secondly, we need to change the route reflection to match the above logic, so that the first FIB entry in the list will be programmed into the device while the rest will remain in the driver's cache in case of subsequent changes. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:32:13 -05:00
Ido Schimmel	df6dd79be8	mlxsw: spectrum_router: Don't reflect LINKDOWN nexthops The kernel resolves the nexthops for a given route using FIB_LOOKUP_IGNORE_LINKSTATE which means a notification can be sent for a route with one of its nexthops being LINKDOWN. In case IGNORE_ROUTES_WITH_LINKDOWN is set for the nexthop netdev, then we shouldn't reflect the nexthop to the device's table. Once the nexthop netdev's carrier goes up we'll be notified using NH_ADD and reflect it to the device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:43:59 -05:00
Ido Schimmel	9665b74562	mlxsw: spectrum_router: Flush resources when RIF is deleted When the last IP address is removed from a netdev, its RIF is deleted. However, if user didn't first remove neighbours and nexthops using this interface, then they would still be present in the device's tables. Therefore, whenever a RIF is deleted, make sure all the neighbours and nexthops (adjacency entries) using it are removed from the relevant tables as well. The action associated with any route using this RIF would be refreshed, most likely to trap. If the kernel decides to remove the route (f.e., because all the nexthops are now DEAD), then an event would be sent, causing the route to be removed from the device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:19 -05:00
Ido Schimmel	ad178c8eef	mlxsw: spectrum_router: Reflect nexthop status changes When a packet hits a multipath route in the device's routing table, a hash is computed over its headers, which is then used to select the appropriate nexthop from the device's adjacency table. There are situations in which the kernel removes a nexthop from a multipath route (e.g., no carrier) and the device should do the same. Upon the reception of NH_{ADD,DEL} events, add or remove a nexthop from the device's adjacency table and refresh all the routes using the nexthop group. If all the nexthops of a multipath route are invalid, then any packet hitting the route would be trapped to the CPU for forwarding. If all the nexthops are DEAD, then the kernel would remove the route entirely. On the other hand, if all the nexthops are merely LINKDOWN, then the kernel would keep the route and forward any incoming packet using a different route. While the last case might sound like a problem, it's expected that a routing daemon running in user space would remove such a route from the FIB as it's dumped with the DEAD flag set. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:18 -05:00
Ido Schimmel	70ad35067c	mlxsw: spectrum_router: Use trap action only for some route types The device can have one of three actions associated with a route: 1) Remote - packets continue to the adjacency table 2) Local - packets continue to the neighbour table 3) Trap - packets continue to the CPU The first two actions can also trap packets to the CPU, but they do so using a different trap ID, which has a lower traffic class and less allotted bandwidth. We currently use the third action for both RTN_{LOCAL,BROADCAST} routes and RTN_UNICAST routes not pointing to the switch ports. However, packets that merely need to be forwarded by the switch are likely not control packets and can be therefore scheduled towards the CPU using a lower traffic class. Achieve the above by assigning the third action only to local and broadcast routes and have any other route use either of the first two actions, based on whether the route is gatewayed or not. This will also allow us to refresh routes using the local action and have them trap packets when their RIF is no longer valid following a NH_DEL event. One side effect of this patch is that we no longer give special treatment to multipath routes using both switch and non-switch ports towards their nexthops. If at least one of the nexthops can be resolved, then the device will forward the packets instead of trapping them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:18 -05:00
Ido Schimmel	4b41147751	mlxsw: spectrum_router: Determine offload status using generic function The previous patch introduced a generic function to determine whether a route should be offloaded or not. Make use of it here. In the future we're going to add more conditions to this test (e.g., whether TOS is non-zero), so it makes sense to centralize it instead of open coding it in a few places. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:18 -05:00
Ido Schimmel	013b20f953	mlxsw: spectrum_router: More accurately set offload flag We currently set the RTNH_F_OFFLOAD flag for all routes using remote action, but this isn't always correct. If none of the nexthops associated with a gatewayed route can be offloaded into the device, then any packet hitting it would be trapped to the CPU and forwarded by the kernel. Solve this by pushing the setting of the offload flag to after the route was programmed into the device, thereby allowing us to take all the parameters into account. This change will also help us further in the patchset, when we refresh routes following the reception of NH_{ADD,DEL} events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:17 -05:00
Ido Schimmel	a8c9701427	mlxsw: spectrum_router: Refactor nexthop init routine The nexthop init and de-init functions both have symmetric parts concerned with the reflection of the neighbour entry into the device's adjacency table, in case it's used by a gatewayed route. These sections of code also need to be called when a nexthop is marked as valid / invalid following NH_{ADD,DEL} events. Break these out into appropriate functions, so that they could be invoked following the reception of above events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:17 -05:00
Ido Schimmel	c8b030774f	mlxsw: spectrum_router: Remove FIB info from FIB entry struct After the previous changes, the FIB info is embedded in every nexthop group struct, which in turn is embedded in every FIB entry struct. We can therefore safely remove the FIB info from the entry struct. This has the added advantage of making the router-related structs more generic and suitable for use with IPv6 offloads. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:17 -05:00
Ido Schimmel	b8399a1e5a	mlxsw: spectrum_router: Store routes in a more generic way Up until now, the only FIB entries that were associated with a nexthop group were routes to remote networks where all the nexthop devices had a valid router interface (RIF). This is in contrast to the FIB code, where all the routes are associated with a FIB info. The same design choice needs to be applied to the driver's cache. Based on the NH_{ADD,DEL} events which will be added later in the patchset, we need to be able to change the action (forward / trap) associated with all the routes using the nexthop group. However, if we can't link between the nexthop and the routes using it, then the above is impossible. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:17 -05:00
Ido Schimmel	b3e8d1ebad	mlxsw: spectrum_router: Add gateway indication to nexthop group The next patch is going to generalize the way in which we store routes. Instead of attaching a nexthop group only to gatewayed routes, one will be attached to each route, in a similar way to the way the FIB code stores its routes. The above means that any function operating on a nexthop group cannot assume the group represents only gatewayed nexthops. One such function is the one that refreshes a nexthop group and updates the adjacency table following nexthop changes. For a nexthop group that doesn't represent any gateways this function would essentially be a NOP, but it would be useful if it did update the action associated with any route using it. This will allow us to later consolidate code paths when a nexthop changes following NH_{ADD,DEL} events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:16 -05:00
Ido Schimmel	d55409cb28	mlxsw: spectrum_router: Use nexthop's scope to set action type We currently use the scope of the FIB info to distinguish between a direct unicast route and a gatewayed one. However, the kernel is perfectly happy to configure a route with scope UNIVERSE to a directly connected network. Instead, we can rely on the first nexthop's scope to check if the route is gatewayed or not. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:16 -05:00
Ido Schimmel	c53b8e1b5a	mlxsw: spectrum_router: Store nexthops in a hash table Later in the patchset we'll add the NH_{ADD,DEL} events which will let us know when a nexthop is considered to be dead. Based on these events we need to be able to add or remove the nexthop from the device's tables. Therefore, store the private nexthop structs in a hash table and use the kernel's fib_nh struct as the key, so that we'll be able to easily find them when the events are received. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:16 -05:00
Ido Schimmel	e9ad5e7d8d	mlxsw: spectrum_router: Store nexthop groups in a hash table Currently, when we're notified about a new RTN_UNICAST route we perform a lookup on the nexthop group list looking for a group with a matching configuration to that found in the FIB info. This is quite inefficient. Instead, we can simply rely on the kernel to consolidate several FIB configurations into the same FIB info and use the FIB info as the key for our private nexthop group struct. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:15 -05:00
Ido Schimmel	e58be79e2d	mlxsw: spectrum_router: Nullify nexthop's neigh pointer When we invalidate a nexthop we should also invalidate its neighbour entry pointer as it might be destroyed later on. This makes the nexthop de-init function symmetric with its init and also ensures nobody will try to access the neighbour entry. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:15 -05:00
Jiri Pirko	b05d0cfa19	mlxsw: acl: Fix mlxsw_afa_block_commit error path No rollback is needed since the chain is in consistent state and mlxsw_afa_block_destroy() will take care of putting it away. So remove the one we have now which is wrong. Also move the set of 'finished' flag to the beginning of the function, because the block is certainly unusable for future action addition no matter if the function succeeds or not. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `4cda7d8d70` ("mlxsw: core: Introduce flexible actions support") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:13:44 -05:00
Philippe Reynes	40710cf9ad	net: mellanox: switchx2: use new api ethtool_{get\|set}_link_ksettings The ethtool api {get\|set}_settings is deprecated. We move this driver to new api {get\|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 14:35:26 -05:00
David S. Miller	3efa70d78f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The conflict was an interaction between a bug fix in the netvsc driver in 'net' and an optimization of the RX path in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-07 16:29:30 -05:00
Jiri Pirko	9bcdef3288	spectrum: acl_tcam: Fix catchall prio value This fixes an issue reported by smatch: mlxsw_sp_acl_tcam_chunk_create() warn: impossible condition '(priority == (-1)) => (0-u32max == u64max)' Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `22a677661f` ("mlxsw: spectrum: Introduce ACL core with simple TCAM implementation") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-07 14:15:21 -05:00
David S. Miller	501ec18757	mlx5-updates-2017-01-31 This series includes some updates to mlx5 core and ethernet driver. We got one patch from Or to fix some static checker warnings. 2nd patche from Dan came to add the support for 128B cache line in the HCA, which will configures the hardware to use 128B alignment only on systems with 128B cache lines, otherwise it will be kept as the current default of 64B. From me three patches to support no inline copy on TX on ConnectX-5 and later HCAs. Starting with two small infrastructure changes and refactoring patches followed by two patches to add the actual support for both xmit ndo and XDP xmit routines. Last patch is a simple fix to return a mistakenly removed pointer from the SQ structure, which was remove in previous submission of mlx5 4K UAR. Saeed. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJYmQKaAAoJEEg/ir3gV/o+RBMH/RGHNw3yPB2MyWo28V3eabw+ xl/SymiNOUgmq03ULYoc6xJpi9RCya7m/Kyce1M/M1gSz6LXubG2IDw9QsKV8lnc +5rwHCKjop6MdR3khsgqvWqGiKfQN0+QON5MjlPZB3/4u8qFcjauhfXpiX9naMO5 aB/Sm9zRPwRnsEhy2AwPyZqOxe5boZzHqmZxpthIgPMtqbpBYNkTkooljsj/KqXf AO3y/mdGykELPF3lIHTE4X9zixx5s6MrlAYX2uGUrAojs2WVIBsq3iXI/J8X9zs/ lg7to15WoMttR66vRZ120U6tx17OMmoxuAp+bmgZumabi/wDAZGSy5ELbH28WlY= =F+t/ -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2017-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-01-31 This series includes some updates to mlx5 core and ethernet driver. We got one patch from Or to fix some static checker warnings. 2nd patche from Dan came to add the support for 128B cache line in the HCA, which will configures the hardware to use 128B alignment only on systems with 128B cache lines, otherwise it will be kept as the current default of 64B. From me three patches to support no inline copy on TX on ConnectX-5 and later HCAs. Starting with two small infrastructure changes and refactoring patches followed by two patches to add the actual support for both xmit ndo and XDP xmit routines. Last patch is a simple fix to return a mistakenly removed pointer from the SQ structure, which was remove in previous submission of mlx5 4K UAR. Saeed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-07 13:44:08 -05:00
Benjamin Poirier	bd4ce941c8	mlx4: Invoke softirqs after napi_reschedule mlx4 may schedule napi from a workqueue. Afterwards, softirqs are not run in a deterministic time frame and the following message may be logged: NOHZ: local_softirq_pending 08 The problem is the same as what was described in commit `ec13ee8014` ("virtio_net: invoke softirqs after __napi_schedule") and this patch applies the same fix to mlx4. Fixes: `07841f9d94` ("net/mlx4_en: Schedule napi when RX buffers allocation fails") Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-07 12:50:43 -05:00
Arnd Bergmann	8d1fb01df8	mlxsw: add psample dependency for spectrum When PSAMPLE is a loadable module, spectrum must not be built-in: drivers/net/built-in.o: In function `mlxsw_sp_rx_listener_sample_func': spectrum.c:(.text+0xe357e): undefined reference to `psample_sample_packet' This adds a Kconfig dependency to enforce usable configurations. Fixes: `98d0f7b9ac` ("mlxsw: spectrum: Add packet sample offloading support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Yotam Gigi <yotamg@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-07 11:44:12 -05:00
Arnd Bergmann	321fa4ffd9	net/mlx5e: fix another maybe-uninitialized false-positive In commit `abeffce` ("net/mlx5e: Fix a -Wmaybe-uninitialized warning"), I fixed a gcc warning for the ipv4 offload handling. Now we get the same warning for the added ipv6 support: drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:815:40: warning: 'out_dev' may be used uninitialized in this function [-Wmaybe-uninitialized] We can apply the same workaround here as well. Fixes: `ce99f6b97f` ("net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-06 16:35:12 -05:00
Dan Carpenter	73cfb2a2e4	net/mlx4_en: fix a condition There is a "\|\|" vs "\|" typo here so we test 0x1 instead of 0x6. Fixes: `1f8176f735` ("net/mlx4_en: Check the enabling pptx/pprx flags in SET_PORT wrapper flow") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-06 12:01:06 -05:00
Ido Schimmel	fd76d9105b	mlxsw: spectrum_router: Fix typo in comment Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-06 11:25:57 -05:00
Ido Schimmel	01b1aa359d	mlxsw: spectrum_router: Don't read 'nud_state' without lock We periodically ask the neighbouring system to try and resolve neighbours that are used for nexthops, but aren't currently resolved. However, 'nud_state' is protected by the neighbour lock, so we shouldn't access it without taking it. Instead, we can simply check the 'connected' field of the neighbour entry, which we update upon NEIGH_UPDATE events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-06 11:25:57 -05:00
Ido Schimmel	8a0b727526	mlxsw: spectrum_router: Remove redundant check We only add neighbour entries that are also used for nexthops to 'nexthop_neighs_list', so when iterating over this list there's no need to check that the entry is indeed used for nexthops. Remove the redundant check. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-06 11:25:57 -05:00
Ido Schimmel	5c8802f14a	mlxsw: spectrum_router: Simplify neighbour reflection Up until now we had two interfaces for neighbour related configuration: ndo_neigh_{construct,destroy} and NEIGH_UPDATE netevents. The ndos were used to add and remove neighbours from the driver's cache, whereas the netevent was used to reflect the neighbours into the device's tables. However, if the NUD state of a neighbour isn't NUD_VALID or if the neighbour is dead, then there's really no reason for us to keep it inside our cache. The only exception to this rule are neighbours that are also used for nexthops, which we periodically refresh to get them resolved. We can therefore eliminate the ndo entry point into the driver and simplify the code, making it similar to the FIB reflection, which is based solely on events. This also helps us avoid a locking issue, in which the RIF cache was traversed without proper locking during insertion into the neigh entry cache. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-06 11:25:56 -05:00
Ido Schimmel	de04b6a358	mlxsw: spectrum_router: Remove unused variable Since commit `33b1341cd1` ("mlxsw: spectrum_router: Fix handling of neighbour structure") we no longer use destination IP for neighbour lookup, so remove it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-06 11:25:56 -05:00
Ido Schimmel	e60234ddb5	mlxsw: spectrum_router: Use ordered workqueue for neigh updates We currently associate each neighbour entry with a work item, so it's not possible to have multiple events queued for the same neighbour entry. However, this is about to be changed so that the neighbour entry is only resolved when the work item is scheduled. The above can result in a mismatch between the kernel's and the device's neighbour table, unless the associated work items are processed in the order in which they were submitted. Do that by migrating the NEIGH_UPDATE work items to be processed in the ordered workqueue which was recently introduced in mlxsw in commit `a3832b3189` ("mlxsw: core: Create an ordered workqueue for FIB offload"). Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-06 11:25:56 -05:00
Ido Schimmel	a0e4761d9b	mlxsw: core: Queue work immediately instead of delaying it We always use zero delay before queueing a work on the ordered workqueue ('mlxsw_owq'), so use work_struct directly instead of delayable work. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-06 11:25:55 -05:00
Saeed Mahameed	8ca967ab67	net/mlx5e: Bring back bfreg uar map dedicated pointer 4K Uar series modified the mlx5e driver to use the new bfreg API, and mistakenly removed the sq->uar_map iomem data path dedicated pointer, which was meant to be read from xmit path for cache locality utilization. Fix that by returning that pointer to the SQ struct. Fixes: 7309cb4ad71e ("IB/mlx5: Support 4k UAR for libmlx5") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-02-06 18:20:18 +02:00
Saeed Mahameed	b70149dd7d	net/mlx5e: XDP Tx, no inline copy on ConnectX-5 ConnectX-5 and later HW generations will report min inline mode == MLX5_INLINE_MODE_NONE, which means driver is not required to copy packet headers to inline fields of TX WQE. Avoid copy to inline segment in XDP TX routine when HW inline mode doesn't require it. This will improve CPU utilization and boost XDP TX performance. Tested with xdp2 single flow: CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz HCA: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] Before: 7.4Mpps After: 7.8Mpps Improvement: 5% Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-02-06 18:20:18 +02:00
Saeed Mahameed	a6f402e499	net/mlx5e: Tx, no inline copy on ConnectX-5 ConnectX-5 and later HW generations will report min inline mode == MLX5_INLINE_MODE_NONE, which means driver is not required to copy packet headers to inline fields of TX WQE. When inline is not required, vlan insertion will be handled in the TX descriptor rather than copy to inline. For LSO case driver is still required to copy headers, for the HW to duplicate on wire. This will improve CPU utilization and boost TX performance. Tested with pktgen burst single flow: CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz HCA: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] Before: 15.1Mpps After: 17.2Mpps Improvement: 14% Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-02-06 18:20:17 +02:00
Saeed Mahameed	2b31f7ae5f	net/mlx5: TX WQE update Add new TX WQE fields for Connect-X5 vlan insertion support, type and vlan_tci, when type = MLX5_ETH_WQE_INSERT_VLAN the HW will insert the vlan and prio fields (vlan_tci) to the packet. Those bits and the inline header fields are mutually exclusive, and valid only when: MLX5_CAP_ETH(mdev, wqe_inline_mode) == MLX5_CAP_INLINE_MODE_NOT_REQUIRED and MLX5_CAP_ETH(mdev, wqe_vlan_insert), who will be set in ConnectX-5 and later HW generations. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>	2017-02-06 18:20:16 +02:00
Daniel Jurgens	f32f5bd2eb	net/mlx5: Configure cache line size for start and end padding There is a hardware feature that will pad the start or end of a DMA to be cache line aligned to avoid RMWs on the last cache line. The default cache line size setting for this feature is 64B. This change configures the hardware to use 128B alignment on systems with 128B cache lines. In addition we lower bound MPWRQ stride by HCA cacheline in mlx5e, MPWRQ stride should be at least the HCA cacheline, the current default is 64B and in case HCA_CAP.cach_line_128byte capability is set, MPWRQ RX stride will automatically be aligned to 128B. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2017-02-06 18:17:25 +02:00
Elad Raz	e158e5ef24	mlxsw: reg: Fix HTGT register length HTGT register length is limited to 32 bytes and not 256 bytes. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-06 11:07:21 -05:00
Jiri Pirko	7aa0f5aa90	mlxsw: spectrum: Implement TC flower offload Extend the existing setup_tc ndo call and allow to offload cls_flower rules. Only limited set of dissector keys and actions are supported now. Use previously introduced ACL infrastructure to offload cls_flower rules to be processed in the HW. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:43 -05:00
Jiri Pirko	22a677661f	mlxsw: spectrum: Introduce ACL core with simple TCAM implementation Add ACL core infrastructure for Spectrum ASIC. This infra provides an abstraction layer over specific HW implementations. There are two basic objects used. One is "rule" and the second is "ruleset" which serves as a container of multiple rules. In general, within one ruleset the rules are allowed to have multiple priorities and masks. Each ruleset is bound to either ingress or egress a of port netdevice. The initial TCAM implementation is very simple and limited. It utilizes parman lsort manager to take care of TCAM region layout. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:43 -05:00
Jiri Pirko	8708ecf01d	mlxsw: resources: Add ACL related resources Add couple of resource limits related to ACL. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:42 -05:00
Jiri Pirko	b876b9aaad	mlxsw: spectrum: Introduce basic set of flexible key blocks Introduce basic set of Spectrum flexible key blocks. It contains blocks needed to carry all elements defined so far. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:41 -05:00
Jiri Pirko	4cda7d8d70	mlxsw: core: Introduce flexible actions support Each entry which is matched during ACL lookup points to an action set. This action set contains up to three separate actions. If more actions are needed to be chained, the extended set is created to hold them in KVD linear area. This patch implements handling of sets and encoding of actions. Currectly, only two actions are supported. Drop and forward. Forward action uses PBS pointer to KVD linear area, so the action code needs to take care of this as well. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:41 -05:00
Jiri Pirko	3f1a84e696	mlxsw: core: Introduce flexible keys support Hardware supports matching on so called "flexible keys". The idea is to assemble an optimal key to use for matching according to the fields in packet (elements) requested by user. Certain sets of elements are combined into pre-defined blocks. There is a picker to find needed blocks. Keys consist of 1..n blocks. Alongside with that, an initial portion of elements is introduced in order to be able to offload basic cls_flower rules. Picked keys are cached so multiple rules could share them. There is an encode function provided that takes care of encoding key and mask values according to given key. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:41 -05:00
Jiri Pirko	e3426e12fe	mlxsw: reg: Add Policy-Engine Extended Flexible Action Register PEFA register is used for accessing an extended flexible action entry in the central KVD Linear Database. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:40 -05:00
Jiri Pirko	d120649d86	mlxsw: reg: Add Policy-Engine Policy Based Switching Register The PPBS register retrieves and sets Policy Based Switching Table entries. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:40 -05:00
Jiri Pirko	937b682cc0	mlxsw: reg: Add Policy-Engine Rules Copy Register The PRCR register is used for accessing rules within a TCAM region. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:40 -05:00
Jiri Pirko	af7170eee6	mlxsw: reg: Add Policy-Engine Port Binding Table The PPBT is used for configuration of the Port Binding Table. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:39 -05:00
Jiri Pirko	0171cdec03	mlxsw: reg: Add Policy-Engine TCAM Entry Register Version 2 The PTCE-V2 register is used for accessing rules within a TCAM region. It is a new version of PTCE in order to support wider key, mask and action within a TCAM region. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:39 -05:00
Jiri Pirko	d9c2661e1c	mlxsw: reg: Add Policy-Engine TCAM Allocation Register The PTAR register is used for allocation of regions in the TCAM. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:39 -05:00
Jiri Pirko	10fabef513	mlxsw: reg: Add Policy-Engine ACL Group Table register The PAGT register is used for configuration of the ACL Group Table. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:38 -05:00
Jiri Pirko	3279da4c88	mlxsw: reg: Add Policy-Engine ACL Register The PACL register is used for configuration of the ACL. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:38 -05:00
Jiri Pirko	d5e556c6a1	mlxsw: item: Add helpers for getting pointer into payload for char buffer item Sometimes it is handy to get a pointer to a char buffer item and use it direcly to write/read data. So add these helpers. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:38 -05:00
Jiri Pirko	2946fde9fd	mlxsw: item: Add 8bit item helpers Item heplers for 8bit values are needed, let's add them. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-03 16:35:37 -05:00
Martin KaFai Lau	770f82253d	mlx4: xdp_prog becomes inactive after ethtool '-L' or '-G' After calling mlx4_en_try_alloc_resources (e.g. by changing the number of rx-queues with ethtool -L), the existing xdp_prog becomes inactive. The bug is that the xdp_prog ptr has not been carried over from the old rx-queues to the new rx-queues Fixes: `47a38e1550` ("net/mlx4_en: add support for fast rx drop bpf program") Cc: Brenden Blanco <bblanco@plumgrid.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-02 21:27:05 -05:00
Martin KaFai Lau	f32b20e89e	mlx4: Fix memory leak after mlx4_en_update_priv() In mlx4_en_update_priv(), dst->tx_ring[t] and dst->tx_cq[t] are over-written by src->tx_ring[t] and src->tx_cq[t] without first calling kfree. One of the reproducible code paths is by doing 'ethtool -L'. The fix is to do the kfree in mlx4_en_free_resources(). Here is the kmemleak report: unreferenced object 0xffff880841211800 (size 2048): comm "ethtool", pid 3096, jiffies 4294716940 (age 528.353s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81930718>] kmemleak_alloc+0x28/0x50 [<ffffffff8120b213>] kmem_cache_alloc_trace+0x103/0x260 [<ffffffff8170e0a8>] mlx4_en_try_alloc_resources+0x118/0x1a0 [<ffffffff817065a9>] mlx4_en_set_ringparam+0x169/0x210 [<ffffffff818040c5>] dev_ethtool+0xae5/0x2190 [<ffffffff8181b898>] dev_ioctl+0x168/0x6f0 [<ffffffff817d7a72>] sock_do_ioctl+0x42/0x50 [<ffffffff817d819b>] sock_ioctl+0x21b/0x2d0 [<ffffffff81247a73>] do_vfs_ioctl+0x93/0x6a0 [<ffffffff812480f9>] SyS_ioctl+0x79/0x90 [<ffffffff8193d7ea>] entry_SYSCALL_64_fastpath+0x18/0xad [<ffffffffffffffff>] 0xffffffffffffffff unreferenced object 0xffff880841213000 (size 2048): comm "ethtool", pid 3096, jiffies 4294716940 (age 528.353s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81930718>] kmemleak_alloc+0x28/0x50 [<ffffffff8120b213>] kmem_cache_alloc_trace+0x103/0x260 [<ffffffff8170e0cb>] mlx4_en_try_alloc_resources+0x13b/0x1a0 [<ffffffff817065a9>] mlx4_en_set_ringparam+0x169/0x210 [<ffffffff818040c5>] dev_ethtool+0xae5/0x2190 [<ffffffff8181b898>] dev_ioctl+0x168/0x6f0 [<ffffffff817d7a72>] sock_do_ioctl+0x42/0x50 [<ffffffff817d819b>] sock_ioctl+0x21b/0x2d0 [<ffffffff81247a73>] do_vfs_ioctl+0x93/0x6a0 [<ffffffff812480f9>] SyS_ioctl+0x79/0x90 [<ffffffff8193d7ea>] entry_SYSCALL_64_fastpath+0x18/0xad [<ffffffffffffffff>] 0xffffffffffffffff (gdb) list mlx4_en_try_alloc_resources+0x118 0xffffffff8170e0a8 is in mlx4_en_try_alloc_resources (drivers/net/ethernet/mellanox/mlx4/en_netdev.c:2145). 2140 if (!dst->tx_ring_num[t]) 2141 continue; 2142 2143 dst->tx_ring[t] = kzalloc(sizeof(struct mlx4_en_tx_ring ) * 2144 MAX_TX_RINGS, GFP_KERNEL); 2145 if (!dst->tx_ring[t]) 2146 goto err_free_tx; 2147 2148 dst->tx_cq[t] = kzalloc(sizeof(struct mlx4_en_cq ) 2149 MAX_TX_RINGS, GFP_KERNEL); (gdb) list mlx4_en_try_alloc_resources+0x13b 0xffffffff8170e0cb is in mlx4_en_try_alloc_resources (drivers/net/ethernet/mellanox/mlx4/en_netdev.c:2150). 2145 if (!dst->tx_ring[t]) 2146 goto err_free_tx; 2147 2148 dst->tx_cq[t] = kzalloc(sizeof(struct mlx4_en_cq ) * 2149 MAX_TX_RINGS, GFP_KERNEL); 2150 if (!dst->tx_cq[t]) { 2151 kfree(dst->tx_ring[t]); 2152 goto err_free_tx; 2153 } 2154 } Fixes: `ec25bc04ed` ("net/mlx4_en: Add resilience in low memory systems") Cc: Eugenia Emantayev <eugenia@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-02 21:27:05 -05:00

... 3 4 5 6 7 ...

2702 Commits