Commit Graph

6615 Commits

Author SHA1 Message Date
Vu Pham
9728366f53 net/mlx5e: Use change upper event to setup representors' bond_metadata
Use change upper event to detect slave representor from
enslaving/unslaving to/from lag device.

On enslaving event, call mlx5_enslave_rep() API to create, add
this slave representor shadow entry to the slaves list of
bond_metadata structure representing master lag device and use
its metadata to setup ingress acl metadata header.

On unslaving event, resetting the vport of unslaved representor
to use its default ingress/egress acls and rx rules with its
default_metadata.

The last slave will free the shared bond_metadata and its
unique metadata.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:50 -07:00
Vu Pham
88e96e533c net/mlx5e: Slave representors sharing unique metadata for match
Bonded slave representors' vports must share a unique metadata
for match.

On enslaving event of slave representor to lag device, allocate
new unique "bond_metadata" for match if this is the first slave.
The subsequent enslaved representors will share the same unique
"bond_metadata".

On unslaving event of slave representor, reset the slave
representor's vport to use its own default metadata.

Replace ingress acl and rx rules of the slave representors' vports
using new vport->bond_metadata.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:49 -07:00
Vu Pham
133dcfc577 net/mlx5: E-Switch, Alloc and free unique metadata for match
Introduce infrastructure to create unique metadata for match
for vport without depending on vport_num. Vport uses its
default metadata for match in standalone configuration but
will share a different unique "bond_metadata" for match with
other vports in bond configuration.

Using ida to generate unique metadata for match for vports
in default and bond configurations.

Introduce APIs to generate, free metadata for match.
Introduce APIs to set vport's bond_metadata and replace its
ingress acl rules with bond_metatada.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:49 -07:00
Vu Pham
d97555e145 net/mlx5e: Add bond_metadata and its slave entries
Adding bond_metadata and its slave entries to represent a lag device
and its slaves VF representors. Bond_metadata structure includes a
unique metadata shared by slaves VF respresentors, and a list of slaves
representors slave entries.

On enslaving event, create a bond_metadata structure representing
the upper lag device of this slave representor if it has not been
created yet. Create and add entry for the slave representor to the
slaves list.

On unslaving event, free the slave entry of the slave representor.
On the last unslave event, free the bond_metadata structure and its
resources.

Introduce APIs to create and remove bond_metadata and its resources,
enslave and unslave VF representor slave entries.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:49 -07:00
Or Gerlitz
d34eb2fcd0 net/mlx5e: Offload flow rules to active lower representor
When a bond device is created over one or more non uplink representors,
and when a flow rule is offloaded to such bond device, offload a rule
to the active lower device.

Assuming that this is active-backup lag, the rules should be offloaded
to the active lower device which is the representor of the direct
path (not the failover).

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:48 -07:00
Vu Pham
553f932838 net/mlx5e: Support tc block sharing for representors
Currently offloading a rule over a tc block shared by multiple
representors fails because an e-switch global hashtable to keep
the mapping from tc cookies to mlx5e flow instances is used, and
tc block sharing offloads the same rule/cookie multiple times,
each time for different representor sharing the tc block.

Changing the implementation and behavior by acknowledging and returning
success if the same rule/cookie is offloaded again to other slave
representor sharing the tc block by setting, checking and comparing
the netdev that added the rule first.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:48 -07:00
Or Gerlitz
7e51891a23 net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule
Register a notifier block to handle netdev events for bond device
of non-uplink representors to support eswitch vports bonding.

When a non-uplink representor is a lower dev (slave) of bond and
becomes active, adding egress acl forward-to-vport rule of all slave
netdevs (active + standby) to forward to this representor's vport. Use
change lower netdev event to do this.

Use change upper event to detect slave representor unslaved from lag
device to delete its vport egress acl forward rule if any.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:47 -07:00
Vu Pham
bf773dc0e6 net/mlx5: E-Switch, Introduce APIs to enable egress acl forward-to-vport rule
By default, e-switch vport's egress acl just forward packets to its
counterpart NIC vport using existing egress acl table.

During port failover in bonding scenario where two VFs representors
are bonded, the egress acl forward-to-vport rule will be added to
the existing egress acl table of e-switch vport of passive/inactive
slave representor to forward packets to other NIC vport ie. the active
slave representor's NIC vport to handle egress "failover" traffic.

Enable egress acl and have APIs to create and destroy egress acl
forward-to-vport rule and group.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:47 -07:00
Vu Pham
07bab95026 net/mlx5: E-Switch, Refactor eswitch ingress acl codes
Restructure the eswitch ingress acl codes into eswitch directory
and different files:
. Acl ingress helper functions to acl_helper.c/h
. Acl ingress functions used in offloads mode to acl_ingress_ofld.c
. Acl ingress functions used in legacy mode to acl_ingress_lgy.c

This patch does not change any functionality.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
2020-05-27 18:13:47 -07:00
Vu Pham
ea651a86d4 net/mlx5: E-Switch, Refactor eswitch egress acl codes
Refactor the egress acl codes so that offloads and legacy modes
can configure specifically their own needs of egress acl table,
groups and rules. While at it, restructure the eswitch egress
acl codes into eswitch directory and different files:
. Acl egress helper functions to acl_helper.c/h
. Acl egress functions used in offloads mode to acl_egress_ofld.c
. Acl egress functions used in legacy mode to acl_egress_lgy.c

This patch does not change any functionality.

Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-27 18:13:46 -07:00
Jason Gunthorpe
e4fdf7625b Merge branch 'mellanox/mlx5-next' into rdma.git for/next
From the mlx5-next branch at
  git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Required for dependencies in following patches

* branch 'mellanox/mlx5-next':
  net/mlx5: Add ability to read and write ECE options
  net/mlx5: Add support for RDMA TX FT headers modifying
  net/mlx5: Move iseg access helper routines close to mlx5_core driver
  net/mlx5: Cleanup mlx5_ifc_fte_match_set_misc2_bits

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 16:01:17 -03:00
Colin Ian King
7cf4eda481 mlxsw: spectrum_router: remove redundant initialization of pointer br_dev
The pointer br_dev is being initialized with a value that is never read
and is being updated with a new value later on. The initialization
is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 11:23:27 -07:00
Ido Schimmel
10d3757fcb mlxsw: spectrum_router: Allow programming link-local prefix routes
The device has a trap for IPv6 packets that need be routed and have a
unicast link-local destination IP (i.e., fe80::/10). This allows mlxsw
to ignore link-local routes, as the packets will be trapped to the CPU
in any case.

However, since link-local routes are not programmed, it is possible for
routed packets to hit the default route which might also be programmed
to trap packets. This means that packets with a link-local destination
IP might be trapped for the wrong reason.

To overcome this, allow programming link-local prefix routes (usually
one fe80::/64 per-table), so that the packets will be forwarded until
reaching the link-local trap.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
9785b92b44 mlxsw: spectrum: Add packet traps for BFD packets
Bidirectional Forwarding Detection (BFD) provides "low-overhead,
short-duration detection of failures in the path between adjacent
forwarding engines" (RFC 5880).

This is accomplished by exchanging BFD packets between the two
forwarding engines. Up until now these packets were trapped via the
general local delivery (i.e., IP2ME) trap which also traps a lot of
other packets that are not as time-sensitive as BFD packets.

Expose dedicated traps for BFD packets so that user space could
configure a dedicated policer for them.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
dacc4e3acf mlxsw: spectrum: Treat IPv6 link-local SIP as an exception
IPv6 packets that need to be forwarded and have a link-local source IP are
dropped by the kernel and an ICMPv6 "Destination unreachable" is sent to
the sending host.

As such, change the trap group of such packets so that they do not
interfere with IPv6 management packets. In the future this trap will be
exposed as an exception via devlink-trap.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
1260e083d4 mlxsw: spectrum: Share one group for all locally delivered packets
Routed IP packets with the Router Alert option need to be trapped to
the CPU as they might need to be locally delivered to raw sockets with
the IP_ROUTER_ALERT / IPV6_ROUTER_ALERT socket option.

Move them to the same group with other packets that might need to be
trapped following route lookup.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
500769bebe mlxsw: reg: Move all trap groups under the same enum
After the previous patch the split is no longer necessary and all the
trap groups can be moved under the same enum.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
b87bde80da mlxsw: spectrum_trap: Do not hard code "thin" policer identifier
As explained in commit e612523041 ("mlxsw: spectrum_trap: Introduce
dummy group with thin policer"), the purpose of the "thin" policer is to
pass as less packets as possible to the CPU.

The identifier of this policer is currently set according to the maximum
number of used trap groups, but this is fragile: On Spectrum-1 the
maximum number of policers is less than the maximum number of trap
groups, which might result in an invalid policer identifier in case the
number of used trap groups grows beyond the policer limit.

Solve this by dynamically allocating the policer identifier.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
03cb0ce0dd mlxsw: switchx2: Move SwitchX-2 trap groups out of main enum
The number of Spectrum trap groups is not infinite, but two identifiers
are occupied by SwitchX-2 specific trap groups. Free these identifiers
by moving them out of the main enum.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
025b7de7f4 mlxsw: spectrum: Reduce priority of locally delivered packets
To align with recent recommended values. Will be configurable by future
patches.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:58 -07:00
Ido Schimmel
1e3cd58942 mlxsw: spectrum: Use same trap group for local routes and link-local destination
Packets with an IPv6 link-local destination (i.e., fe80::/10) should not
be forwarded and are therefore trapped to the CPU for local delivery.
Since these packets are trapped for the same logical reason as packets
hitting local routes, associate both traps with the same group.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:57 -07:00
Ido Schimmel
d322309d72 mlxsw: spectrum: Use separate trap group for FID miss
When a packet enters the device it is classified to a filtering
identifier (FID) based on the ingress port and VLAN. The FID miss trap
is used to trap packets for which a FID could not be found.

In mlxsw this trap should only be triggered when a port is enslaved to
an OVS bridge and a matching ACL rule could not be found, so as to
trigger learning.

These packets are therefore completely unrelated to packets hitting
local routes and should be in a different group. Move them.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:57 -07:00
Ido Schimmel
954eef2677 mlxsw: spectrum: Use same trap group for various IPv6 packets
Group these various IPv6 packets (e.g., router solicitations, router
advertisement) together and subject them to the same policer.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:57 -07:00
Ido Schimmel
412df3d1bb mlxsw: spectrum: Rename IPv6 ND trap group
The IPv6 Neighbour Discovery (ND) group will be used for various IPv6
packets, not all of which fall under the definition of ND, so rename it
to "IPV6" which is more appropriate.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:57 -07:00
Ido Schimmel
761bc42fbe mlxsw: spectrum: Use same switch case for identical groups
Trap groups that use the same policer settings can share the same switch
case.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:57 -07:00
Ido Schimmel
3c2d8a046a mlxsw: spectrum: Use dedicated trap group for ACL trap
Packets that are trapped via tc's trap action are currently subject to
the same policer as packets hitting local routes. The latter are
critical to the correct functioning of the control plane, while the
former are mainly used for traffic inspection.

Split the ACL trap to a separate group with its own policer. Use a
higher priority for these traps than for traps using mirror action
(e.g., ARP, IGMP). Otherwise, packets matching both traps will not be
forwarded in hardware (because of trap action) and also not forwarded in
software because they will be marked with 'offload_fwd_mark'.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 20:33:57 -07:00
Guillaume Nault
58cff782cc flow_dissector: Parse multiple MPLS Label Stack Entries
The current MPLS dissector only parses the first MPLS Label Stack
Entry (second LSE can be parsed too, but only to set a key_id).

This patch adds the possibility to parse several LSEs by making
__skb_flow_dissect_mpls() return FLOW_DISSECT_RET_PROTO_AGAIN as long
as the Bottom Of Stack bit hasn't been seen, up to a maximum of
FLOW_DIS_MPLS_MAX entries.

FLOW_DIS_MPLS_MAX is arbitrarily set to 7. This should be enough for
many practical purposes, without wasting too much space.

To record the parsed values, flow_dissector_key_mpls is modified to
store an array of stack entries, instead of just the values of the
first one. A bit field, "used_lses", is also added to keep track of
the LSEs that have been set. The objective is to avoid defining a
new FLOW_DISSECTOR_KEY_MPLS_XX for each level of the MPLS stack.

TC flower is adapted for the new struct flow_dissector_key_mpls layout.
Matching on several MPLS Label Stack Entries will be added in the next
patch.

The NFP and MLX5 drivers are also adapted: nfp_flower_compile_mac() and
mlx5's parse_tunnel() now verify that the rule only uses the first LSE
and fail if it doesn't.

Finally, the behaviour of the FLOW_DISSECTOR_KEY_MPLS_ENTROPY key is
slightly modified. Instead of recording the first Entropy Label, it
now records the last one. This shouldn't have any consequences since
there doesn't seem to have any user of FLOW_DISSECTOR_KEY_MPLS_ENTROPY
in the tree. We'd probably better do a hash of all parsed MPLS labels
instead (excluding reserved labels) anyway. That'd give better entropy
and would probably also simplify the code. But that's not the purpose
of this patch, so I'm keeping that as a future possible improvement.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-26 15:22:58 -07:00
Ido Schimmel
154388e112 mlxsw: spectrum: Fix spelling mistake in trap's name
Fix incorrect spelling of "advertisement".

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
ce3c3bf0bf mlxsw: spectrum: Use dedicated trap group for sampled packets
The rate with which packets are sampled is determined by user space, so
there is no need to associate such packets with a policer.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
b33f5d9fb7 mlxsw: spectrum: Use same trap group for IPv6 ND and ARP packets
Both packet types are needed for the same reason (neighbour discovery),
so associate them with the same trap group.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
32446438cc mlxsw: spectrum: Rename ARP trap group
The ARP trap group will be used for IPv6 ND traps in the next patch, so
rename it to "NEIGH_DISCOVERY" which is more appropriate.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
d88f8cc158 mlxsw: spectrum_trap: Remove unnecessary field
Now that traffic class (TC) and priority are set to the same value,
there is no need to store both. Remove the first.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
5047d819f5 mlxsw: spectrum: Align TC and trap priority
The traffic class (TC) attribute of packet traps determines through which
TC a packet trap will be scheduled through the CPU port.

The priority attribute determines which trap will be triggered in case
several packet traps match a packet.

We try to configure these attributes to the same value for all packet
traps as there is little reason not to.

Some packet traps did not use the same value, so rectify that now.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
e0d848477a mlxsw: spectrum_buffers: Assign non-zero quotas to TC 0 of the CPU port
As explained in commit 9ffcc3725f ("mlxsw: spectrum: Allow packets to
be trapped from any PG"), incoming packets can be admitted to the shared
buffer and forwarded / trapped, if:

(Ingress{Port}.Usage < Thres && Ingress{Port,PG}.Usage < Thres &&
 Egress{Port}.Usage < Thres && Egress{Port,TC}.Usage < Thres)
||
(Ingress{Port}.Usage < Min || Ingress{Port,PG} < Min ||
 Egress{Port}.Usage < Min || Egress{Port,TC}.Usage < Min)

Trapped packets are scheduled to transmission through the CPU port.
Currently, the minimum and maximum quotas of traffic class (TC) 0 of the
CPU port are 0, which means it is not usable.

Assign non-zero quotas to TC 0 of the CPU port, so that it could be
utilized by subsequent patches.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
938e6d0b76 mlxsw: spectrum: Change default rate and priority of DHCP packets
Reduce the default acceptable rate of DHCP packets to 128 packets per
second and reduce their priority. This is reasonable given the Spectrum
ASICs are limited to 128 ports at the moment.

These are only the default values. Users will be able to modify them via
devlink-trap.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
0ecb947412 mlxsw: spectrum: Trap IPv4 DHCP packets in router
Currently, IPv4 DHCP packets are trapped during L2 forwarding, which
means that packets might be trapped unnecessarily. Instead, only trap
the DHCP packets that reach the router. Either because they were flooded
to the router port or forwarded to it by the FDB. This is consistent
with the corresponding IPv6 trap.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
99129069b7 mlxsw: spectrum: Use same trap group for MLD and IGMP packets
Both packet types are needed for the same reason (multicast snooping),
so associate them with the same trap group.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
Ido Schimmel
debb7af686 mlxsw: spectrum: Rename IGMP trap group
The IGMP trap group will be used for MLD traps in the next patch, so
rename it to "MC_SNOOPING" which is more appropriate.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 19:32:23 -07:00
David S. Miller
13209a8f73 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The MSCC bug fix in 'net' had to be slightly adjusted because the
register accesses are done slightly differently in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-24 13:47:27 -07:00
David S. Miller
e3181e9a72 mlx5-fixes-2020-05-22
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl7IbksACgkQSD+KveBX
 +j5T8Af/XT6b23VlSn2Km4tg8WQNDRJLdq1s6fTS5SGcyc0awxfH07cvYvJ26kKW
 kmdDNijkVbd0ma2UxHiiD3vmE8Vs85gZ6BDNyl485x/cH3zFzAm54R5fZdnK5JgN
 YNgdFP0MOwPtAdDtxLH+r8aOyNKncIOmCZrMNnxVgI+IytG1L5QLnS6GeQy2zyIx
 9F/9sihta2z567IstGu2wvmgviSHVk/zV9yqn/orD9tV6oFvvrBQMlEt8l27b1tA
 4bajbHIyc1WmfQ+wg56eXATdbqCQ2YYfMjhchiCfFv5DhnMnPi5bV0PNR9Rq0CYw
 05xpF16/85uvDbTizsgGNZ1Pb1nGsQ==
 =oFWF
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2020-05-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2020-05-22

This series introduces some fixes to mlx5 driver.

Please pull and let me know if there is any problem.

For -stable v4.13
   ('net/mlx5: Add command entry handling completion')

For -stable v5.2
   ('net/mlx5: Fix error flow in case of function_setup failure')
   ('net/mlx5: Fix memory leak in mlx5_events_init')

For -stable v5.3
   ('net/mlx5e: Update netdev txq on completions during closure')
   ('net/mlx5e: kTLS, Destroy key object after destroying the TIS')
   ('net/mlx5e: Fix inner tirs handling')

For -stable v5.6
   ('net/mlx5: Fix cleaning unmanaged flow tables')
   ('net/mlx5: Fix a race when moving command interface to events mode')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:39:45 -07:00
David S. Miller
46c54f9500 mlx5-updates-2020-05-22
This series includes two updates and one cleanup patch
 
 1) Tang Bim, clean-up with IS_ERR() usage
 
 2) Vlad introduces a new mlx5 kconfig flag for TC support
 
    This is required due to the high volume of current and upcoming
    development in the eswitch and representors areas where some of the
    feature are TC based such as the downstream patches of MPLSoUDP and
    the following representor bonding support for VF live migration and
    uplink representor dynamic loading.
    For this Vlad kept TC specific code in tc.c and rep/tc.c and
    organized non TC code in representors specific files.
 
 3) Eli Cohen adds support for MPLS over UPD encap and decap TC offloads.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl7IZFEACgkQSD+KveBX
 +j7IEQf/RFv633bWTlL63fEJjViRv1rjfkbyaXrGVL3gzr/Er01DeAPR22CNOlC3
 bu1jHLKqVn0Mg0g5g2B4/H/7JoFbMBRTy4MXpM5VrQCIqwMuXG4zhWuoUj7ncQ5w
 kXHAU6DUuZRn8/x1JLQOHDRTzKhav7ldT+nvvoKEMrad/DEMGz+bq67xh4l8nfi+
 ktSFAO0UFi9ysb25CMfdqIqAL0J5nAJ7DNhw5x7IvtwUxNxate7HtBaBhBgZ9NWv
 jYf8R3p+7JdgvVW18pZhmjbaBqaApXcZrC7rI07PR6rCOAHfToX6miR8gUtpIEno
 itQkzYt9UF2dgNwMmxoJLqnUNiy/Cg==
 =wkSR
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2020-05-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-05-22

This series includes two updates and one cleanup patch

1) Tang Bim, clean-up with IS_ERR() usage

2) Vlad introduces a new mlx5 kconfig flag for TC support

   This is required due to the high volume of current and upcoming
   development in the eswitch and representors areas where some of the
   feature are TC based such as the downstream patches of MPLSoUDP and
   the following representor bonding support for VF live migration and
   uplink representor dynamic loading.
   For this Vlad kept TC specific code in tc.c and rep/tc.c and
   organized non TC code in representors specific files.

3) Eli Cohen adds support for MPLS over UPD encap and decap TC offloads.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:37:00 -07:00
Qiushi Wu
febfd9d3c7 net/mlx4_core: fix a memory leak bug.
In function mlx4_opreq_action(), pointer "mailbox" is not released,
when mlx4_cmd_box() return and error, causing a memory leak bug.
Fix this issue by going to "out" label, mlx4_free_cmd_mailbox() can
free this pointer.

Fixes: fe6f700d6c ("net/mlx4_core: Respond to operation request by firmware")
Signed-off-by: Qiushi Wu <wu000273@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:34:37 -07:00
David S. Miller
a152b85984 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-05-23

The following pull-request contains BPF updates for your *net-next* tree.

We've added 50 non-merge commits during the last 8 day(s) which contain
a total of 109 files changed, 2776 insertions(+), 2887 deletions(-).

The main changes are:

1) Add a new AF_XDP buffer allocation API to the core in order to help
   lowering the bar for drivers adopting AF_XDP support. i40e, ice, ixgbe
   as well as mlx5 have been moved over to the new API and also gained a
   small improvement in performance, from Björn Töpel and Magnus Karlsson.

2) Add getpeername()/getsockname() attach types for BPF sock_addr programs
   in order to allow for e.g. reverse translation of load-balancer backend
   to service address/port tuple from a connected peer, from Daniel Borkmann.

3) Improve the BPF verifier is_branch_taken() logic to evaluate pointers
   being non-NULL, e.g. if after an initial test another non-NULL test on
   that pointer follows in a given path, then it can be pruned right away,
   from John Fastabend.

4) Larger rework of BPF sockmap selftests to make output easier to understand
   and to reduce overall runtime as well as adding new BPF kTLS selftests
   that run in combination with sockmap, also from John Fastabend.

5) Batch of misc updates to BPF selftests including fixing up test_align
   to match verifier output again and moving it under test_progs, allowing
   bpf_iter selftest to compile on machines with older vmlinux.h, and
   updating config options for lirc and v6 segment routing helpers, from
   Stanislav Fomichev, Andrii Nakryiko and Alan Maguire.

6) Conversion of BPF tracing samples outdated internal BPF loader to use
   libbpf API instead, from Daniel T. Lee.

7) Follow-up to BPF kernel test infrastructure in order to fix a flake in
   the XDP selftests, from Jesper Dangaard Brouer.

8) Minor improvements to libbpf's internal hashmap implementation, from
   Ian Rogers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 18:30:34 -07:00
Shay Drory
4f7400d5cb net/mlx5: Fix error flow in case of function_setup failure
Currently, if an error occurred during mlx5_function_setup(), we
keep dev->state as DEVICE_STATE_UP.
Fixing it by adding a goto label.

Fixes: e161105e58 ("net/mlx5: Function setup/teardown procedures")
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:58 -07:00
Roi Dayan
d37bd5e81e net/mlx5e: CT: Correctly get flow rule
The correct way is to us the flow_cls_offload_flow_rule() wrapper
instead of f->rule directly.

Fixes: 4c3844d9e9 ("net/mlx5e: CT: Introduce connection tracking")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:56 -07:00
Moshe Shemesh
5e911e2c06 net/mlx5e: Update netdev txq on completions during closure
On sq closure when we free its descriptors, we should also update netdev
txq on completions which would not arrive. Otherwise if we reopen sqs
and attach them back, for example on fw fatal recovery flow, we may get
tx timeout.

Fixes: 29429f3300 ("net/mlx5e: Timeout if SQ doesn't flush during close")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:54 -07:00
Roi Dayan
9ca415399d net/mlx5: Annotate mutex destroy for root ns
Invoke mutex_destroy() to catch any errors.

Fixes: 2cc43b494a ("net/mlx5_core: Managing root flow table")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:52 -07:00
Roi Dayan
6eb7a268a9 net/mlx5: Don't maintain a case of del_sw_func being null
Add del_sw_func cb for root ns. Now there is no need to
maintain a case of del_sw_func being null when freeing the node.

Fixes: 2cc43b494a ("net/mlx5_core: Managing root flow table")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:50 -07:00
Roi Dayan
aee37f3d94 net/mlx5: Fix cleaning unmanaged flow tables
Unmanaged flow tables doesn't have a parent and tree_put_node()
assume there is always a parent if cleaning is needed. fix that.

Fixes: 5281a0c909 ("net/mlx5: fs_core: Introduce unmanaged flow tables")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:48 -07:00
Moshe Shemesh
df14ad1ecc net/mlx5: Fix memory leak in mlx5_events_init
Fix memory leak in mlx5_events_init(), in case
create_single_thread_workqueue() fails, events
struct should be freed.

Fixes: 5d3c537f90 ("net/mlx5: Handle event of power detection in the PCIE slot")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:46 -07:00
Roi Dayan
a16b8e0dcf net/mlx5e: Fix inner tirs handling
In the cited commit inner_tirs argument was added to create and destroy
inner tirs, and no indication was added to mlx5e_modify_tirs_hash()
function. In order to have a consistent handling, use
inner_indir_tir[0].tirn in tirs destroy/modify function as an indication
to whether inner tirs are created.
Inner tirs are not created for representors and before this commit,
a call to mlx5e_modify_tirs_hash() was sending HW commands to
modify non-existent inner tirs.

Fixes: 46dc933cee ("net/mlx5e: Provide explicit directive if to create inner indirect tirs")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:44 -07:00
Tariq Toukan
16736e11f4 net/mlx5e: kTLS, Destroy key object after destroying the TIS
The TLS TIS object contains the dek/key ID.
By destroying the key first, the TIS would contain an invalid
non-existing key ID.
Reverse the destroy order, this also acheives the desired assymetry
between the destroy and the create flows.

Fixes: d2ead1f360 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:42 -07:00
Maor Dickman
321348475d net/mlx5e: Fix allowed tc redirect merged eswitch offload cases
After changing the parent_id to be the same for both NICs of same
The cited commit wrongly allow offload of tc redirect flows from
VF to uplink and vice versa when devcies are on different eswitch,
these cases aren't supported by HW.

Disallow the above offloads when devcies are on different eswitch
and VF LAG is not configured.

Fixes: f6dc1264f1 ("net/mlx5e: Disallow tc redirect offload cases we don't support")
Signed-off-by: Maor Dickman <maord@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:40 -07:00
Eran Ben Elisha
f7936ddd35 net/mlx5: Avoid processing commands before cmdif is ready
When driver is reloading during recovery flow, it can't get new commands
till command interface is up again. Otherwise we may get to null pointer
trying to access non initialized command structures.

Add cmdif state to avoid processing commands while cmdif is not ready.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:38 -07:00
Eran Ben Elisha
d43b7007db net/mlx5: Fix a race when moving command interface to events mode
After driver creates (via FW command) an EQ for commands, the driver will
be informed on new commands completion by EQE. However, due to a race in
driver's internal command mode metadata update, some new commands will
still be miss-handled by driver as if we are in polling mode. Such commands
can get two non forced completion, leading to already freed command entry
access.

CREATE_EQ command, that maps EQ to the command queue must be posted to the
command queue while it is empty and no other command should be posted.

Add SW mechanism that once the CREATE_EQ command is about to be executed,
all other commands will return error without being sent to the FW. Allow
sending other commands only after successfully changing the driver's
internal command mode metadata.
We can safely return error to all other commands while creating the command
EQ, as all other commands might be sent from the user/application during
driver load. Application can rerun them later after driver's load was
finished.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:36 -07:00
Moshe Shemesh
17d00e839d net/mlx5: Add command entry handling completion
When FW response to commands is very slow and all command entries in
use are waiting for completion we can have a race where commands can get
timeout before they get out of the queue and handled. Timeout
completion on uninitialized command will cause releasing command's
buffers before accessing it for initialization and then we will get NULL
pointer exception while trying access it. It may also cause releasing
buffers of another command since we may have timeout completion before
even allocating entry index for this command.
Add entry handling completion to avoid this race.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 17:28:34 -07:00
Eli Cohen
582234b465 net/mlx5e: Support pedit on mpls over UDP decap
Allow to modify ethernet headers while decapsulating mpls over UDP
packets. This is implemented using the same reformat object used for
decapsulation.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:23 -07:00
Eli Cohen
14e6b038af net/mlx5e: Add support for hw decapsulation of MPLS over UDP
MPLS over UDP is supported in hardware by using a packet reformat object
with reformat type equal L3_TUNNEL_TO_L2 which both decapsulates the
outer L3, L4 and MPLS headers, and allows for setting the L2 headers of
the resulting decapsulated packet. For the hardware to operate
correctly, the configuration of the firmware must have
FLEX_PARSER_PROFILE_ENABLE = 1.

Example tc rule:
  tc filter add dev bareudp0 protocol all prio 1 root flower enc_dst_port \
      6635 enc_src_ip 8.8.8.23 action mpls pop protocol ip pipe \
      action pedit ex munge eth dst set 00:11:22:33:44:21 pipe action \
      mirred egress redirect dev enp59s0f0_0

We use pedit to set the correct destination MAC.

For MPLS over UDP decapsulation to take place, the driver logic requires
the following:

1. flower filter added on bareudp device.
2. action mpls pop
3. zero or more pedit munge actions
4. one redirect action

Current implementation supports only IPv4 and no VLAN.

tc filter show output looks like this:
   filter protocol all pref 1 flower chain 0
   filter protocol all pref 1 flower chain 0 handle 0x1
     enc_src_ip 8.8.8.24
     enc_dst_port 6635
     in_hw in_hw_count 1
            action order 1: mpls  pop protocol ip pipe
             index 2 ref 1 bind 1

            action order 2:  pedit action pipe keys 2
             index 1 ref 1 bind 1
             key #0  at eth+0: val 00112233 mask 00000000
             key #1  at eth+4: val 44210000 mask 0000ffff

            action order 3: mirred (Egress Redirect to device enp59s0f0_0) stolen
            index 2 ref 1 bind 1

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:21 -07:00
Eli Cohen
72046a91d1 net/mlx5e: Allow to match on mpls parameters
Support matching on MPLS over UDP parameters using misc2 section of
match parameters.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:19 -07:00
Eli Cohen
f828ca6a2f net/mlx5e: Add support for hw encapsulation of MPLS over UDP
MPLS over UDP is supported by adding a rule on a representor net device
which does tunnel_key set, push mpls and forward to a baredup device. At
the hardware level we use a packet_reformat_context object to do the
encapsulation of the packet.

The resulting packet looks as follows (left side transmitted first):
outer L2 | outer IP | UDP | MPLS | inner L3 and data |

Example usage:
  tc filter add dev $rep0 protocol ip prio 1 root flower skip_sw  \
     action tunnel_key set src_ip 8.8.8.21 dst_ip 8.8.8.24 id 555 \
     dst_port 6635 tos 4 ttl 6 csum action mpls push protocol 0x8847 \
     label 555 tc 3 action mirred egress redirect dev bareudp0

This is how the filter is shown with tc filter show:
tc filter show dev enp59s0f0_0 ingress
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  skip_sw
  in_hw in_hw_count 1
        action order 1: tunnel_key  set
        src_ip 8.8.8.21
        dst_ip 8.8.8.24
        key_id 555
        dst_port 6635
        csum
        tos 0x4
        ttl 6 pipe
         index 1 ref 1 bind 1

        action order 2: mpls  push protocol mpls_uc label 555 tc 3 ttl 255 pipe
         index 1 ref 1 bind 1

        action order 3: mirred (Egress Redirect to device bareudp0) stolen
        index 1 ref 1 bind 1

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:18 -07:00
Vlad Buslov
d956873f90 net/mlx5e: Introduce kconfig var for TC support
In order to improve code maintainability and readability, introduce new
CONFIG_MLX5_CLS_ACT kconfig variable to control compilation of TC hardware
offloads implementation. This allows distinguishing between features that
require TC support (MPLSoUDP, etc.) and features that just rely on
representor functionality (rep_bond for live migration, etc.).

Modify rep_tc.h, rep_neigh.h, en_tc.h and chains.h files to provide stubs
for functions that are called from generic code.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:14 -07:00
Vlad Buslov
e2394a61d2 net/mlx5e: Move TC-specific code from en_main.c to en_tc.c
As a preparation for introducing new kconfig option that controls
compilation of all TC offloads code in mlx5, extract TC-specific code from
en_main.c to en_tc.c. This allows easily compiling out the code by
only including new source in make file when corresponding kconfig is
enabled instead of adding multiple ifdef blocks to en_main.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:12 -07:00
Vlad Buslov
549c243e4e net/mlx5e: Extract neigh-specific code from en_rep.c to rep/neigh.c
As a preparation for introducing new kconfig option that controls
compilation of all TC offloads code in mlx5, extract neigh-specific code
from en_rep.c to standalone file. This allows easily compiling out the code
by only including new source in make file when corresponding kconfig is
enabled instead of adding multiple ifdef blocks to en_rep.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:10 -07:00
Vlad Buslov
768c3667e6 net/mlx5e: Extract TC-specific code from en_rep.c to rep/tc.c
As a preparation for introducing new kconfig option that controls
compilation of all TC offloads code in mlx5, extract TC-specific code from
en_rep.c to standalone file. This allows easily compiling out the code by
only including new source in make file when corresponding kconfig is
enabled instead of adding multiple ifdef blocks to en_rep.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:08 -07:00
Tang Bin
2639324a8f net/mlx5e: Use IS_ERR() to check and simplify code
Use IS_ERR() and PTR_ERR() instead of PTR_ERR_OR_ZERO() to
simplify code, avoid redundant judgements.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-22 16:46:07 -07:00
Jiri Pirko
4340f42f20 mlxsw: spectrum: Fix use-after-free of split/unsplit/type_set in case reload fails
In case of reload fail, the mlxsw_sp->ports contains a pointer to a
freed memory (either by reload_down() or reload_up() error path).
Fix this by initializing the pointer to NULL and checking it before
dereferencing in split/unsplit/type_set callpaths.

Fixes: 24cc68ad6c ("mlxsw: core: Add support for reload")
Reported-by: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 16:08:14 -07:00
Edward Cree
060b6381ef net: flow_offload: simplify hw stats check handling
Make FLOW_ACTION_HW_STATS_DONT_CARE be all bits, rather than none, so that
 drivers and __flow_action_hw_stats_check can use simple bitwise checks.

Pre-fill all actions with DONT_CARE in flow_rule_alloc(), rather than
 relying on implicit semantics of zero from kzalloc, so that callers which
 don't configure action stats themselves (i.e. netfilter) get the correct
 behaviour by default.

Only the kernel's internal API semantics change; the TC uAPI is unaffected.

v4: move DONT_CARE setting to flow_rule_alloc() for robustness and simplicity.

v3: set DONT_CARE in nft and ct offload.

v2: rebased on net-next, removed RFC tags.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 15:52:08 -07:00
Björn Töpel
39d6443c8d mlx5, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL
Use the new MEM_TYPE_XSK_BUFF_POOL API in lieu of MEM_TYPE_ZERO_COPY in
mlx5e. It allows to drop a lot of code from the driver (which is now
common in AF_XDP core and was related to XSK RX frame allocation, DMA
mapping, etc.) and slightly improve performance (RX +0.8 Mpps, TX +0.4
Mpps).

rfc->v1: Put back the sanity check for XSK params, use XSK API to get
         the total headroom size. (Maxim)

v1->v2: Fix DMA address handling, set XDP metadata to invalid. (Maxim)

v2->v3: Handle frame_sz, use xsk_buff_xdp_get_frame_dma, use xsk_buff
        API for DMA sync on TX, add performance numbers. (Maxim)

v3->v4: Remove unused variable num_xsk_frames. (Jakub)

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-12-bjorn.topel@gmail.com
2020-05-21 17:31:27 -07:00
Magnus Karlsson
a71506a4fd xsk: Move driver interface to xdp_sock_drv.h
Move the AF_XDP zero-copy driver interface to its own include file
called xdp_sock_drv.h. This, hopefully, will make it more clear for
NIC driver implementors to know what functions to use for zero-copy
support.

v4->v5: Fix -Wmissing-prototypes by include header file. (Jakub)

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-4-bjorn.topel@gmail.com
2020-05-21 17:31:26 -07:00
Jason Gunthorpe
eafd47fc20 Linux 5.7-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl7BzV8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGg8EH/A2pXMTxtc96RI4S
 sttEsUQqbakFS0Z/2tQPpMGr/qW2e5eHgsTX/a3SiUeZiIXk6f4lMFkMuctzBf7p
 X77cNEDwGOEdbtCXTsMcmKSde7sP2zCXsPB8xTWLyE6rnaFRgikwwkeqgkIKhp1h
 bvOQV0t9HNGvxGAM0iZeOvQAvFl4vd7nS123/MYbir9cugfQUSJRueQ4BiCiJqVE
 6cNA7/vFzDJuFGszzIrJ7HXn/IdQMMWHkvTDjgBw0GZw1mDbGFbfbZwOeTz1ojCt
 smUQ4tIFxBa/VA5zx7dOy2P2keHbSVf4VLkZRPcceT7OqVS65ETmFDp+qt5NdWM5
 vZ8+7/0=
 =CyYH
 -----END PGP SIGNATURE-----

Merge tag 'v5.7-rc6' into rdma.git for-next

Linux 5.7-rc6

Conflict in drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
resolved by deleting dr_cq_event, matching how netdev resolved it.

Required for dependencies in the following patches.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21 17:08:27 -03:00
Michael Guralnik
ecf814e0e1 net/mlx5: Add support for RDMA TX FT headers modifying
Support adding header modifying actions to the RDMA TX flow table.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-18 09:21:46 -07:00
Parav Pandit
555af0c3fa net/mlx5: Move iseg access helper routines close to mlx5_core driver
Only mlx5_core driver handles fw initialization check and command
interface revision check.
Hence move them inside the mlx5_core driver where it is used.
This avoid exposing these helpers to all mlx5 drivers.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-18 09:21:46 -07:00
Raed Salem
356d411c26 net/mlx5: Cleanup mlx5_ifc_fte_match_set_misc2_bits
Remove the "metadata_reg_b" field and all uses of this field in code
to match the device specification. As this field is not in use in SW
steering it is safe to remove it.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-18 09:21:46 -07:00
Ido Schimmel
200b7cca0b mlxsw: spectrum_trap: Store all trap data in one array
Each trap registered with devlink is mapped to one or more Rx listeners.
These listeners allow the switch driver (e.g., mlxsw_spectrum) to
register a function that is called when a packet is received (trapped)
for a specific reason.

Currently, three arrays are used to describe the mapping between the
logical devlink traps and the Rx listeners.

Instead, get rid of these arrays and store all the information in one
array that is easier to validate and extend with more per-trap
information.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 16:42:31 -07:00
Ido Schimmel
b14a40dbde mlxsw: spectrum_trap: Store all trap group data in one array
Use one array to store all the information about all the trap groups
instead of hard coding it in code. This will be used in future patches
to disable certain functionality (e.g., policer binding) on a trap group
basis.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 16:42:31 -07:00
Ido Schimmel
cc678f4dbc mlxsw: spectrum_trap: Store all trap policer data in one array
Instead of maintaining an array of policers and a linked list, only
maintain an array.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 16:42:31 -07:00
Ido Schimmel
85d4ec5925 mlxsw: spectrum_trap: Move struct definition out of header file
'struct mlxsw_sp_trap_policer_item' is only used in one file, so move it
there.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-16 16:42:31 -07:00
Tariq Toukan
3f3ab178c7 net/mlx5e: Take DCBNL-related definitions into dedicated files
Take DCBNL-related definitions out of the common en.h header,
Use a dedicated header file for exposing them.
Some need not to be exposed, use them locally in the .c file.
Use stubs to eliminate use of CONFIG_MLX5_CORE_EN_DCB in the
generic control flows.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:36 -07:00
Maxim Mikityanskiy
5ffb4d858b net/mlx5e: Calculate SQ stop room in a robust way
Currently, different formulas are used to estimate the space that may be
taken by WQEs in the SQ during a single packet transmit. This space is
called stop room, and it's checked in the end of packet transmit to find
out if the next packet could overflow the SQ. If it could, the driver
tells the kernel to stop sending next packets.

Many factors affect the stop room:

1. Padding with NOPs to avoid WQEs spanning over page boundaries.

2. Enabled and disabled offloads (TLS, upcoming MPWQE).

3. The maximum size of a WQE.

The padding is performed before every WQE if it doesn't fit the current
page.

The current formula assumes that only one padding will be required per
packet, and it doesn't take into account that the WQEs posted during the
transmission of a single packet might exceed the page size in very rare
circumstances. For example, to hit this condition with 4096-byte pages,
TLS offload will have to interrupt an almost-full MPWQE session, be in
the resync flow and try to transmit a near to maximum amount of data.

To avoid SQ overflows in such rare cases after MPWQE is added, this
patch introduces a more robust formula to estimate the stop room. The
new formula uses the fact that a WQE of size X will not require more
than X-1 WQEBBs of padding. More exact estimations are possible, but
they result in much more complex and error-prone code for little gain.

Before this patch, the TLS stop room included space for both INNOVA and
ConnectX TLS offloads that couldn't run at the same time anyway, so this
patch accounts only for the active one.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:34 -07:00
Erez Shitrit
8b46d424a7 net/mlx5e: IPoIB, Drop multicast packets that this interface sent
After enabled loopback packets for IPoIB, we need to drop these packets
that this HCA has replicated and came back to the same interface that
sent them.

Fixes: 4c6c615e3f ("net/mlx5e: IPoIB, Add PKEY child interface nic profile")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:32 -07:00
Erez Shitrit
80639b199c net/mlx5e: IPoIB, Enable loopback packets for IPoIB interfaces
Enable loopback of unicast and multicast traffic for IPoIB enhanced
mode.
This will allow interfaces with the same pkey to communicate between
them e.g cloned interfaces that located in different namespaces.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:30 -07:00
Roi Dayan
9102d836d2 net/mlx5e: CT: Fix offload with CT action after CT NAT action
It could be a chain of rules will do action CT again after CT NAT
Before this fix matching will break as we get into the CT table
after NAT changes and not CT NAT.
Fix this by adding pre ct and pre ct nat tables to skip ct/ct_nat
tables and go straight to post_ct table if ct/nat was already done.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:27 -07:00
Eran Ben Elisha
90bf1c8dbd net/mlx5: Move internal timer read function to clock library
Move mlx5_read_internal_timer() into lib/clock.c file as it is being
used there. As such, make this function a static one.

In addition, rearrange headers include to support function move.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:25 -07:00
Paul Blakey
49c0355d30 net/mlx5: Wait for inactive autogroups
Currently, if one thread tries to add an entry to an autogrouped table
with no free matching group, while another thread is in the process of
creating a new matching autogroup, it doesn't wait for the new group
creation, and creates an unnecessary new autogroup.

Instead of skipping inactive, wait on the write lock of those groups.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:22 -07:00
Parav Pandit
41798df9bf net/mlx5: Drain wq first during PCI device removal
mlx5_unload_one() is done with cleanup = true only once.

So instead of doing health wq drain inside the if(), directly do
during PCI device removal.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:20 -07:00
Parav Pandit
4162f58b47 net/mlx5: Have single error unwinding path
Having multiple error unwinding path are error prone.
Lets have just one error unwinding path.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:17 -07:00
Eran Ben Elisha
e7f860e210 net/mlx5: Fix a bug of releasing wrong chunks on > 4K page size systems
On systems with page size larger than 4K, a fwp object has few 4K chunks.
Fix a bug in fwp free flow where the chunk address was dropped and
fwp->addr was used instead (first chunk address). This caused a wrong
update of fwp->bitmask which later can cause errors in re-alloc fwp
chunk flow.

In order to fix this it, re-factor the release flow:
- Free 4k: Releases a specific 4k chunk inside the fwp, defined by
  starting address.
- Free fwp: Unconditionally release the whole fwp and its resources.
Free addr will call free fwp if all chunks were released, in order to do
code sharing.

In addition, fix npages to count for all released chunks correctly.

Fixes: c6168161f6 ("net/mlx5: Add support for release all pages event")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:15 -07:00
Eran Ben Elisha
2726cd4a29 net/mlx5: Dedicate fw page to the requesting function
The cited patch assumes that all chuncks in a fw page belong to the same
function, thus the driver must dedicate fw page to the requesting
function, which is actually what was intedned in the original fw pages
allocator design, hence the fwp->func_id !

Up until the cited patch everything worked ok, but now "relase all pages"
is broken on systems with page_size > 4k.

Fix this by dedicating fw page to the requesting function id via adding a
func_id parameter to alloc_4k() function.

Fixes: c6168161f6 ("net/mlx5: Add support for release all pages event")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:12 -07:00
Jesper Dangaard Brouer
d628ee4fef mlx5: Rx queue setup time determine frame_sz for XDP
The mlx5 driver have multiple memory models, which are also changed
according to whether a XDP bpf_prog is attached.

The 'rx_striding_rq' setting is adjusted via ethtool priv-flags e.g.:
 # ethtool --set-priv-flags mlx5p2 rx_striding_rq off

On the general case with 4K page_size and regular MTU packet, then
the frame_sz is 2048 and 4096 when XDP is enabled, in both modes.

The info on the given frame size is stored differently depending on the
RQ-mode and encoded in a union in struct mlx5e_rq union wqe/mpwqe.
In rx striding mode rq->mpwqe.log_stride_sz is either 11 or 12, which
corresponds to 2048 or 4096 (MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ).
In non-striding mode (MLX5_WQ_TYPE_CYCLIC) the frag_stride is stored
in rq->wqe.info.arr[0].frag_stride, for the first fragment, which is
what the XDP case cares about.

To reduce effect on fast-path, this patch determine the frame_sz at
setup time, to avoid determining the memory model runtime. Variable
is named frame0_sz to make it clear that this is only the frame
size of the first fragment.

This mlx5 driver does a DMA-sync on XDP_TX action, but grow is safe
as it have done a DMA-map on the entire PAGE_SIZE. The driver also
already does a XDP length check against sq->hw_mtu on the possible
XDP xmit paths mlx5e_xmit_xdp_frame() + mlx5e_xmit_xdp_frame_mpwqe().

V3+4: Change variable name first_frame_sz to frame0_sz

V2: Fix that frag_size need to be recalc before creating SKB.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Link: https://lore.kernel.org/bpf/158945348021.97035.12295039384250022883.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer
d201ea9ebc mlx4: Add XDP frame size and adjust max XDP MTU
The mlx4 drivers size of memory backing the RX packet is stored in
frag_stride. For XDP mode this will be PAGE_SIZE (normally 4096).
For normal mode frag_stride is 2048.

Also adjust MLX4_EN_MAX_XDP_MTU to take tailroom into account.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Link: https://lore.kernel.org/bpf/158945341893.97035.2688142527052329942.stgit@firesoul
2020-05-14 21:21:55 -07:00
Maor Gottlieb
9254f8ed15 net/mlx5: Add support in forward to namespace
Currently, fs_core supports rule of forward the traffic
to continue matching in the next priority, now we add support
to forward the traffic matching in the next namespace.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-05-13 18:56:31 +03:00
Maor Gottlieb
14c129e301 {IB/net}/mlx5: Simplify don't trap code
The fs_core already supports creation of rules with multiple
actions/destinations. Refactor fs_core to handle the case
when don't trap rule is created with destination. Adapt the
calling code in the driver.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-05-13 18:56:18 +03:00
Jiri Pirko
67ed68fc0c mlxsw: spectrum_flower: Forbid to insert flower rules in collision with matchall rules
On ingress, the matchall rules doing mirroring and sampling are offloaded
into hardware blocks that are processed before any flower rules.
On egress, the matchall mirroring rules are offloaded into hardware
block that is processed after all flower rules.

Therefore check the priorities of inserted flower rules against
existing matchall rules and ensure the correct ordering.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09 16:02:43 -07:00
Jiri Pirko
18346b70ab mlxsw: spectrum_matchall: Forbid to insert matchall rules in collision with flower rules
On ingress, the matchall rules doing mirroring and sampling are offloaded
into hardware blocks that are processed before any flower rules.
On egress, the matchall mirroring rules are offloaded into hardware
block that is processed after all flower rules.

Therefore check the priorities of inserted matchall rules against
existing flower rules and ensure the correct ordering.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09 16:02:43 -07:00
Jiri Pirko
aed65285fb mlxsw: spectrum_matchall: Expose a function to get min and max rule priority
Introduce an infrastructure that allows to get minimum and maximum
rule priority for specified chain. This is going to be used by
a subsequent patch to enforce ordering between flower and
matchall filters.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09 16:02:43 -07:00
Jiri Pirko
5a2939b9d7 mlxsw: spectrum_matchall: Put matchall list into substruct of flow struct
As there are going to be other matchall specific fields in flow
structure, put the existing list field into matchall substruct.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09 16:02:43 -07:00
Jiri Pirko
593bb84379 mlxsw: spectrum_flower: Expose a function to get min and max rule priority
Introduce an infrastructure that allows to get minimum and maximum
rule priority for specified chain. This is going to be used by
a subsequent patch to enforce ordering between flower and
matchall filters.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09 16:02:43 -07:00
Jiri Pirko
18aa23b31f mlxsw: spectrum_matchall: Restrict sample action to be allowed only on ingress
HW supports packet sampling on ingress only. Check and fail if user
is adding sample on egress.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09 16:02:43 -07:00
Tariq Toukan
28bff09518 net/mlx5e: Enhance ICOSQ WQE info fields
The same WQE opcode might be used in different ICOSQ flows
and WQE types.
To have a better distinguishability, replace it with an enum that
better indicates the WQE type and flow it is used for.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:42 -07:00
Tariq Toukan
6b74f60ef5 net/mlx5: Accel, Remove unnecessary header include
The include of Ethernet driver header in core is not needed
and actually wrong.
Remove it.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:42 -07:00
Tariq Toukan
41a8e4ebb4 net/mlx5e: Use struct assignment for WQE info updates
Struct assignment looks more clean, and implies resetting
the not assigned fields to zero, instead of holding values
from older ring cycles.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:41 -07:00
Tariq Toukan
05dfd57082 net/mlx5e: Take TX WQE info structures out of general EN header
Into the txrx header file.
The mlx5e_sq_wqe_info structure describes WQE info for the ICOSQ,
rename it to better reflect this.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:41 -07:00
Tariq Toukan
f713ce1de8 net/mlx5e: kTLS, Do not fill edge for the DUMP WQEs in TX flow
Every single DUMP WQE resides in a single WQEBB.
As the pi is calculated per each one separately, there is
no real need for a contiguous room for them, allow them to populate
different WQ fragments.
This reduces WQ waste and improves its utilization.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:41 -07:00
Tariq Toukan
ab1e0ce99d net/mlx5e: kTLS, Fill work queue edge separately in TX flow
For the static and progress context params WQEs, do the edge
filling separately.
This improves the WQ utilization, code readability, and reduces
the chance of future bugs.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:40 -07:00
Maxim Mikityanskiy
714c88a38b net/mlx5e: Split TX acceleration offloads into two phases
After previous modifications, the offloads are no longer called one by
one, the pi is calculated and the wqe is cleared on between of TLS and
IPSEC offloads, which doesn't quite fit mlx5e_accel_handle_tx's purpose.

This patch splits mlx5e_accel_handle_tx into two functions that
correspond to two logical phases of running offloads:

1. Before fetching a WQE. Here runs the code that can post WQEs on its
own, before the main WQE is fetched. It's the main part of TLS offload.

2. After fetching a WQE. Here runs the code that updates the WQE's
fields, but can't post other WQEs any more. It's a minor part of TLS
offload that sets the tisn field in the cseg, and eseg-based offloads
(currently IPSEC, and later patches will move GENEVE and checksum
offloads there, too).

It allows to make mlx5e_xmit take care of all actions needed to transmit
a packet in the right order, improve the structure of the code and
reduce unnecessary operations. The structure will be further improved in
the following patches (all eseg-based offloads will be moved to a single
place, and reserving space for the main WQE will happen between phase 1
and phase 2 of offloads to eliminate unneeded data movements).

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:40 -07:00
Maxim Mikityanskiy
5546100038 net/mlx5e: Update UDP fields of the SKB for GSO first
mlx5e_udp_gso_handle_tx_skb updates the length field in the UDP header
in case of GSO. It doesn't interfere with other offloads, so do it first
to simplify further restructuring of the code. This way we'll make all
independent modifications to the SKB before starting to work with WQEs.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:40 -07:00
Maxim Mikityanskiy
2eeb6e3841 net/mlx5e: Make TLS offload independent of wqe and pi
TLS offload may write a 32-bit field (tisn) to the cseg of the WQE. To
do that, it receives pi and wqe pointers. As TLS offload may also send
additional WQEs, it has to update pi and wqe, and in many cases it even
doesn't use pi calculated before and wqe zeroed before and does it
itself. Also, mlx5e_sq_xmit has to copy the whole cseg if it goes to the
mlx5e_fill_sq_frag_edge flow. This all is not efficient.

It's more efficient to do the following:

1. Just return tisn from TLS offload and make the caller fill it in a
more appropriate place.

2. Calculate pi and clear wqe after calling TLS offload.

3. If TLS offload has to send WQEs, calculate pi and clear wqe just
before that. It's already done in all places anyway, so this commit
allows to remove some redundant memsets and calls.

Copying of cseg will be eliminated in one of the following commits, and
all other stuff is done here.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:40 -07:00
Maxim Mikityanskiy
0bdb078c74 net/mlx5e: Pass only eseg to IPSEC offload
IPSEC offload needs to modify the eseg of the WQE that is being filled,
but it receives a pointer to the whole WQE. To make the contract
stricter, pass only the pointer to the eseg of that WQE. This commit is
preparation for the following refactoring of offloads in the TX path and
for the MPWQE support.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:39 -07:00
Maxim Mikityanskiy
3df711db05 net/mlx5e: Return void from mlx5e_sq_xmit and mlx5i_sq_xmit
mlx5e_sq_xmit and mlx5i_sq_xmit always return NETDEV_TX_OK. Drop the
return value to simplify the code.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:39 -07:00
Maxim Mikityanskiy
7f8546f3f0 net/mlx5e: Unify checks of TLS offloads
Both INNOVA and ConnectX TLS offloads perform the same checks in the
beginning. Unify them to reduce repeating code. Do WARN_ON_ONCE on
netdev mismatch and finish with an error in both offloads, not only in
the ConnectX one.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:39 -07:00
Maxim Mikityanskiy
f02bac9ad6 net/mlx5e: Return bool from TLS and IPSEC offloads
TLS and IPSEC offloads currently return struct sk_buff *, but the value
is either NULL or the same skb that was passed as a parameter. Return
bool instead to provide stronger guarantees to the calling code (it
won't need to support handling a different SKB that could be potentially
returned before this change) and to simplify restructuring this code in
the following commits.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:39 -07:00
Saeed Mahameed
76cd622fe2 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
This merge includes updates to bonding driver needed for the rdma stack,
to avoid conflicts with the RDMA branch.

Maor Gottlieb Says:

====================
Bonding: Add support to get xmit slave

The following series adds support to get the LAG master xmit slave by
introducing new .ndo - ndo_get_xmit_slave. Every LAG module can
implement it and it first implemented in the bond driver.
This is follow-up to the RFC discussion [1].

The main motivation for doing this is for drivers that offload part
of the LAG functionality. For example, Mellanox Connect-X hardware
implements RoCE LAG which selects the TX affinity when the resources
are created and port is remapped when it goes down.

The first part of this patchset introduces the new .ndo and add the
support to the bonding module.

The second part adds support to get the RoCE LAG xmit slave by building
skb of the RoCE packet based on the AH attributes and call to the new
.ndo.

The third part change the mlx5 driver driver to set the QP's affinity
port according to the slave which found by the .ndo.
====================

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:30 -07:00
Jacob Keller
c75a33c84b net: remove newlines in NL_SET_ERR_MSG_MOD
The NL_SET_ERR_MSG_MOD macro is used to report a string describing an
error message to userspace via the netlink extended ACK structure. It
should not have a trailing newline.

Add a cocci script which catches cases where the newline marker is
present. Using this script, fix the handful of cases which accidentally
included a trailing new line.

I couldn't figure out a way to get a patch mode working, so this script
only implements context, report, and org.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-07 17:56:14 -07:00
Jason Yan
5a7c45097c net: mlx4: remove unneeded variable "err" in mlx4_en_ethtool_add_mac_rule()
Fix the following coccicheck warning:

drivers/net/ethernet/mellanox/mlx4/en_ethtool.c:1396:5-8: Unneeded
variable: "err". Return "0" on line 1411

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-07 13:04:21 -07:00
David S. Miller
3793faad7b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts were all overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06 22:10:13 -07:00
Pablo Neira Ayuso
16f8036086 net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE
This patch adds FLOW_ACTION_HW_STATS_DONT_CARE which tells the driver
that the frontend does not need counters, this hw stats type request
never fails. The FLOW_ACTION_HW_STATS_DISABLED type explicitly requests
the driver to disable the stats, however, if the driver cannot disable
counters, it bails out.

TCA_ACT_HW_STATS_* maintains the 1:1 mapping with FLOW_ACTION_HW_STATS_*
except by disabled which is mapped to FLOW_ACTION_HW_STATS_DISABLED
(this is 0 in tc). Add tc_act_hw_stats() to perform the mapping between
TCA_ACT_HW_STATS_* and FLOW_ACTION_HW_STATS_*.

Fixes: 319a1d1947 ("flow_offload: check for basic action hw stats type")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06 20:13:10 -07:00
Jason Yan
f9cbf19c7f net: mlx4: remove unneeded variable "err" in mlx4_en_get_rxfh()
Fix the following coccicheck warning:

drivers/net/ethernet/mellanox/mlx4/en_ethtool.c:1238:5-8: Unneeded
variable: "err". Return "0" on line 1252

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06 13:57:30 -07:00
Tariq Toukan
40e473071d net/mlx4_core: Fix use of ENOSPC around mlx4_counter_alloc()
When ENOSPC is set the idx is still valid and gets set to the global
MLX4_SINK_COUNTER_INDEX.  However gcc's static analysis cannot tell that
ENOSPC is impossible from mlx4_cmd_imm() and gives this warning:

drivers/net/ethernet/mellanox/mlx4/main.c:2552:28: warning: 'idx' may be
used uninitialized in this function [-Wmaybe-uninitialized]
 2552 |    priv->def_counter[port] = idx;

Also, when ENOSPC is returned mlx4_allocate_default_counters should not
fail.

Fixes: 6de5f7f6a1 ("net/mlx4_core: Allocate default counter per port")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04 10:42:06 -07:00
Maor Gottlieb
c6bc6041b1 net/mlx5: Add support to get lag physical port
Add function to get the device physical port of the lag slave.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-01 12:15:38 -07:00
Maor Gottlieb
64363e61c7 net/mlx5: Change lag mutex lock to spin lock
The lag lock could be a spin lock, the critical section is short
and there is no need that the thread will sleep.
Change the lock that protects the LAG structure from mutex
to spin lock. It is required for next patch that need to
access this structure from context that we can't sleep.
In addition there is no need to hold this lock when query the
congestion counters.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-01 12:15:38 -07:00
Jiri Pirko
6ef4889fc0 mlxsw: spectrum_acl_tcam: Position vchunk in a vregion list properly
Vregion helpers to get min and max priority depend on the correct
ordering of vchunks in the vregion list. However, the current code
always adds new chunk to the end of the list, no matter what the
priority is. Fix this by finding the correct place in the list and put
vchunk there.

Fixes: 22a677661f ("mlxsw: spectrum: Introduce ACL core with simple TCAM implementation")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 20:41:14 -07:00
David S. Miller
3857c77624 mlx5-updates-2020-04-30
1) Add release all pages support, From Eran.
    to release all FW pages at once on driver unload, when supported by FW.
 
 2) From Maxim and Tariq, Trivial Data path cleanup and code improvements
    in preparation for their next features, TLS offload and TX performance
     improvements
 
 3) Multiple cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl6rBpYACgkQSD+KveBX
 +j5L4Qf+MA17+ENeqPfMLRKHtSn9D50M3+8uDuYd4VK0uqQIQHbBpxHNM4FLa1sI
 WTBx/HnFKq5eZdOvYiZExVxpOcBWk+KVoIq8r4IPHsCU3Y2BqQOc6qqbi8haQ7J8
 fgrdi+gbS02N8MD45uUbNP/8JhZxN+4s0uEaH9cQ68sSorZOF1VtExAttTpQoqso
 Zur9gQH3MfYmQBPbr7mj4OsKiho7cb17UPadASyiLjvD7QDJ+++73PHF5YCGuSy0
 HTSvdBZOesBzaeD7qTwdq3bFILYiuiNmMlLotdvXmuAceXfa7lO8bqvwlc8an+No
 UMOBELwlw6tqVjSohMtqlbEMS9PNXQ==
 =XSFa
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2020-04-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-04-30

1) Add release all pages support, From Eran.
   to release all FW pages at once on driver unload, when supported by FW.

2) From Maxim and Tariq, Trivial Data path cleanup and code improvements
   in preparation for their next features, TLS offload and TX performance
    improvements

3) Multiple cleanups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:15:16 -07:00
Ido Schimmel
ca0892235a mlxsw: spectrum_span: Remove old SPAN API
Remove the old SPAN API now that matchall-based and flower-based
mirroring were converted to use the new API.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
835d6b8c1a mlxsw: spectrum_span: Use new analyzed ports list during speed / MTU change
As previously explained, each port whose outgoing traffic is analyzed
needs to have an egress mirror buffer.

The size of the egress mirror buffer is calculated based on various
parameters, two of which are the speed and the MTU of the port.

Therefore, when the MTU or the speed of a port change, the SPAN code is
called to see if the egress mirror buffer of the port needs to be
adjusted.

Currently, this is done by traversing all the SPAN agents and for each
SPAN agent the list of bound ports is traversed.

Instead of the above, traverse the recently added list of analyzed
ports.

This will later allow us to remove the old SPAN API.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
7240db69c3 mlxsw: spectrum_acl: Convert flower-based mirroring to new SPAN API
In flower-based mirroring, mirroring is done with ACLs and the SPAN
agent is not bound to a port. Instead its identifier is specified in an
ACL action.

Convert this type of mirroring to use the new API.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
c1d7845dfb mlxsw: spectrum: Convert matchall-based mirroring to new SPAN API
In matchall-based mirroring, mirroring is not done with ACLs, but a SPAN
agent is bound to the ingress / egress of a port and all incoming /
outgoing traffic is mirrored.

Convert this type of mirroring to use the new API.

First the SPAN agent is resolved, then the port is marked as analyzed
and its egress mirror buffer is potentially allocated. Lastly, the
binding is performed.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
c056618c53 mlxsw: spectrum_span: Add APIs to bind / unbind a SPAN agent
Currently, a SPAN agent can only be bound to a per-port trigger where
the trigger is either an incoming packet (INGRESS) or an outgoing packet
(EGRESS) to / from the port.

A follow-up patch set will introduce the concept of global triggers and
per-{port, TC} enablement. With global triggers, the trigger entry is
only keyed by a trigger and not by a port and a trigger. The trigger can
be, for example, a packet that was early dropped.

While the binding between the SPAN agent and the trigger is performed
only once, the trigger entry needs to be reference counted, as the
trigger can be enabled on multiple ports.

Add APIs to bind / unbind a SPAN agent to a trigger and reference count
the trigger entry in preparation for global triggers.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
14366da6b5 mlxsw: spectrum_span: Wrap buffer change in a function
The code that adjusts the egress buffer size is not symmetric at the
moment. The update is done via a call to
mlxsw_sp_span_port_buffer_update(), but the disablement is done inline
by invoking the write to SBIB register directly.

Wrap the disablement code in mlxsw_sp_span_port_buffer_disable().

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Suggested-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
eb773c3a2d mlxsw: spectrum_span: Rename function
Next patch will introduce mlxsw_sp_span_port_buffer_disable() function
that disables the egress buffer on an analyzed port. Rename the opposite
function that updates the buffer on an analyzed port accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Suggested-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
ed04458d4a mlxsw: spectrum_span: Add APIs to get / put an analyzed port
An analyzed port is a port whose incoming / outgoing traffic is mirrored
to a SPAN agent and analyzed on a remote server.

A port can be analyzed by multiple tc filters and therefore the
corresponding analyzed port entry needs to be reference counted. This is
significant because ports whose outgoing traffic is analyzed need to
have an egress mirror buffer.

Add APIs to get / put an analyzed port. Allocate an egress mirror buffer
on a port when it is first inspected at egress and free the buffer when
it is no longer inspected at egress.

Protect the list of analyzed ports with a mutex, as a later patch will
traverse it from a context in which RTNL lock is not held.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
466010342e mlxsw: spectrum_span: Add APIs to get / put a SPAN agent
Given a netdev that packets should be mirrored to, create a SPAN agent
and return its identifier to the caller.

The SPAN agent is reference counted, as multiple tc-mirred actions can
point to the same destination netdev.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Maxim Mikityanskiy
ec9cdca066 net/mlx5e: Unify reserving space for WQEs
In our fast-path design, a WQE (Work Queue Element) must not cross the
page boundary. To enforce that, for WQEs consisting of more than one BB
(Basic Block), the driver checks the available contiguous space in the
WQ in advance, and if it's not enough, it pads it with NOPs.

This patch modifies the code that calculates the position of next WQE,
considering the padding, and prepares the WQE. This code is common for
all SQ types. In this patch it's reorganized in a way that makes the
usage pattern unified for all SQ types, and makes the implementations
self-contained and look almost the same, preparing the repeating code to
further attempts to deduplicate it.

One place is left as is: mlx5e_sq_xmit and mlx5e_fill_sq_frag_edge call
inside, because it is special in a way that it may also copy WQE's cseg
and eseg when reserving space. This will be eliminated in one of the
following patches, and this place will be converted to the new approach,
too.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:46 -07:00
Maxim Mikityanskiy
7d42c8e9ab net/mlx5e: Rename ICOSQ WQE info struct and field
Structs mlx5e_txqsq and mlx5e_xdpsq contain wqe_info arrays to store
supplementary information corresponding to WQEs in the queue. Struct
mlx5e_icosq also has such an array, but it's called differently -
ico_wqe. This patch renames it to unify with the other SQs.

In addition, rename the struct to emphasize its specific usage.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:45 -07:00
Maxim Mikityanskiy
fed0c6cfcd net/mlx5e: Fetch WQE: reuse code and enforce typing
There are multiple functions mlx5{e,i}_*_fetch_wqe that contain the same
code, that is repeated, because they operate on different SQ struct
types. mlx5e_sq_fetch_wqe also returns void *, instead of the concrete
WQE type.

This commit generalizes the fetch WQE operation by putting this code
into a single function. To simplify calls of the generic function in
concrete use cases, macros are provided that substitute the right WQE
size and cast the return type.

Before this patch, fetch_wqe used to calculate pi itself, but the value
was often known to the caller. This calculation is moved outside to
eliminate this unnecessary step and prepare for the fill_frag_edge
refactoring in the next patch.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:45 -07:00
Tariq Toukan
e2e11dbf36 net/mlx5e: XDP, Print the offending TX descriptor on error completion
Upon an error completion on an XDP SQ, print the offending WQE
to ease the debug process.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:45 -07:00
Tariq Toukan
f1b95753ee net/mlx5e: TX, Generalise code and usage of error CQE dump
Error CQE was dumped only for TXQ SQs.
Generalise the function, and add usage for error completions
on ICO SQs and XDP SQs.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:44 -07:00
Tariq Toukan
e658664c77 net/mlx5e: Use proper name field for the UMR key
Even though some of the WQE control segment's field share
the same memory bits (a union of fields), prefer having the
right field name for every different usage.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:44 -07:00
Eran Ben Elisha
c6168161f6 net/mlx5: Add support for release all pages event
If FW sets release_all_pages bit in MLX5_EVENT_TYPE_PAGE_REQUEST,
driver shall release all pages of a given function id, with no further
pages reclaim negotiation with FW nor MANAGE_PAGES commands from driver
towards FW.

Upon receiving this bit as part of pages reclaim event, driver will
initiate release all flow, in which it will iterate and release all
function's pages.

As part of driver <-> FW capabilities handshake, FW will report
release_all_pages max HCA cap bit, and driver will set the
release_all_pages bit in HCA cap.

NIC: ConnectX-4 Lx
CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
Test case: Simulataniously FLR 4 VFs, and measure FW release pages by
driver.
Before: 3.18 Sec
After:  0.31 Sec

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:44 -07:00
Eran Ben Elisha
c7636942d2 net/mlx5: Rate limit page not found error messages
Thousands of pages are released with free_addr() function. In case of
buggy sync between FW and driver on released address, the log will be
flooded with error messages. Use mlx5_core_warn_rl() to limit it.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:44 -07:00
Eran Ben Elisha
c655c1f469 net/mlx5: Add helper function to release fw page
Factor out the fwp address release page to an helper function, will be
used in the downstream patch.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:44 -07:00
Tariq Toukan
51dde00b8f net/mlx5: Remove unused field in EQ
The size field in EQ is not in use.
Remove it.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:43 -07:00
Paul Blakey
d2658b4a1d net/mlx5: CT: Remove unused variables
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:43 -07:00
Roi Dayan
70a5698a56 net/mlx5e: CT: Avoid false warning about rule may be used uninitialized
Avoid gcc warning by preset rule to invalid ptr.

Fixes: 4c3844d9e9 ("net/mlx5e: CT: Introduce connection tracking")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:43 -07:00
Zheng Bin
e59b254cbe net/mlx5e: Remove unneeded semicolon
Fixes coccicheck warning:

drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c:690:2-3: Unneeded semicolon

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:42 -07:00
Parav Pandit
9c8e7434e0 net/mlx5e: Use helper API to get devlink port index for all port flavours
Use existing helper API to get unique devlink port index for all
devlink port flavours.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:42 -07:00
Raed Salem
72d3fef161 net/mlx5: IPsec, Fix coverity issue
The cited commit introduced the following coverity issue at functions
mlx5_fpga_is_ipsec_device() and mlx5_fpga_ipsec_release_sa_ctx():
- bit_and_with_zero:
  accel_xfrm->attrs.action & MLX5_ACCEL_ESP_ACTION_DECRYPT is always 0.

As MLX5_ACCEL_ESP_ACTION_DECRYPT is not a bitwise flag and was wrongly
used with bitwise operation, the above expression is always zero value
as MLX5_ACCEL_ESP_ACTION_DECRYPT is zero.

Fix by using "==" comparison operator instead.

Fixes: 7dfee4b1d7 ("net/mlx5: IPsec, Refactor SA handle creation and destruction")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:42 -07:00
Saeed Mahameed
a6b1b93605 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
mlx5 updates for both net-next and rdma-next:

1) HW bits and definitions for TLS and IPsec offlaods
2) Release all pages capability bits
3) New command interface helpers and some code cleanup as a result
4) Move qp.c out of mlx5 core driver into mlx5_ib rdma driver

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 09:49:53 -07:00
Roi Dayan
67b38de646 net/mlx5e: Fix q counters on uplink representors
Need to allocate the q counters before init_rx which needs them
when creating the rq.

Fixes: 8520fa57a4 ("net/mlx5e: Create q counters on uplink representors")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 09:20:33 -07:00
Moshe Shemesh
cece6f432c net/mlx5: Fix command entry leak in Internal Error State
Processing commands by cmd_work_handler() while already in Internal
Error State will result in entry leak, since the handler process force
completion without doorbell. Forced completion doesn't release the entry
and event completion will never arrive, so entry should be released.

Fixes: 73dd3a4839 ("net/mlx5: Avoid using pending command interface slots")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 09:20:33 -07:00
Moshe Shemesh
f3cb3cebe2 net/mlx5: Fix forced completion access non initialized command entry
mlx5_cmd_flush() will trigger forced completions to all valid command
entries. Triggered by an asynch event such as fast teardown it can
happen at any stage of the command, including command initialization.
It will trigger forced completion and that can lead to completion on an
uninitialized command entry.

Setting MLX5_CMD_ENT_STATE_PENDING_COMP only after command entry is
initialized will ensure force completion is treated only if command
entry is initialized.

Fixes: 73dd3a4839 ("net/mlx5: Avoid using pending command interface slots")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 09:20:32 -07:00
Erez Shitrit
8075411d93 net/mlx5: DR, On creation set CQ's arm_db member to right value
In polling mode, set arm_db member to a value that will avoid CQ
event recovery by the HW.
Otherwise we might get event without completion function.
In addition,empty completion function to was added to protect from
unexpected events.

Fixes: 297cccebdc ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 09:20:32 -07:00
Parav Pandit
f8d1eddaf9 net/mlx5: E-switch, Fix mutex init order
In cited patch mutex is initialized after its used.
Below call trace is observed.
Fix the order to initialize the mutex early enough.
Similarly follow mirror sequence during cleanup.

kernel: DEBUG_LOCKS_WARN_ON(lock->magic != lock)
kernel: WARNING: CPU: 5 PID: 45916 at kernel/locking/mutex.c:938
__mutex_lock+0x7d6/0x8a0
kernel: Call Trace:
kernel: ? esw_vport_tbl_get+0x3b/0x250 [mlx5_core]
kernel: ? mark_held_locks+0x55/0x70
kernel: ? __slab_free+0x274/0x400
kernel: ? lockdep_hardirqs_on+0x140/0x1d0
kernel: esw_vport_tbl_get+0x3b/0x250 [mlx5_core]
kernel: ? mlx5_esw_chains_create_fdb_prio+0xa57/0xc20 [mlx5_core]
kernel: mlx5_esw_vport_tbl_get+0x88/0xf0 [mlx5_core]
kernel: mlx5_esw_chains_create+0x2f3/0x3e0 [mlx5_core]
kernel: esw_create_offloads_fdb_tables+0x11d/0x580 [mlx5_core]
kernel: esw_offloads_enable+0x26d/0x540 [mlx5_core]
kernel: mlx5_eswitch_enable_locked+0x155/0x860 [mlx5_core]
kernel: mlx5_devlink_eswitch_mode_set+0x1af/0x320 [mlx5_core]
kernel: devlink_nl_cmd_eswitch_set_doit+0x41/0xb0

Fixes: 96e326878f ("net/mlx5e: Eswitch, Use per vport tables for mirroring")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 09:20:32 -07:00
Parav Pandit
e986453905 net/mlx5: E-switch, Fix printing wrong error value
When mlx5_modify_header_alloc() fails, instead of printing the error
value returned, current error log prints 0.

Fix by printing correct error value returned by
mlx5_modify_header_alloc().

Fixes: 6724e66b90 ("net/mlx5: E-Switch, Get reg_c1 value on miss")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 09:20:32 -07:00
Parav Pandit
799499850a net/mlx5: E-switch, Fix error unwinding flow for steering init failure
Error unwinding is done incorrectly in the cited commit.
When steering init fails, there is no need to perform steering cleanup.
When vport error exists, error cleanup should be mirror of the setup
routine, i.e. to perform steering cleanup before metadata cleanup.

This avoids the call trace in accessing uninitialized objects which are
skipped during steering_init() due to failure in steering_init().

Call trace:
mlx5_cmd_modify_header_alloc:805:(pid 21128): too many modify header
actions 1, max supported 0
E-Switch: Failed to create restore mod header

BUG: kernel NULL pointer dereference, address: 00000000000000d0
[  677.263079]  mlx5_destroy_flow_group+0x13/0x80 [mlx5_core]
[  677.268921]  esw_offloads_steering_cleanup+0x51/0xf0 [mlx5_core]
[  677.275281]  esw_offloads_enable+0x1a5/0x800 [mlx5_core]
[  677.280949]  mlx5_eswitch_enable_locked+0x155/0x860 [mlx5_core]
[  677.287227]  mlx5_devlink_eswitch_mode_set+0x1af/0x320
[  677.293741]  devlink_nl_cmd_eswitch_set_doit+0x41/0xb0
[  677.299217]  genl_rcv_msg+0x1eb/0x430

Fixes: 7983a675ba ("net/mlx5: E-Switch, Enable chains only if regs loopback is enabled")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 09:20:31 -07:00
Raed Salem
244faedfd4 net/mlx5: Refactor imm_inval_pkey field in cqe struct
The imm_inval_pkey field can hold four different types of data,
depends on the usage, the data could be one of the below:
- Immediate field of the received message
- Invalidate rkey
- Pkey of the packet
- Flow table metadata

Current implementation doesn't reflect the intended usage of the
field at usage time.

Reflect the different types by replace this field with a union,
modify code where this field is used to reflect its intended
usage.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-28 12:45:15 -07:00
Erez Shitrit
dff8e2d152 net/mlx5: Use aligned variable while allocating ICM memory
The alignment value is part of the input structure, so use it and spare
extra memory allocation when is not needed.
Now, using the new ability when allocating icm for Direct-Rule
insertion.
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-28 12:45:10 -07:00
Huy Nguyen
d65dbedfd2 net/mlx5: Add support for COPY steering action
Add COPY type to modify_header action. IPsec feature is the first
feature that needs COPY steering action.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-28 12:44:44 -07:00
Jiri Pirko
19f06771ca mlxsw: spectrum: Move flow offload binding into spectrum_flow.c
Move the code taking case of setup of flow offload into spectrum_flow.c
Do small renaming of callbacks on the way.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27 12:43:29 -07:00
Jiri Pirko
3c650136af mlxsw: spectrum_matchall: Process matchall events from the same cb as flower
Currently there are two callbacks registered: one for matchall,
one for flower. This causes the user to see "in_hw_count 2" in TC filter
dump. Because of this and also as a preparation for future matchall
offload for rules equivalent to flower-all-match, move the processing of
shared block into matchall.c. Leave only one cb for mlxsw driver
per-block.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27 12:43:29 -07:00
Jiri Pirko
481ff57aad mlxsw: spectrum: Avoid copying sample values and use RCU pointer direcly instead
Currently, only the psample_group is accessed using RCU on RX path.
However, it is possible (unlikely) that other sample values get change
during RX processing. Fix this by having the port->sample struct
accessed as RCU pointer, containing all sample values including
psample_group pointer. That avoids extra alloc per-port, copying the
values and the race condition described above.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27 12:43:29 -07:00
Jiri Pirko
dd0fbc89d2 mlxsw: spectrum_matchall: Push per-port rule add/del into separate functions
As the replace/destroy is going to be used later on per-block, push
the per-port rule addition/deletion into separate functions.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27 12:43:29 -07:00
Jiri Pirko
47fa15eae4 mlxsw: spectrum_matchall: Move ingress indication into mall_entry
Instead of having it in mirror_entry structure, move it to mall_entry
and set it during rule insertion.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27 12:43:29 -07:00
Jiri Pirko
c7ea0e162f mlxsw: spectrum_matchall: Pass mall_entry as arg to mlxsw_sp_mall_port_sample_add()
In the preparation for future changes, have the
mlxsw_sp_mall_port_sample_add() function to accept mall_entry including
all needed info originally obtained from cls and act pointers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27 12:43:29 -07:00
Jiri Pirko
780ba878a1 mlxsw: spectrum_matchall: Pass mall_entry as arg to mlxsw_sp_mall_port_mirror_add()
In the preparation for future changes, have the
mlxsw_sp_mall_port_mirror_add() function to accept mall_entry including
the "to_dev" originally obtained from act pointer.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27 12:43:29 -07:00
Jiri Pirko
6c8cd435b5 mlxsw: spectrum_acl: Use block variable in mlxsw_sp_acl_rule_del()
On couple of places in mlxsw_sp_acl_rule_del(), block variable is not
used directly as it could be. So do it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27 12:43:29 -07:00
Jiri Pirko
d7fcc98622 mlxsw: spectrum: Push matchall bits into a separate file
Similar to flower, have matchall related code in a separate file.
Do some small renaming on the way (consistent "mall" prefixes,
dropped "_tc_", dropped "_port_" where suitable).

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27 12:43:29 -07:00
Jiri Pirko
d52238eb7b mlxsw: spectrum: Push flow_block related functions into a separate file
The code around flow_block is currently mixed in spectrum_acl.c.
However, as it really does not directly relate to ACL part only,
push the bits into a separate file.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27 12:43:29 -07:00
Jiri Pirko
3bc3ffb6e9 mlxsw: spectrum: Rename acl_block to flow_block
The acl_block structure is going to be used for non-acl case - matchall
offload. So rename it accordingly.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27 12:43:29 -07:00
Jiri Pirko
49c958ccd2 mlxsw: spectrum_acl: Move block helpers into inline header functions
The struct is defined in the header, no need to have the helpers
in the c file. Move the helpers to the header.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-27 12:43:29 -07:00
Zou Wei
c90af587a9 net/mlx4_core: Add missing iounmap() in error path
This fixes the following coccicheck warning:

drivers/net/ethernet/mellanox/mlx4/crdump.c:200:2-8: ERROR: missing iounmap;
ioremap on line 190 and execution via conditional on line 198

Fixes: 7ef19d3b1d ("devlink: report error once U32_MAX snapshot ids have been used")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-25 20:43:56 -07:00
David S. Miller
d483389678 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Simple overlapping changes to linux/vermagic.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-25 20:18:53 -07:00
Zheng Bin
10395e99f4 net/mlxfw: Remove unneeded semicolon
Fixes coccicheck warning:

drivers/net/ethernet/mellanox/mlxfw/mlxfw_fsm.c:79:2-3: Unneeded semicolon
drivers/net/ethernet/mellanox/mlxfw/mlxfw_fsm.c:162:2-3: Unneeded semicolon

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 16:56:22 -07:00
Ido Schimmel
4780dbdbd9 mlxsw: spectrum_span: Replace zero-length array with flexible-array member
In a similar fashion to commit e99f8e7f88 ("mlxsw: Replace zero-length
array with flexible-array member"), use a flexible-array member to get a
compiler warning in case the flexible array does not occur last in the
structure.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 15:41:51 -07:00
Ido Schimmel
4c00dafc59 mlxsw: spectrum_span: Use 'refcount_t' for reference counting
'refcount_t' is very useful for catching over/under flows. Convert the
SPAN agent objects to use it instead of 'int' for their reference count.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 15:41:51 -07:00
Ido Schimmel
c0c2899cf6 mlxsw: spectrum_span: Remove unnecessary debug prints
To the best of my knowledge, these debug prints were never used. Remove
them.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 15:41:51 -07:00
Amit Cohen
7f9b099bd9 mlxsw: spectrum_span: Rename parms() to parms_set()
Use a more meaningful name for parms() function.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 15:41:51 -07:00
Amit Cohen
8146458fcd mlxsw: spectrum_span: Reduce nesting in mlxsw_sp_span_entry_configure()
Use early return to avoid unnecessary nesting.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24 15:41:51 -07:00
Eric Dumazet
cf4058dbaa net/mlx4_en: use napi_complete_done() in TX completion
In order to benefit from the new napi_defer_hard_irqs feature,
we need to use napi_complete_done() variant in this driver.

RX path is already using it, this patch implements TX completion side.

mlx4_en_process_tx_cq() now returns the amount of retired packets,
instead of a boolean, so that mlx4_en_poll_tx_cq() can pass
this value to napi_complete_done().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-23 12:43:20 -07:00
Dan Carpenter
c391eb8366 mlxsw: Fix some IS_ERR() vs NULL bugs
The mlxsw_sp_acl_rulei_create() function is supposed to return an error
pointer from mlxsw_afa_block_create().  The problem is that these
functions both return NULL instead of error pointers.  Half the callers
expect NULL and half expect error pointers so it could lead to a NULL
dereference on failure.

This patch changes both of them to return error pointers and changes all
the callers which checked for NULL to check for IS_ERR() instead.

Fixes: 4cda7d8d70 ("mlxsw: core: Introduce flexible actions support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-23 12:34:43 -07:00
Leon Romanovsky
e0b4b4722d net/mlx5: Update transobj.c new cmd interface
Do mass update of transobj.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:16 +03:00
Leon Romanovsky
7ba294e435 net/mlx5: Update SW steering new cmd interface
Do mass update of SW steering to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:16 +03:00
Leon Romanovsky
2276a0dfc1 net/mlx5: Update port.c new cmd interface
Do mass update of port.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:15 +03:00
Leon Romanovsky
fa8110f445 net/mlx5: Update rl.c new cmd interface
Do mass update of rl.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:14 +03:00
Leon Romanovsky
1fb5193434 net/mlx5: Update uar.c new cmd interface
Do mass update of uar.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:14 +03:00
Leon Romanovsky
9b3ca3ec03 net/mlx5: Update pd.c new cmd interface
Do mass update of pd.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:13 +03:00
Leon Romanovsky
86d41641dd net/mlx5: Update pagealloc.c new cmd interface
Do mass update of pagealloc.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:13 +03:00
Leon Romanovsky
adda874c95 net/mlx5: Update mr.c new cmd interface
Do mass update of mr.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:12 +03:00
Leon Romanovsky
62a9fec040 net/mlx5: Update mcg.c new cmd interface
Do mass update of mcg.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:11 +03:00
Leon Romanovsky
3ac0e69e69 net/mlx5: Update main.c new cmd interface
Do mass update of main.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:10 +03:00
Leon Romanovsky
253e790e20 net/mlx5: Update vxlan.c new cmd interface
Do mass update of vxlan.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:10 +03:00
Leon Romanovsky
9d6ed27163 net/mlx5: Update mpfs.c new cmd interface
Do mass update of mpfs.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:09 +03:00
Leon Romanovsky
bb7664d369 net/mlx5: Update gid.c new cmd interface
Do mass update of gid.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:09 +03:00
Leon Romanovsky
5d19395f69 net/mlx5: Update lag.c new cmd interface
Do mass update of lag.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:08 +03:00
Leon Romanovsky
59ad21c21f net/mlx5: Update fw.c new cmd interface
Do mass update of fw.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:07 +03:00
Leon Romanovsky
31a0956ea9 net/mlx5: Update fs_core new cmd interface
Do mass update of fs_core to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:07 +03:00
Leon Romanovsky
b316e1866f net/mlx5: Update FPGA to new cmd interface
Do mass update of FPGA to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:06 +03:00
Leon Romanovsky
e08a6832f9 net/mlx5: Update eswitch to new cmd interface
Do mass update of eswitch to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:05 +03:00
Leon Romanovsky
a184cda1bb net/mlx5: Update statistics to new cmd interface
Do mass update of statistics to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:05 +03:00
Leon Romanovsky
49d7fcd127 net/mlx5: Update eq.c to new cmd interface
Do mass update of eq.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:04 +03:00
Leon Romanovsky
9aa536ad45 net/mlx5: Update ecpf.c to new cmd interface
Do mass update of ecpf.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:04 +03:00
Leon Romanovsky
e36fb468d2 net/mlx5: Update debugfs.c to new cmd interface
Do mass update of debugfs.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:03 +03:00
Leon Romanovsky
d1f620500c net/mlx5: Update cq.c to new cmd interface
Do mass update of cq.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:03 +03:00
Leon Romanovsky
5d1c9a114a net/mlx5: Update vport.c to new cmd interface
Do mass update of vport.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-23 21:42:02 +03:00
Zhu Yanjun
dcdf4ce0ff net/mlx5e: Get the latest values from counters in switchdev mode
In the switchdev mode, when running "cat
/sys/class/net/NIC/statistics/tx_packets", the ppcnt register is
accessed to get the latest values. But currently this command can
not get the correct values from ppcnt.

From firmware manual, before getting the 802_3 counters, the 802_3
data layout should be set to the ppcnt register.

When the command "cat /sys/class/net/NIC/statistics/tx_packets" is
run, before updating 802_3 data layout with ppcnt register, the
monitor counters are tested. The test result will decide the
802_3 data layout is updated or not.

Actually the monitor counters do not support to monitor rx/tx
stats of 802_3 in switchdev mode. So the rx/tx counters change
will not trigger monitor counters. So the 802_3 data layout will
not be updated in ppcnt register. Finally this command can not get
the latest values from ppcnt register with 802_3 data layout.

Fixes: 5c7e8bbb02 ("net/mlx5e: Use monitor counters for update stats")
Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:30:22 -07:00
Saeed Mahameed
96c34151d1 net/mlx5: Kconfig: convert imply usage to weak dependency
MLX5_CORE uses the 'imply' keyword to depend on VXLAN, PTP_1588_CLOCK,
MLXFW and PCI_HYPERV_INTERFACE.

This was useful to force vxlan, ptp, etc.. to be reachable to mlx5
regardless of their config states.

Due to the changes in the cited commit below, the semantics of 'imply'
was changed to not force any restriction on the implied config.

As a result of this change, the compilation of MLX5_CORE=y and VXLAN=m
would result in undefined references, as VXLAN now would stay as 'm'.

To fix this we change MLX5_CORE to have a weak dependency on
these modules/configs and make sure they are reachable, by adding:
depend on symbol || !symbol.

For example: VXLAN=m MLX5_CORE=y, this will force MLX5_CORE to m

Fixes: def2fbffe6 ("kconfig: allow symbols implied by y to become m")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
2020-04-20 14:30:22 -07:00
Maxim Mikityanskiy
e7e0004abd net/mlx5e: Don't trigger IRQ multiple times on XSK wakeup to avoid WQ overruns
XSK wakeup function triggers NAPI by posting a NOP WQE to a special XSK
ICOSQ. When the application floods the driver with wakeup requests by
calling sendto() in a certain pattern that ends up in mlx5e_trigger_irq,
the XSK ICOSQ may overflow.

Multiple NOPs are not required and won't accelerate the process, so
avoid posting a second NOP if there is one already on the way. This way
we also avoid increasing the queue size (which might not help anyway).

Fixes: db05815b36 ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:30:22 -07:00
Paul Blakey
70840b66da net/mlx5: CT: Change idr to xarray to protect parallel tuple id allocation
After allowing parallel tuple insertion, we get the following trace:

[ 5505.142249] ------------[ cut here ]------------
[ 5505.148155] WARNING: CPU: 21 PID: 13313 at lib/radix-tree.c:581 delete_node+0x16c/0x180
[ 5505.295553] CPU: 21 PID: 13313 Comm: kworker/u50:22 Tainted: G           OE     5.6.0+ #78
[ 5505.304824] Hardware name: Supermicro Super Server/X10DRT-P, BIOS 2.0b 03/30/2017
[ 5505.313740] Workqueue: nf_flow_table_offload flow_offload_work_handler [nf_flow_table]
[ 5505.323257] RIP: 0010:delete_node+0x16c/0x180
[ 5505.349862] RSP: 0018:ffffb19184eb7b30 EFLAGS: 00010282
[ 5505.356785] RAX: 0000000000000000 RBX: ffff904ac95b86d8 RCX: ffff904b6f938838
[ 5505.365190] RDX: 0000000000000000 RSI: ffff904ac954b908 RDI: ffff904ac954b920
[ 5505.373628] RBP: ffff904b4ac13060 R08: 0000000000000001 R09: 0000000000000000
[ 5505.382155] R10: 0000000000000000 R11: 0000000000000040 R12: 0000000000000000
[ 5505.390527] R13: ffffb19184eb7bfc R14: ffff904b6bef5800 R15: ffff90482c1203c0
[ 5505.399246] FS:  0000000000000000(0000) GS:ffff904c2fc80000(0000) knlGS:0000000000000000
[ 5505.408621] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5505.415739] CR2: 00007f5d27006010 CR3: 0000000058c10006 CR4: 00000000001626e0
[ 5505.424547] Call Trace:
[ 5505.428429]  idr_alloc_u32+0x7b/0xc0
[ 5505.433803]  mlx5_tc_ct_entry_add_rule+0xbf/0x950 [mlx5_core]
[ 5505.441354]  ? mlx5_fc_create+0x23c/0x370 [mlx5_core]
[ 5505.448225]  mlx5_tc_ct_block_flow_offload+0x874/0x10b0 [mlx5_core]
[ 5505.456278]  ? mlx5_tc_ct_block_flow_offload+0x63d/0x10b0 [mlx5_core]
[ 5505.464532]  nf_flow_offload_tuple.isra.21+0xc5/0x140 [nf_flow_table]
[ 5505.472286]  ? __kmalloc+0x217/0x2f0
[ 5505.477093]  ? flow_rule_alloc+0x1c/0x30
[ 5505.482117]  flow_offload_work_handler+0x1d0/0x290 [nf_flow_table]
[ 5505.489674]  ? process_one_work+0x17c/0x580
[ 5505.494922]  process_one_work+0x202/0x580
[ 5505.500082]  ? process_one_work+0x17c/0x580
[ 5505.505696]  worker_thread+0x4c/0x3f0
[ 5505.510458]  kthread+0x103/0x140
[ 5505.514989]  ? process_one_work+0x580/0x580
[ 5505.520616]  ? kthread_bind+0x10/0x10
[ 5505.525837]  ret_from_fork+0x3a/0x50
[ 5505.570841] ---[ end trace 07995de9c56d6831 ]---

This happens from parallel deletes/adds to idr, as idr isn't protected.
Fix that by using xarray as the tuple_ids allocator instead of idr.

Fixes: 7da182a998 ("netfilter: flowtable: Use work entry per offload command")
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:30:21 -07:00
Niklas Schnelle
a019b36123 net/mlx5: Fix failing fw tracer allocation on s390
On s390 FORCE_MAX_ZONEORDER is 9 instead of 11, thus a larger kzalloc()
allocation as done for the firmware tracer will always fail.

Looking at mlx5_fw_tracer_save_trace(), it is actually the driver itself
that copies the debug data into the trace array and there is no need for
the allocation to be contiguous in physical memory. We can therefor use
kvzalloc() instead of kzalloc() and get rid of the large contiguous
allcoation.

Fixes: f53aaa31cc ("net/mlx5: FW tracer, implement tracer logic")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:30:21 -07:00
Hu Haowen
6533380dfd net/mlx5: improve some comments
Replaced "its" with "it's".

Signed-off-by: Hu Haowen <xianfengting221@163.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:20:21 -07:00
Parav Pandit
c89da067a2 net/mlx5: Read embedded cpu bit only once
Embedded CPU bit doesn't change with PCI resume/suspend.
Hence read it only once while probing the PCI device.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:20:21 -07:00
Maxim Mikityanskiy
fa3748775b net/mlx5e: Handle errors from netif_set_real_num_{tx,rx}_queues
netif_set_real_num_tx_queues and netif_set_real_num_rx_queues may fail.
Now that mlx5e supports handling errors in the preactivate hook, this
commit leverages that functionality to handle errors from those
functions and roll back all changes on failure.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:20:20 -07:00
Roi Dayan
d7a42ad062 net/mlx5e: Allow partial data mask for tunnel options
We use mapping to save and restore the tunnel options.
Save also the tunnel options mask.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:20:20 -07:00
Tariq Toukan
82fe299641 net/mlx5e: Set of completion request bit should not clear other adjacent bits
In notify HW (ring doorbell) flow, we set the bit to request a completion
on the TX descriptor.
When doing so, we should not unset other bits in the same byte.
Currently, this does not fix a real issue, as we still don't have a flow
where both MLX5_WQE_CTRL_CQ_UPDATE and any adjacent bit are set together.

Fixes: 542578c679 ("net/mlx5e: Move helper functions to a new txrx datapath header")
Fixes: 864b2d7153 ("net/mlx5e: Generalize tx helper functions for different SQ types")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:20:20 -07:00
Raed Salem
7dfee4b1d7 net/mlx5: IPsec, Refactor SA handle creation and destruction
Currently the SA handle is created and managed as part of the common
code for different IPsec supporting HW, this handle is passed to HW
to be used on Rx to identify the SA handle that was used to
return the xfrm state to stack.

The above implementation pose a limitation on managing this handle.

Refactor by moving management of this field to the specific HW code.

Downstream patches will introduce the Connect-X support for IPsec that
will use this handle differently than current implementation.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:20:20 -07:00
Raed Salem
0aab3e1b04 net/mlx5e: IPSec, Expose IPsec HW stat only for supporting HW
The current HW counters are supported only by Innova, split the ipsec
stats group into two groups, one for HW and one for SW. And expose
the HW counters to ethtool only if Innova HW is used for IPsec offload.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:20:19 -07:00
Raed Salem
1dbd51d0a7 net/mlx5: Refactor mlx5_accel_esp_create_hw_context parameter list
Currently the FPGA IPsec is the only hw implementation of the IPsec
acceleration api, and so the mlx5_accel_esp_create_hw_context was
wrongly made to suit this HW api, among other in its parameter list
and some of its parameter endianness.

This implementation might not be suitable for different HW.

Refactor by group and pass all function arguments of
mlx5_accel_esp_create_hw_context in common mlx5_accel_esp_xfrm_attrs
struct field of mlx5_accel_esp_xfrm struct and correct the endianness
according to the HW being called.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:20:19 -07:00
Raed Salem
9425c595bd net/mlx5e: en_accel, Add missing net/geneve.h include
The cited commit relies on include <net/geneve.h> being included
implicitly prior to include "en_accel/en_accel.h".
This mandates that all files that needs to include en_accel.h
to redantantly include net/geneve.h.

Include net/geneve.h explicitly at "en_accel/en_accel.h" to avoid
undesired constrain as above.

Fixes: e3cfc7e6b7 ("net/mlx5e: TX, Add geneve tunnel stateless offload support")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:20:19 -07:00
Raed Salem
8c8eea07c1 net/mlx5: Use the correct IPsec capability function for FPGA ops
Currently the IPsec acceleration capability function is also used
at IPsec fpga capable device code.

This could cause a future bug as the acceleration layer is agnostic
to the device implementing its API.

Fix by using the IPsec FPGA capability function instead of acceleration
layer capability function in case of FPGA IPsec only related operations.

Downstream patches will add support for Connect-X IPsec, this can avoid
a future bug.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20 14:20:19 -07:00
Ido Schimmel
b7f03b0b2a mlxsw: reg: Increase register field length to 13 bits
The Infrastructure Entry Delete Register (IEDR) is used to delete
entries stored in the KVD linear database. Currently, it is only
possible to delete entries of size up to 2048. Future firmware versions
will support deletion of entries of size up to 4096.

Increase the size of the field so that the driver will be able to
perform such deletions in the future, when required.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-20 11:57:33 -07:00
Ido Schimmel
cec2500d44 mlxsw: spectrum_router: Re-increase scale of IPv6 nexthop groups
As explained in commit fc25996e6f ("mlxsw: spectrum_router: Increase
scale of IPv6 nexthop groups"), each nexthop group is hashed by XOR-ing
the interface indexes of all the member nexthop devices.

To avoid many different nexthop groups ending up using the same key, the
above commit started hashing the interface indexes themselves before
they are XOR-ed.

However, in cases in which there are many nexthop groups that all use
the same nexthop device and only differ in the gateway IP, we can still
end up in a situation in which all the groups are using the same key.
This eventually leads to -EBUSY error from rhashtable during insertion.

Improve the situation by also making the gateway IP part of the key.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alex Veber <alexve@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-20 11:57:33 -07:00
Mark Zhang
59e9e8e4fe net/mlx5: Enable SW-defined RoCEv2 UDP source port
When this is enabled, UDP source port for RoCEv2 packets are defined
by software instead of firmware.

Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:57:53 +03:00
Leon Romanovsky
a2a322f447 net/mlx5: Refactor HCA capability set flow
Reduce the amount of kzalloc/kfree cycles by allocating
command structure in the parent function and leverage the
knowledge that set_caps() is called for HCA capabilities
only with specific HW structure as parameter to calculate
mailbox size.

Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:57:52 +03:00
Leon Romanovsky
333fbaa025 net/mlx5: Move QP logic to mlx5_ib
The mlx5_core doesn't need any functionality coded in qp.c, so move
that file to drivers/infiniband/ be under mlx5_ib responsibility.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:21 +03:00
Leon Romanovsky
9c275ee4ad net/mlx5: Delete not-used cmd header
The structures defined in the cmd header are not used and can be safely
removed from the driver. This patch removes that file and deletes all
relevant includes.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:20 +03:00
Leon Romanovsky
66247fbb28 net/mlx5: Remove Q counter low level helper APIs
mlx5 core users are encouraged to use low level API (mlx5_cmd_exec)
without the need of helper functions, do this for q counters, remove
helper functions and call mlx5_cmd_exec directly from users.

This will help reduce the total amount of code and reduction of the
mlx5_core symbol table.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:20 +03:00
Leon Romanovsky
57a6c5e992 net/mlx5: Replace hand written QP context struct with automatic getters
By changing debugfs to not use any QP related API, convert the debugfs
code to use MLX5_GET() directly to access QP context instead of hand
written struct.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:19 +03:00
Leon Romanovsky
f93f4f4f31 net/mlx5: Remove extra indirection while storing QPN
The FPGA, SW steering and IPoIB need to have only QPN from the
mlx5_core_qp struct, so reduce memory footprint by storing QPN
directly.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:19 +03:00
Leon Romanovsky
a452e0e436 net/mlx5: Open-code modify QP in the IPoIB module
Remove dependency on qp.c from the IPoIB by open coding
modify QP interface.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:19 +03:00
Leon Romanovsky
a6532fd925 net/mlx5: Open-code modify QP in the FPGA module
Remove dependency on qp.c from the FPGA by open coding
modify QP interface.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:18 +03:00
Leon Romanovsky
acab4b88e9 net/mlx5: Open-code modify QP in steering module
Remove dependency on qp.c from SW steering by open
coding modify QP interface.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:18 +03:00
Leon Romanovsky
73a75b96fc net/mlx5: Remove empty QP and CQ events handlers
The QP and CQ events functions do nothing except printing some debug
messages. There is nothing to do with this knowledge and such events,
so remove them.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:18 +03:00
Leon Romanovsky
ec44e72b73 net/mlx5: Open-code create and destroy QP calls
FPGA, IPoIB and SW steering don't need anything from the
mlx5_core_create_qp() and mlx5_core_destroy_qp() except calls
to mlx5_cmd_exec().

Let's open-code it, so we will be able to move qp.c to mlx5_ib.

Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-19 15:53:18 +03:00
Eric Dumazet
310660a14b net/mlx4_en: avoid indirect call in TX completion
Commit 9ecc2d8617 ("net/mlx4_en: add xdp forwarding and data write support")
brought another indirect call in fast path.

Use INDIRECT_CALL_2() helper to avoid the cost of the indirect call
when/if CONFIG_RETPOLINE=y

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-18 15:30:22 -07:00
Paul Blakey
9808dd0a2a net/mlx5e: CT: Use rhashtable's ct entries instead of a separate list
Fixes CT entries list corruption.

After allowing parallel insertion/removals in upper nf flow table
layer, unprotected ct entries list can be corrupted by parallel add/del
on the same flow table.

CT entries list is only used while freeing a ct zone flow table to
go over all the ct entries offloaded on that zone/table, and flush
the table.

As rhashtable already provides an api to go over all the inserted entries,
fix the race by using the rhashtable iteration instead, and remove the list.

Fixes: 7da182a998 ("netfilter: flowtable: Use work entry per offload command")
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-08 15:46:54 -07:00
Parav Pandit
230a1bc247 net/mlx5e: Fix devlink port netdev unregistration sequence
In cited commit netdevice is registered after devlink port.

Unregistration flow should be mirror sequence of registration flow.
Hence, unregister netdevice before devlink port.

Fixes: 31e87b39ba ("net/mlx5e: Fix devlink port register sequence")
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-08 15:46:51 -07:00
Parav Pandit
7482d9cb5b net/mlx5e: Fix pfnum in devlink port attribute
Cited patch missed to extract PCI pf number accurately for PF and VF
port flavour. It considered PCI device + function number.
Due to this, device having non zero device number shown large pfnum.

Hence, use only PCI function number; to avoid similar errors, derive
pfnum one time for all port flavours.

Fixes: f60f315d33 ("net/mlx5e: Register devlink ports for physical link, PCI PF, VFs")
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-08 15:46:49 -07:00
Roi Dayan
d5a3c2b640 net/mlx5e: Fix missing pedit action after ct clear action
With ct clear action we should not allocate the action in hw
and not release the mod_acts parsed in advance.
It will be done when handling the ct clear action.

Fixes: 1ef3018f5a ("net/mlx5e: CT: Support clear action")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-08 15:46:46 -07:00
Dmytro Linkin
70f478ca08 net/mlx5e: Fix nest_level for vlan pop action
Current value of nest_level, assigned from net_device lower_level value,
does not reflect the actual number of vlan headers, needed to pop.
For ex., if we have untagged ingress traffic sended over vlan devices,
instead of one pop action, driver will perform two pop actions.
To fix that, calculate nest_level as difference between vlan device and
parent device lower_levels.

Fixes: f3b0a18bb6 ("net: remove unnecessary variables and callback")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-08 15:46:44 -07:00
Eran Ben Elisha
d19987ccf5 net/mlx5e: Add missing release firmware call
Once driver finishes flashing the firmware image, it should release it.

Fixes: 9c8bca2637 ("mlx5: Move firmware flash implementation to devlink")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-08 15:46:41 -07:00
Eli Cohen
84be2fdae4 net/mlx5: Fix condition for termination table cleanup
When we destroy rules from slow path we need to avoid destroying
termination tables since termination tables are never created in slow
path. By doing so we avoid destroying the termination table created for the
slow path.

Fixes: d8a2034f15 ("net/mlx5: Don't use termination tables in slow path")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-08 15:46:38 -07:00
Moshe Shemesh
8c702a53bb net/mlx5: Fix frequent ioread PCI access during recovery
High frequency of PCI ioread calls during recovery flow may cause the
following trace on powerpc:

[ 248.670288] EEH: 2100000 reads ignored for recovering device at
location=Slot1 driver=mlx5_core pci addr=0000:01:00.1
[ 248.670331] EEH: Might be infinite loop in mlx5_core driver
[ 248.670361] CPU: 2 PID: 35247 Comm: kworker/u192:11 Kdump: loaded
Tainted: G OE ------------ 4.14.0-115.14.1.el7a.ppc64le #1
[ 248.670425] Workqueue: mlx5_health0000:01:00.1 health_recover_work
[mlx5_core]
[ 248.670471] Call Trace:
[ 248.670492] [c00020391c11b960] [c000000000c217ac] dump_stack+0xb0/0xf4
(unreliable)
[ 248.670548] [c00020391c11b9a0] [c000000000045818]
eeh_check_failure+0x5c8/0x630
[ 248.670631] [c00020391c11ba50] [c00000000068fce4]
ioread32be+0x114/0x1c0
[ 248.670692] [c00020391c11bac0] [c00800000dd8b400]
mlx5_error_sw_reset+0x160/0x510 [mlx5_core]
[ 248.670752] [c00020391c11bb60] [c00800000dd75824]
mlx5_disable_device+0x34/0x1d0 [mlx5_core]
[ 248.670822] [c00020391c11bbe0] [c00800000dd8affc]
health_recover_work+0x11c/0x3c0 [mlx5_core]
[ 248.670891] [c00020391c11bc80] [c000000000164fcc]
process_one_work+0x1bc/0x5f0
[ 248.670955] [c00020391c11bd20] [c000000000167f8c]
worker_thread+0xac/0x6b0
[ 248.671015] [c00020391c11bdc0] [c000000000171618] kthread+0x168/0x1b0
[ 248.671067] [c00020391c11be30] [c00000000000b65c]
ret_from_kernel_thread+0x5c/0x80

Reduce the PCI ioread frequency during recovery by using msleep()
instead of cond_resched()

Fixes: 3e5b72ac2f ("net/mlx5: Issue SW reset on FW assert")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-08 15:46:36 -07:00
Linus Torvalds
479a72c0c6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Slave bond and team devices should not be assigned ipv6 link local
    addresses, from Jarod Wilson.

 2) Fix clock sink config on some at803x PHY devices, from Oleksij
    Rempel.

 3) Uninitialized stack space transmitted in slcan frames, fix from
    Richard Palethorpe.

 4) Guard HW VLAN ops properly in stmmac driver, from Jose Abreu.

 5) "=" --> "|=" fix in aquantia driver, from Colin Ian King.

 6) Fix TCP fallback in mptcp, from Florian Westphal. (accessing a plain
    tcp_sk as if it were an mptcp socket).

 7) Fix cavium driver in some configurations wrt. PTP, from Yue Haibing.

 8) Make ipv6 and ipv4 consistent in the lower bound allowed for
    neighbour entry retrans_time, from Hangbin Liu.

 9) Don't use private workqueue in pegasus usb driver, from Petko
    Manolov.

10) Fix integer overflow in mlxsw, from Colin Ian King.

11) Missing refcnt init in cls_tcindex, from Cong Wang.

12) One too many loop iterations when processing cmpri entries in ipv6
    rpl code, from Alexander Aring.

13) Disable SG and TSO by default in r8169, from Heiner Kallweit.

14) NULL deref in macsec, from Davide Caratti.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (42 commits)
  macsec: fix NULL dereference in macsec_upd_offload()
  skbuff.h: Improve the checksum related comments
  net: dsa: bcm_sf2: Ensure correct sub-node is parsed
  qed: remove redundant assignment to variable 'rc'
  wimax: remove some redundant assignments to variable result
  mlxsw: spectrum_flower: Do not stop at FLOW_ACTION_VLAN_MANGLE
  mlxsw: spectrum_flower: Do not stop at FLOW_ACTION_PRIORITY
  r8169: change back SG and TSO to be disabled by default
  net: dsa: bcm_sf2: Do not register slave MDIO bus with OF
  ipv6: rpl: fix loop iteration
  tun: Don't put_page() for all negative return values from XDP program
  net: dsa: mt7530: fix null pointer dereferencing in port5 setup
  mptcp: add some missing pr_fmt defines
  net: phy: micrel: kszphy_resume(): add delay after genphy_resume() before accessing PHY registers
  net_sched: fix a missing refcnt in tcindex_init()
  net: stmmac: dwmac1000: fix out-of-bounds mac address reg setting
  mlxsw: spectrum_trap: fix unintention integer overflow on left shift
  pegasus: Remove pegasus' own workqueue
  neigh: support smaller retrans_time settting
  net: openvswitch: use hlist_for_each_entry_rcu instead of hlist_for_each_entry
  ...
2020-04-07 12:03:32 -07:00
Petr Machata
ccfc569347 mlxsw: spectrum_flower: Do not stop at FLOW_ACTION_VLAN_MANGLE
The handler for FLOW_ACTION_VLAN_MANGLE ends by returning whatever the
lower-level function that it calls returns. If there are more actions lined
up after this action, those are never offloaded. Fix by only bailing out
when the called function returns an error.

Fixes: a150201a70 ("mlxsw: spectrum: Add support for vlan modify TC action")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-06 10:14:00 -07:00
Petr Machata
0be0ae1441 mlxsw: spectrum_flower: Do not stop at FLOW_ACTION_PRIORITY
The handler for FLOW_ACTION_PRIORITY ends by returning whatever the
lower-level function that it calls returns. If there are more actions lined
up after this action, those are never offloaded. Fix by only bailing out
when the called function returns an error.

Fixes: 463957e3fb ("mlxsw: spectrum_flower: Offload FLOW_ACTION_PRIORITY")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-06 10:14:00 -07:00
Colin Ian King
468c2a1002 mlxsw: spectrum_trap: fix unintention integer overflow on left shift
Shifting the integer value 1 is evaluated using 32-bit
arithmetic and then used in an expression that expects a 64-bit
value, so there is potentially an integer overflow. Fix this
by using the BIT_ULL macro to perform the shift and avoid the
overflow.

Addresses-Coverity: ("Unintentional integer overflow")
Fixes: 13f2e64b94 ("mlxsw: spectrum_trap: Add devlink-trap policer support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-02 18:00:01 -07:00
Linus Torvalds
919dce2470 RDMA 5.7 pull request
The majority of the patches are cleanups, refactorings and clarity
 improvements
 
 - Various driver updates for siw, bnxt_re, rxe, efa, mlx5, hfi1
 
 - Lots of cleanup patches for hns
 
 - Convert more places to use refcount
 
 - Aggressively lock the RDMA CM code that syzkaller says isn't working
 
 - Work to clarify ib_cm
 
 - Use the new ib_device lifecycle model in bnxt_re
 
 - Fix mlx5's MR cache which seems to be failing more often with the new
   ODP code
 
 - mlx5 'dynamic uar' and 'tx steering' user interfaces
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl6CSr0ACgkQOG33FX4g
 mxrtKg//XovbOfYAO7nC05FtGz9iEkIUBiwQOjUojgNSi6RMNDqRW1bmqKUugm1o
 9nXA6tw+fueEvUNSD541SCcxkUZJzWvubO9wHB6N3Fgy68N3Vf2rKV3EBBTh99rK
 Cb7rnmTTN6izRyI1wdyP2sjDJGyF8zvsgIrG2sibzLnqlyScrnD98YS0FdPZUfOa
 a1mmXBN/T7eaQ4TbE3lLLzGnifRlGmZ5vxEvOQmAlOdqAfIKQdbbW7oCRLVIleso
 gfQlOOvIgzHRwQ3VrFa3i6ETYtywXq7EgmQxCjqPVJQjCA79n5TLBkP1iRhvn8xi
 3+LO4YCkiSJ/NjTA2d9KwT6K4djj3cYfcueuqo2MoXXr0YLiY6TLv1OffKcUIq7c
 LM3d4CSwIAG+C2FZwaQrdSGa2h/CNfLAEeKxv430zggeDNKlwHJPV5w3rUJ8lT56
 wlyT7Lzosl0O9Z/1BSLYckTvbBCtYcmanVyCfHG8EJKAM1/tXy5LS8btJ3e51rPu
 XekR9ELrTTA2CTuoSCQGP6J0dBD2U7qO4XRCQ9N5BYLrI6RdP7Z4xYzzey49Z3Cx
 JaF86eurM7nS5biUszTtwww8AJMyYicB+0VyjBfk+mhv90w8tS1vZ1aZKzaQ1L6Z
 jWn8WgIN4rWY0YGQs6PiovT1FplyGs3p1wNmjn92WO0wZZ3WsmQ=
 =ae+a
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "The majority of the patches are cleanups, refactorings and clarity
  improvements.

  This cycle saw some more activity from Syzkaller, I think we are now
  clean on all but one of those bugs, including the long standing and
  obnoxious rdma_cm locking design defect. Continue to see many drivers
  getting cleanups, with a few new user visible features.

  Summary:

   - Various driver updates for siw, bnxt_re, rxe, efa, mlx5, hfi1

   - Lots of cleanup patches for hns

   - Convert more places to use refcount

   - Aggressively lock the RDMA CM code that syzkaller says isn't
     working

   - Work to clarify ib_cm

   - Use the new ib_device lifecycle model in bnxt_re

   - Fix mlx5's MR cache which seems to be failing more often with the
     new ODP code

   - mlx5 'dynamic uar' and 'tx steering' user interfaces"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (144 commits)
  RDMA/bnxt_re: make bnxt_re_ib_init static
  IB/qib: Delete struct qib_ivdev.qp_rnd
  RDMA/hns: Fix uninitialized variable bug
  RDMA/hns: Modify the mask of QP number for CQE of hip08
  RDMA/hns: Reduce the maximum number of extend SGE per WQE
  RDMA/hns: Reduce PFC frames in congestion scenarios
  RDMA/mlx5: Add support for RDMA TX flow table
  net/mlx5: Add support for RDMA TX steering
  IB/hfi1: Call kobject_put() when kobject_init_and_add() fails
  IB/hfi1: Fix memory leaks in sysfs registration and unregistration
  IB/mlx5: Move to fully dynamic UAR mode once user space supports it
  IB/mlx5: Limit the scope of struct mlx5_bfreg_info to mlx5_ib
  IB/mlx5: Extend QP creation to get uar page index from user space
  IB/mlx5: Extend CQ creation to get uar page index from user space
  IB/mlx5: Expose UAR object and its alloc/destroy commands
  IB/hfi1: Get rid of a warning
  RDMA/hns: Remove redundant judgment of qp_type
  RDMA/hns: Remove redundant assignment of wc->smac when polling cq
  RDMA/hns: Remove redundant qpc setup operations
  RDMA/hns: Remove meaningless prints
  ...
2020-04-01 18:18:18 -07:00
Ido Schimmel
39defcbba0 mlxsw: spectrum_trap: Add support for setting of packet trap group parameters
Implement support for setting of packet trap group parameters by
invoking the trap_group_init() callback with the new parameters.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 17:54:59 -07:00
Ido Schimmel
d12d846821 mlxsw: spectrum_trap: Switch to use correct packet trap group
Some packet traps are currently exposed to user space as being member of
"l3_drops" trap group, but internally they are member of a different
group.

Switch these traps to use the correct group so that they are all subject
to the same policer, as exposed to user space.

Set the trap priority of packets trapped due to loopback error during
routing to the lowest priority. Such packets are not routed again by the
kernel and therefore should not mask other traps (e.g., host miss) that
should be routed.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 17:54:59 -07:00
Ido Schimmel
bc82521e3b mlxsw: spectrum_trap: Do not initialize dedicated discard policer
The policer is now initialized as part of the registration with devlink,
so there is no need to initialize it before the registration.

Remove the initialization.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 17:54:59 -07:00
Ido Schimmel
13f2e64b94 mlxsw: spectrum_trap: Add devlink-trap policer support
Register supported packet trap policers with devlink and implement
callbacks to change their parameters and read their counters.

Prevent user space from passing invalid policer parameters down to the
device by checking their validity and communicating the failure via an
appropriate extack message.

v2:
* Remove the max/min validity checks from __mlxsw_sp_trap_policer_set()

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 17:54:59 -07:00