Commit Graph

6331 Commits

Author SHA1 Message Date
Eran Ben Elisha
90bf1c8dbd net/mlx5: Move internal timer read function to clock library
Move mlx5_read_internal_timer() into the lib/clock.c file, as it is
used there. As such, make this function static.

In addition, rearrange header includes to support the function move.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:25 -07:00
Paul Blakey
49c0355d30 net/mlx5: Wait for inactive autogroups
Currently, if one thread tries to add an entry to an autogrouped table
with no free matching group, while another thread is in the process of
creating a new matching autogroup, it doesn't wait for the new group
creation, and creates an unnecessary new autogroup.

Instead of skipping inactive groups, wait on the write lock of those groups.

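A minimal sketch of the idea, assuming hypothetical fs_group/group_matches
types and helpers (the real mlx5 fs_core structures differ), where the
creating thread holds the group's rw_semaphore for write until the group
turns active:

  struct fs_group {
          struct list_head    list;
          struct rw_semaphore sem;     /* held for write during creation */
          bool                active;
  };

  static struct fs_group *find_group_or_wait(struct list_head *groups,
                                             const void *match)
  {
          struct fs_group *fg;

          list_for_each_entry(fg, groups, list) {
                  if (!group_matches(fg, match))  /* hypothetical matcher */
                          continue;
                  if (!fg->active) {
                          /* Wait for the creating thread to finish
                           * instead of spawning a duplicate autogroup. */
                          down_write(&fg->sem);
                          up_write(&fg->sem);
                  }
                  if (fg->active)
                          return fg;
          }
          return NULL;  /* nothing usable; caller creates a new autogroup */
  }
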
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:22 -07:00
Parav Pandit
41798df9bf net/mlx5: Drain wq first during PCI device removal
mlx5_unload_one() is done with cleanup = true only once.

So instead of draining the health wq inside the if(), do it directly
during PCI device removal.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:20 -07:00
Parav Pandit
4162f58b47 net/mlx5: Have single error unwinding path
Having multiple error unwinding paths is error prone.
Let's have just one error unwinding path.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:17 -07:00
Eran Ben Elisha
e7f860e210 net/mlx5: Fix a bug of releasing wrong chunks on > 4K page size systems
On systems with page size larger than 4K, a fwp object spans several 4K
chunks. Fix a bug in the fwp free flow where the chunk address was dropped
and fwp->addr (the first chunk's address) was used instead. This caused a
wrong update of fwp->bitmask, which could later cause errors in the
re-alloc fwp chunk flow.

In order to fix this, refactor the release flow:
- Free 4k: Releases a specific 4K chunk inside the fwp, identified by its
  starting address.
- Free fwp: Unconditionally releases the whole fwp and its resources.
Free addr calls free fwp once all chunks have been released, in order to
share code.

In addition, fix npages to correctly account for all released chunks.

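A sketch of the fixed free flow, assuming bookkeeping close to the driver's
pagealloc code (field and helper names are approximations): the bit index
must come from the chunk address being freed, not from fwp->addr.

  static void free_4k(struct mlx5_core_dev *dev, struct fw_page *fwp, u64 addr)
  {
          /* Index of this 4K chunk within the fwp's (larger) host page;
           * previously derived from fwp->addr, i.e. always chunk 0. */
          int n = (addr & ~PAGE_MASK) >> MLX5_ADAPTER_PAGE_SHIFT;

          fwp->free_count++;
          set_bit(n, &fwp->bitmask);

          /* Once every chunk is back, release the whole fwp in a single
           * place (free fwp), so both flows share the release code. */
          if (fwp->free_count == MLX5_NUM_4K_IN_PAGE)
                  free_fwp(dev, fwp);
  }
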
Fixes: c6168161f6 ("net/mlx5: Add support for release all pages event")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:15 -07:00
Eran Ben Elisha
2726cd4a29 net/mlx5: Dedicate fw page to the requesting function
The cited patch assumes that all chunks in a fw page belong to the same
function, thus the driver must dedicate the fw page to the requesting
function, which is actually what was intended in the original fw pages
allocator design, hence the fwp->func_id!

Up until the cited patch everything worked fine, but now "release all pages"
is broken on systems with page_size > 4K.

Fix this by dedicating the fw page to the requesting function id, by adding
a func_id parameter to the alloc_4k() function.

Fixes: c6168161f6 ("net/mlx5: Add support for release all pages event")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-15 15:44:12 -07:00
Jesper Dangaard Brouer
d628ee4fef mlx5: Rx queue setup time determine frame_sz for XDP
The mlx5 driver has multiple memory models, which also change
according to whether an XDP bpf_prog is attached.

The 'rx_striding_rq' setting is adjusted via ethtool priv-flags e.g.:
 # ethtool --set-priv-flags mlx5p2 rx_striding_rq off

In the general case with 4K page_size and a regular MTU packet,
the frame_sz is 2048, or 4096 when XDP is enabled, in both modes.

The info on the given frame size is stored differently depending on the
RQ-mode and encoded in a union in struct mlx5e_rq union wqe/mpwqe.
In rx striding mode rq->mpwqe.log_stride_sz is either 11 or 12, which
corresponds to 2048 or 4096 (MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ).
In non-striding mode (MLX5_WQ_TYPE_CYCLIC) the frag_stride is stored
in rq->wqe.info.arr[0].frag_stride, for the first fragment, which is
what the XDP case cares about.

To reduce the effect on the fast path, this patch determines the
frame_sz at setup time, to avoid determining the memory model at
runtime. The variable is named frame0_sz to make it clear that this
is only the frame size of the first fragment.

The mlx5 driver does a DMA-sync on the XDP_TX action, but growing is
safe as it has done a DMA-map on the entire PAGE_SIZE. The driver also
already does an XDP length check against sq->hw_mtu on the possible
XDP xmit paths mlx5e_xmit_xdp_frame() + mlx5e_xmit_xdp_frame_mpwqe().

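Roughly, the setup-time assignment looks like this (a simplified sketch;
the helper name is hypothetical, rq field names follow the commit text):

  static void mlx5e_rq_set_frame0_sz(struct mlx5e_rq *rq)
  {
          switch (rq->wq_type) {
          case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
                  /* log_stride_sz is 11 or 12 -> 2048 or 4096 bytes */
                  rq->buff.frame0_sz = 1 << rq->mpwqe.log_stride_sz;
                  break;
          default: /* MLX5_WQ_TYPE_CYCLIC */
                  /* XDP only cares about the first fragment's stride */
                  rq->buff.frame0_sz = rq->wqe.info.arr[0].frag_stride;
                  break;
          }
  }
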
V3+4: Change variable name first_frame_sz to frame0_sz

V2: Fix frag_size so that it is recalculated before creating the SKB.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Link: https://lore.kernel.org/bpf/158945348021.97035.12295039384250022883.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer
d201ea9ebc mlx4: Add XDP frame size and adjust max XDP MTU
In the mlx4 driver, the size of the memory backing the RX packet is
stored in frag_stride. In XDP mode this is PAGE_SIZE (normally 4096);
in normal mode frag_stride is 2048.

Also adjust MLX4_EN_MAX_XDP_MTU to take tailroom into account.

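Sketched from the commit message (not the exact diff), the adjusted limit
plausibly subtracts the skb_shared_info tailroom alongside the existing
headroom; ETH_HLEN, VLAN_HLEN, XDP_PACKET_HEADROOM and SKB_DATA_ALIGN are
standard kernel macros:

  #define MLX4_EN_MAX_XDP_MTU ((int)(PAGE_SIZE - ETH_HLEN - (2 * VLAN_HLEN) - \
                                     XDP_PACKET_HEADROOM -                    \
                                     SKB_DATA_ALIGN(sizeof(struct skb_shared_info))))
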
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Link: https://lore.kernel.org/bpf/158945341893.97035.2688142527052329942.stgit@firesoul
2020-05-14 21:21:55 -07:00
Jiri Pirko
67ed68fc0c mlxsw: spectrum_flower: Forbid to insert flower rules in collision with matchall rules
On ingress, the matchall rules doing mirroring and sampling are offloaded
into hardware blocks that are processed before any flower rules.
On egress, the matchall mirroring rules are offloaded into a hardware
block that is processed after all flower rules.

Therefore, check the priorities of inserted flower rules against
existing matchall rules and ensure the correct ordering.

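A hedged sketch of the ordering check (helper and parameter names are
assumed; ma_min_prio/ma_max_prio would come from the min/max priority
helpers added in this series). Lower numeric tc priority means matched
first:

  static int mlxsw_sp_flower_prio_check(bool ingress, u32 prio,
                                        u32 ma_min_prio, u32 ma_max_prio)
  {
          /* Ingress matchall HW blocks run before any flower rule, so a
           * flower rule must not claim a priority at or above (i.e. a
           * numeric value at or below) any matchall rule's priority. */
          if (ingress && prio <= ma_max_prio)
                  return -EOPNOTSUPP;
          /* Egress matchall runs after all flower rules: the opposite. */
          if (!ingress && prio >= ma_min_prio)
                  return -EOPNOTSUPP;
          return 0;
  }
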
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09 16:02:43 -07:00
Jiri Pirko
18346b70ab mlxsw: spectrum_matchall: Forbid to insert matchall rules in collision with flower rules
On ingress, the matchall rules doing mirroring and sampling are offloaded
into hardware blocks that are processed before any flower rules.
On egress, the matchall mirroring rules are offloaded into a hardware
block that is processed after all flower rules.

Therefore, check the priorities of inserted matchall rules against
existing flower rules and ensure the correct ordering.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09 16:02:43 -07:00
Jiri Pirko
aed65285fb mlxsw: spectrum_matchall: Expose a function to get min and max rule priority
Introduce infrastructure that allows getting the minimum and maximum
rule priority for a specified chain. This is going to be used by
a subsequent patch to enforce ordering between flower and
matchall filters.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09 16:02:43 -07:00
Jiri Pirko
5a2939b9d7 mlxsw: spectrum_matchall: Put matchall list into substruct of flow struct
As there are going to be other matchall-specific fields in the flow
structure, put the existing list field into a matchall substruct.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09 16:02:43 -07:00
Jiri Pirko
593bb84379 mlxsw: spectrum_flower: Expose a function to get min and max rule priority
Introduce infrastructure that allows getting the minimum and maximum
rule priority for a specified chain. This is going to be used by
a subsequent patch to enforce ordering between flower and
matchall filters.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09 16:02:43 -07:00
Jiri Pirko
18aa23b31f mlxsw: spectrum_matchall: Restrict sample action to be allowed only on ingress
HW supports packet sampling on ingress only. Check and fail if the
user is adding a sample action on egress.

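A minimal sketch of the restriction (the field names and extack message
are illustrative, not the exact diff):

  if (!mall_entry->ingress) {
          NL_SET_ERR_MSG(f->common.extack,
                         "Sample is not supported on egress");
          return -EOPNOTSUPP;
  }
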
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-09 16:02:43 -07:00
Tariq Toukan
28bff09518 net/mlx5e: Enhance ICOSQ WQE info fields
The same WQE opcode might be used in different ICOSQ flows
and WQE types.
For better distinguishability, replace it with an enum that better
indicates the WQE type and the flow it is used for.

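Such an enum plausibly takes the following shape (a sketch; member names
may differ), making the WQE's purpose explicit instead of reusing the raw
opcode:

  enum mlx5e_icosq_wqe_type {
          MLX5E_ICOSQ_WQE_NOP,     /* filler posted for WQ alignment */
          MLX5E_ICOSQ_WQE_UMR_RX,  /* UMR posted for striding RQ RX */
  };
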
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:42 -07:00
Tariq Toukan
6b74f60ef5 net/mlx5: Accel, Remove unnecessary header include
The include of the Ethernet driver header in core code is not needed
and is actually wrong.
Remove it.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:42 -07:00
Tariq Toukan
41a8e4ebb4 net/mlx5e: Use struct assignment for WQE info updates
Struct assignment looks cleaner, and implies resetting the unassigned
fields to zero, instead of keeping values from older ring cycles.

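A sketch of the pattern (field names are illustrative): a compound-literal
assignment implicitly zeroes every member that is not named, unlike
per-field updates, which keep whatever the slot held on the previous ring
cycle.

  *wi = (struct mlx5e_tx_wqe_info) {
          .skb        = skb,
          .num_bytes  = num_bytes,
          .num_wqebbs = num_wqebbs,
          /* all remaining fields are implicitly reset to zero */
  };
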
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:41 -07:00
Tariq Toukan
05dfd57082 net/mlx5e: Take TX WQE info structures out of general EN header
Move them into the txrx header file.
The mlx5e_sq_wqe_info structure describes WQE info for the ICOSQ;
rename it to better reflect this.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:41 -07:00
Tariq Toukan
f713ce1de8 net/mlx5e: kTLS, Do not fill edge for the DUMP WQEs in TX flow
Every DUMP WQE resides in a single WQEBB.
As the pi is calculated for each one separately, there is no real need
for contiguous room for them; allow them to populate different WQ
fragments.
This reduces WQ waste and improves its utilization.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:41 -07:00
Tariq Toukan
ab1e0ce99d net/mlx5e: kTLS, Fill work queue edge separately in TX flow
For the static and progress context params WQEs, do the edge
filling separately.
This improves WQ utilization and code readability, and reduces the
chance of future bugs.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:40 -07:00
Maxim Mikityanskiy
714c88a38b net/mlx5e: Split TX acceleration offloads into two phases
After the previous modifications, the offloads are no longer called
one by one; the pi is calculated and the wqe is cleared in between the
TLS and IPSEC offloads, which doesn't quite fit mlx5e_accel_handle_tx's
purpose.

This patch splits mlx5e_accel_handle_tx into two functions that
correspond to two logical phases of running offloads:

1. Before fetching a WQE. Here runs the code that can post WQEs on its
own, before the main WQE is fetched. It's the main part of TLS offload.

2. After fetching a WQE. Here runs the code that updates the WQE's
fields, but can't post other WQEs any more. It's a minor part of TLS
offload that sets the tisn field in the cseg, and eseg-based offloads
(currently IPSEC, and later patches will move GENEVE and checksum
offloads there, too).

This allows mlx5e_xmit to take care of all actions needed to transmit
a packet in the right order, improves the structure of the code, and
reduces unnecessary operations. The structure will be further improved in
the following patches (all eseg-based offloads will be moved to a single
place, and reserving space for the main WQE will happen between phase 1
and phase 2 of offloads to eliminate unneeded data movements).

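Sketched against the description above, the resulting transmit path looks
roughly like this (helper signatures are approximate, not the exact diff):

  netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
  {
          struct mlx5e_priv *priv = netdev_priv(dev);
          struct mlx5e_accel_tx_state accel = {};
          struct mlx5e_txqsq *sq;
          struct mlx5e_tx_wqe *wqe;
          u16 pi;

          sq = priv->txq2sq[skb_get_queue_mapping(skb)];

          /* Phase 1: offloads that may post WQEs of their own (TLS),
           * run before the main WQE is fetched. */
          if (unlikely(!mlx5e_accel_tx_begin(dev, sq, skb, &accel)))
                  return NETDEV_TX_OK;

          pi  = mlx5_wq_cyc_ctr2ix(&sq->wq, sq->pc);
          wqe = MLX5E_TX_FETCH_WQE(sq, pi);

          /* Phase 2: offloads that only fill fields of the fetched WQE
           * (tisn in the cseg, eseg-based offloads such as IPSEC). */
          mlx5e_accel_tx_finish(priv, sq, skb, wqe, &accel);

          mlx5e_sq_xmit(sq, skb, wqe, pi);
          return NETDEV_TX_OK;
  }
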
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:40 -07:00
Maxim Mikityanskiy
5546100038 net/mlx5e: Update UDP fields of the SKB for GSO first
mlx5e_udp_gso_handle_tx_skb updates the length field in the UDP header
in case of GSO. It doesn't interfere with other offloads, so do it first
to simplify further restructuring of the code. This way we'll make all
independent modifications to the SKB before starting to work with WQEs.

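The helper itself is small; a sketch close to the driver's shape:

  static inline void mlx5e_udp_gso_handle_tx_skb(struct sk_buff *skb)
  {
          int payload_len = skb_shinfo(skb)->gso_size + sizeof(struct udphdr);

          /* Only the SKB is touched here, so this can safely run
           * before any WQE-related work begins. */
          udp_hdr(skb)->len = htons(payload_len);
  }
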
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:40 -07:00
Maxim Mikityanskiy
2eeb6e3841 net/mlx5e: Make TLS offload independent of wqe and pi
TLS offload may write a 32-bit field (tisn) to the cseg of the WQE. To
do that, it receives pi and wqe pointers. As TLS offload may also send
additional WQEs, it has to update pi and wqe, and in many cases it
doesn't even use the pi calculated before or the wqe zeroed before, and
does the work itself. Also, mlx5e_sq_xmit has to copy the whole cseg if
it goes to the mlx5e_fill_sq_frag_edge flow. All of this is inefficient.

It's more efficient to do the following:

1. Just return tisn from TLS offload and make the caller fill it in a
more appropriate place.

2. Calculate pi and clear wqe after calling TLS offload.

3. If TLS offload has to send WQEs, calculate pi and clear wqe just
before that. It's already done in all places anyway, so this commit
allows removing some redundant memsets and calls.

Copying of cseg will be eliminated in one of the following commits, and
all other stuff is done here.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:40 -07:00
Maxim Mikityanskiy
0bdb078c74 net/mlx5e: Pass only eseg to IPSEC offload
IPSEC offload needs to modify the eseg of the WQE that is being filled,
but it receives a pointer to the whole WQE. To make the contract
stricter, pass only the pointer to the eseg of that WQE. This commit is
preparation for the following refactoring of offloads in the TX path and
for the MPWQE support.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:39 -07:00
Maxim Mikityanskiy
3df711db05 net/mlx5e: Return void from mlx5e_sq_xmit and mlx5i_sq_xmit
mlx5e_sq_xmit and mlx5i_sq_xmit always return NETDEV_TX_OK. Drop the
return value to simplify the code.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:39 -07:00
Maxim Mikityanskiy
7f8546f3f0 net/mlx5e: Unify checks of TLS offloads
Both INNOVA and ConnectX TLS offloads perform the same checks in the
beginning. Unify them to reduce repeating code. Do WARN_ON_ONCE on
netdev mismatch and finish with an error in both offloads, not only in
the ConnectX one.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:39 -07:00
Maxim Mikityanskiy
f02bac9ad6 net/mlx5e: Return bool from TLS and IPSEC offloads
TLS and IPSEC offloads currently return struct sk_buff *, but the value
is either NULL or the same skb that was passed as a parameter. Return
bool instead to provide stronger guarantees to the calling code (it
won't need to support handling a different SKB that could be potentially
returned before this change) and to simplify restructuring this code in
the following commits.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:39 -07:00
Saeed Mahameed
76cd622fe2 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
This merge includes updates to bonding driver needed for the rdma stack,
to avoid conflicts with the RDMA branch.

Maor Gottlieb says:

====================
Bonding: Add support to get xmit slave

The following series adds support to get the LAG master xmit slave by
introducing new .ndo - ndo_get_xmit_slave. Every LAG module can
implement it and it first implemented in the bond driver.
This is follow-up to the RFC discussion [1].

The main motivation for doing this is for drivers that offload part
of the LAG functionality. For example, Mellanox Connect-X hardware
implements RoCE LAG, which selects the TX affinity when the resources
are created, and the port is remapped when it goes down.

The first part of this patchset introduces the new .ndo and add the
support to the bonding module.

The second part adds support to get the RoCE LAG xmit slave by building
an skb of the RoCE packet based on the AH attributes and calling the
new .ndo.

The third part changes the mlx5 driver to set the QP's affinity port
according to the slave found by the .ndo.
====================

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-09 01:05:30 -07:00
Jacob Keller
c75a33c84b net: remove newlines in NL_SET_ERR_MSG_MOD
The NL_SET_ERR_MSG_MOD macro is used to report a string describing an
error message to userspace via the netlink extended ACK structure. It
should not have a trailing newline.

Add a cocci script which catches cases where the newline marker is
present. Using this script, fix the handful of cases which accidentally
included a trailing newline.

I couldn't figure out a way to get a patch mode working, so this script
only implements context, report, and org.

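For illustration (hypothetical call sites), the convention the cocci
script enforces:

  NL_SET_ERR_MSG_MOD(extack, "Unsupported action");   /* correct */
  NL_SET_ERR_MSG_MOD(extack, "Unsupported action\n"); /* flagged: trailing newline */
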
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-07 17:56:14 -07:00
Jason Yan
5a7c45097c net: mlx4: remove unneeded variable "err" in mlx4_en_ethtool_add_mac_rule()
Fix the following coccicheck warning:

drivers/net/ethernet/mellanox/mlx4/en_ethtool.c:1396:5-8: Unneeded
variable: "err". Return "0" on line 1411

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-07 13:04:21 -07:00
David S. Miller
3793faad7b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts were all overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06 22:10:13 -07:00
Pablo Neira Ayuso
16f8036086 net: flow_offload: skip hw stats check for FLOW_ACTION_HW_STATS_DONT_CARE
This patch adds FLOW_ACTION_HW_STATS_DONT_CARE which tells the driver
that the frontend does not need counters, this hw stats type request
never fails. The FLOW_ACTION_HW_STATS_DISABLED type explicitly requests
the driver to disable the stats; however, if the driver cannot disable
counters, it bails out.

TCA_ACT_HW_STATS_* maintains the 1:1 mapping with FLOW_ACTION_HW_STATS_*,
except for disabled, which is mapped to FLOW_ACTION_HW_STATS_DISABLED
(this is 0 in tc). Add tc_act_hw_stats() to perform the mapping between
TCA_ACT_HW_STATS_* and FLOW_ACTION_HW_STATS_*.

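A hedged sketch of that mapping (details may differ from the exact diff):
the TCA_ACT_HW_STATS_* bits map 1:1 onto FLOW_ACTION_HW_STATS_*, except
tc's "disabled" (value 0), which maps to the explicit DISABLED flag.

  static enum flow_action_hw_stats tc_act_hw_stats(u8 hw_stats)
  {
          if (WARN_ON_ONCE(hw_stats > TCA_ACT_HW_STATS_ANY))
                  return FLOW_ACTION_HW_STATS_DONT_CARE;
          if (!hw_stats)
                  return FLOW_ACTION_HW_STATS_DISABLED;
          return hw_stats;  /* bit values coincide otherwise */
  }
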
Fixes: 319a1d1947 ("flow_offload: check for basic action hw stats type")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06 20:13:10 -07:00
Jason Yan
f9cbf19c7f net: mlx4: remove unneeded variable "err" in mlx4_en_get_rxfh()
Fix the following coccicheck warning:

drivers/net/ethernet/mellanox/mlx4/en_ethtool.c:1238:5-8: Unneeded
variable: "err". Return "0" on line 1252

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06 13:57:30 -07:00
Tariq Toukan
40e473071d net/mlx4_core: Fix use of ENOSPC around mlx4_counter_alloc()
When ENOSPC is returned, the idx is still valid and gets set to the
global MLX4_SINK_COUNTER_INDEX. However, gcc's static analysis cannot
tell that ENOSPC is impossible from mlx4_cmd_imm() and gives this warning:

drivers/net/ethernet/mellanox/mlx4/main.c:2552:28: warning: 'idx' may be
used uninitialized in this function [-Wmaybe-uninitialized]
 2552 |    priv->def_counter[port] = idx;

Also, when ENOSPC is returned, mlx4_allocate_default_counters should
not fail.

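An illustrative shape of the fix (simplified, not the exact diff):
initialize idx so it is defined on every path, and treat ENOSPC as
success, since the sink counter index is still usable.

  u32 idx = MLX4_SINK_COUNTER_INDEX(dev);
  int err = mlx4_counter_alloc(dev, &idx, MLX4_RES_USAGE_DRIVER);

  if (err == -ENOSPC)
          err = 0;  /* idx is the sink counter index, still valid */
  if (err)
          return err;
  priv->def_counter[port] = idx;
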
Fixes: 6de5f7f6a1 ("net/mlx4_core: Allocate default counter per port")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-04 10:42:06 -07:00
Maor Gottlieb
c6bc6041b1 net/mlx5: Add support to get lag physical port
Add function to get the device physical port of the lag slave.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-01 12:15:38 -07:00
Maor Gottlieb
64363e61c7 net/mlx5: Change lag mutex lock to spin lock
The lag lock can be a spin lock: the critical section is short, and
there is no need for the thread to sleep.
Change the lock that protects the LAG structure from a mutex to a spin
lock. This is required for the next patch, which needs to access this
structure from a context in which we can't sleep.
In addition, there is no need to hold this lock when querying the
congestion counters.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-05-01 12:15:38 -07:00
Jiri Pirko
6ef4889fc0 mlxsw: spectrum_acl_tcam: Position vchunk in a vregion list properly
Vregion helpers to get the min and max priority depend on the correct
ordering of vchunks in the vregion list. However, the current code
always adds a new vchunk to the end of the list, no matter what its
priority is. Fix this by finding the correct place in the list and
putting the vchunk there.

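A sketch of the ordered insertion (the helper name is hypothetical; in
the driver this logic sits in the vchunk creation path): walk the
vregion's vchunk list and insert before the first vchunk with a higher
priority, keeping the list sorted so the min/max helpers stay correct.

  static void vchunk_list_add(struct mlxsw_sp_acl_tcam_vregion *vregion,
                              struct mlxsw_sp_acl_tcam_vchunk *vchunk)
  {
          struct mlxsw_sp_acl_tcam_vchunk *pos;

          list_for_each_entry(pos, &vregion->vchunk_list, list) {
                  if (pos->priority > vchunk->priority)
                          break;
          }
          /* list_add_tail() before 'pos' inserts vchunk in front of it;
           * if the loop ran through, this appends at the list's end. */
          list_add_tail(&vchunk->list, &pos->list);
  }
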
Fixes: 22a677661f ("mlxsw: spectrum: Introduce ACL core with simple TCAM implementation")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 20:41:14 -07:00
David S. Miller
3857c77624 mlx5-updates-2020-04-30

Merge tag 'mlx5-updates-2020-04-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-04-30

1) Add release all pages support, from Eran: release all FW pages at
   once on driver unload, when supported by FW.

2) From Maxim and Tariq, trivial data path cleanups and code
   improvements in preparation for their next features: TLS offload and
   TX performance improvements.

3) Multiple cleanups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:15:16 -07:00
Ido Schimmel
ca0892235a mlxsw: spectrum_span: Remove old SPAN API
Remove the old SPAN API now that matchall-based and flower-based
mirroring were converted to use the new API.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
835d6b8c1a mlxsw: spectrum_span: Use new analyzed ports list during speed / MTU change
As previously explained, each port whose outgoing traffic is analyzed
needs to have an egress mirror buffer.

The size of the egress mirror buffer is calculated based on various
parameters, two of which are the speed and the MTU of the port.

Therefore, when the MTU or the speed of a port change, the SPAN code is
called to see if the egress mirror buffer of the port needs to be
adjusted.

Currently, this is done by traversing all the SPAN agents and for each
SPAN agent the list of bound ports is traversed.

Instead of the above, traverse the recently added list of analyzed
ports.

This will later allow us to remove the old SPAN API.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
7240db69c3 mlxsw: spectrum_acl: Convert flower-based mirroring to new SPAN API
In flower-based mirroring, mirroring is done with ACLs and the SPAN
agent is not bound to a port. Instead, its identifier is specified in an
ACL action.

Convert this type of mirroring to use the new API.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
c1d7845dfb mlxsw: spectrum: Convert matchall-based mirroring to new SPAN API
In matchall-based mirroring, mirroring is not done with ACLs, but a SPAN
agent is bound to the ingress / egress of a port and all incoming /
outgoing traffic is mirrored.

Convert this type of mirroring to use the new API.

First the SPAN agent is resolved, then the port is marked as analyzed
and its egress mirror buffer is potentially allocated. Lastly, the
binding is performed.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
c056618c53 mlxsw: spectrum_span: Add APIs to bind / unbind a SPAN agent
Currently, a SPAN agent can only be bound to a per-port trigger where
the trigger is either an incoming packet (INGRESS) or an outgoing packet
(EGRESS) to / from the port.

A follow-up patch set will introduce the concept of global triggers and
per-{port, TC} enablement. With global triggers, the trigger entry is
only keyed by a trigger and not by a port and a trigger. The trigger can
be, for example, a packet that was early dropped.

While the binding between the SPAN agent and the trigger is performed
only once, the trigger entry needs to be reference counted, as the
trigger can be enabled on multiple ports.

Add APIs to bind / unbind a SPAN agent to a trigger and reference count
the trigger entry in preparation for global triggers.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
14366da6b5 mlxsw: spectrum_span: Wrap buffer change in a function
The code that adjusts the egress buffer size is not symmetric at the
moment. The update is done via a call to
mlxsw_sp_span_port_buffer_update(), but the disablement is done inline
by writing to the SBIB register directly.

Wrap the disablement code in mlxsw_sp_span_port_buffer_disable().

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Suggested-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
eb773c3a2d mlxsw: spectrum_span: Rename function
The next patch will introduce the mlxsw_sp_span_port_buffer_disable() function
that disables the egress buffer on an analyzed port. Rename the opposite
function that updates the buffer on an analyzed port accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Suggested-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
ed04458d4a mlxsw: spectrum_span: Add APIs to get / put an analyzed port
An analyzed port is a port whose incoming / outgoing traffic is mirrored
to a SPAN agent and analyzed on a remote server.

A port can be analyzed by multiple tc filters and therefore the
corresponding analyzed port entry needs to be reference counted. This is
significant because ports whose outgoing traffic is analyzed need to
have an egress mirror buffer.

Add APIs to get / put an analyzed port. Allocate an egress mirror buffer
on a port when it is first inspected at egress and free the buffer when
it is no longer inspected at egress.

Protect the list of analyzed ports with a mutex, as a later patch will
traverse it from a context in which the RTNL lock is not held.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Ido Schimmel
466010342e mlxsw: spectrum_span: Add APIs to get / put a SPAN agent
Given a netdev that packets should be mirrored to, create a SPAN agent
and return its identifier to the caller.

The SPAN agent is reference counted, as multiple tc-mirred actions can
point to the same destination netdev.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30 13:02:32 -07:00
Maxim Mikityanskiy
ec9cdca066 net/mlx5e: Unify reserving space for WQEs
In our fast-path design, a WQE (Work Queue Element) must not cross the
page boundary. To enforce that, for WQEs consisting of more than one BB
(Basic Block), the driver checks the available contiguous space in the
WQ in advance, and if it's not enough, it pads it with NOPs.

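A hedged sketch of the shared reservation pattern (per-SQ-type helper
names vary in the actual code, and the wqe_info bookkeeping for the
posted NOPs is omitted here):

  static u16 mlx5e_txqsq_get_next_pi(struct mlx5e_txqsq *sq, u16 size)
  {
          struct mlx5_wq_cyc *wq = &sq->wq;
          u16 pi, contig_wqebbs;

          pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
          contig_wqebbs = mlx5_wq_cyc_get_contig_wqebbs(wq, pi);
          if (unlikely(contig_wqebbs < size)) {
                  /* Not enough room before the fragment boundary: pad
                   * with NOPs and start on the next WQ fragment. */
                  while (contig_wqebbs--)
                          mlx5e_post_nop(wq, sq->sqn, &sq->pc);
                  pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
          }
          return pi;
  }
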
This patch modifies the code that calculates the position of the next WQE,
considering the padding, and prepares the WQE. This code is common for
all SQ types. In this patch it's reorganized in a way that makes the
usage pattern unified for all SQ types, and makes the implementations
self-contained and look almost the same, preparing the repeating code to
further attempts to deduplicate it.

One place is left as is: mlx5e_sq_xmit and the mlx5e_fill_sq_frag_edge
call inside it, because it is special in that it may also copy the
WQE's cseg and eseg when reserving space. This will be eliminated in one of the
following patches, and this place will be converted to the new approach,
too.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:46 -07:00
Maxim Mikityanskiy
7d42c8e9ab net/mlx5e: Rename ICOSQ WQE info struct and field
Structs mlx5e_txqsq and mlx5e_xdpsq contain wqe_info arrays to store
supplementary information corresponding to WQEs in the queue. Struct
mlx5e_icosq also has such an array, but it's called differently -
ico_wqe. This patch renames it to unify with the other SQs.

In addition, rename the struct to emphasize its specific usage.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:45 -07:00
Maxim Mikityanskiy
fed0c6cfcd net/mlx5e: Fetch WQE: reuse code and enforce typing
There are multiple mlx5{e,i}_*_fetch_wqe functions that repeat the same
code because they operate on different SQ struct types.
mlx5e_sq_fetch_wqe also returns void *, instead of the concrete
WQE type.

This commit generalizes the fetch WQE operation by putting this code
into a single function. To simplify calls of the generic function in
concrete use cases, macros are provided that substitute the right WQE
size and cast the return type.

Before this patch, fetch_wqe used to calculate pi itself, but the value
was often known to the caller. This calculation is moved outside to
eliminate this unnecessary step and prepare for the fill_frag_edge
refactoring in the next patch.

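Sketched from the description, the generic fetch plus a casting macro
(names follow the series; details are approximate):

  static inline void *mlx5e_fetch_wqe(struct mlx5_wq_cyc *wq, u16 pi,
                                      size_t wqe_size)
  {
          void *wqe = mlx5_wq_cyc_get_wqe(wq, pi);

          /* Callers pass a pi they already know; the WQE is zeroed
           * here so every SQ type gets the same clean starting state. */
          memset(wqe, 0, wqe_size);
          return wqe;
  }

  /* Concrete users get the right size and return type via thin macros: */
  #define MLX5E_TX_FETCH_WQE(sq, pi) \
          ((struct mlx5e_tx_wqe *)mlx5e_fetch_wqe(&(sq)->wq, pi, \
                                                  sizeof(struct mlx5e_tx_wqe)))
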
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30 10:10:45 -07:00