Commit 1c30d1836a ("mlxsw: spectrum: Enable VxLAN enslavement to
bridges") enabled the enslavement of VxLAN devices to bridges that have
mlxsw ports (or their upper) as slaves. This patch extends mlxsw to also
support VLAN-aware bridges.
The patch is similar in nature to mentioned commit, but there is one
major difference. With VLAN-aware bridges, the VxLAN device's VNI is
mapped to the VLAN that is configured as PVID and egress untagged on the
bridge port.
Therefore, the driver is extended to listen to VLAN configuration on
VxLAN devices of interest and enable / disable NVE encapsulation on the
corresponding 802.1Q FIDs.
To prevent ambiguity, the driver makes sure that a given VLAN is not
configured as PVID and egress untagged on multiple VxLAN devices. This
sanitization takes place both when a port is enslaved to a bridge with
existing VxLAN devices and when a VLAN is added to / removed from a
VxLAN device of interest.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The vxlan_join() function resolves the FID on which the VNI should be
set and then sets the VNI. Currently, the FID is simply resolved
according to the ifindex of the bridge device to which the VxLAN device
is enslaved. This works because only VLAN-unaware bridges are supported.
With VLAN-aware bridges the FID would need to be resolved based on the
VLAN to which the VNI is mapped to.
Add the VLAN ID to the argument list of the function.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function mlxsw_sp_bridge_vxlan_leave() is currently split between
VLAN-aware and VLAN-unaware bridges, but actually both types can use the
same function.
The function needs to resolve the FID that corresponds to the VxLAN
device and disable NVE encapsulation on it. Instead of looking up the
FID differently for VLAN-aware and VLAN-unaware bridges, we can always
use the VxLAN's device VNI.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a similar fashion to commit 564c6d727a ("mlxsw: spectrum_fid: Add
APIs to lookup FID without creating it"), add a corresponding API to
lookup 802.1Q FIDs.
This is a prerequisite to VxLAN support with VLAN-aware bridges and will
allow us to resolve a 802.1Q FID by its VLAN when an FDB entry is added
on the bridge port of the VxLAN device.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace 802.1Q FIDs and VLAN RIFs with their emulated counterparts.
The emulated 802.1Q FIDs are actually 802.1D FIDs and thus use the same
flood tables, of per-FID type. Therefore, add 4K-1 entries to the
per-FID flood tables for the new FIDs and get rid of the FID-offset
flood tables that were used by the old 802.1Q FIDs.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Router interfaces (RIFs) constructed on top of VLAN-aware bridges are of
"VLAN" type, whereas RIFs constructed on top of VLAN-unaware bridges of
"FID" type.
In other words, the RIF type is derived from the underlying FID type.
VLAN RIFs are used on top of 802.1Q FIDs, whereas FID RIFs are used on
top of 802.1D FIDs.
Since the previous patch emulated 802.1Q FIDs using 802.1D FIDs, this
patch emulates VLAN RIFs using FID RIFs.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver uses 802.1Q FIDs when offloading a VLAN-aware bridge.
Unfortunately, it is not possible to assign a VNI to such FIDs, which
prompts the driver to forbid the enslavement of VxLAN devices to a
VLAN-aware bridge.
Workaround this hardware limitation by creating a new family of FIDs,
emulated 802.1Q FIDs. These FIDs are emulated using 802.1D FIDs, which
can be assigned a VNI.
The downside of this approach is that multiple {Port, VID}->FID entries
are required, whereas only a single VID->FID is required with "true"
802.1Q FIDs.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
802.1D FIDs use a per-FID flood table, where the flood index into the
table is calculated by subtracting 4K from the FID's index.
Currently, 802.1D FIDs start at 4K, so the calculation is correct, but
if it was ever to change, the calculation will no longer be correct.
In addition, this change will allow us to reuse the flood index
calculation function in the next patch, where we are going to emulate
802.1Q FIDs using 802.1D FIDs.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When configuring an FDB entry pointing to a LAG netdev (or its upper),
the driver should only set the 'lag_vid' field when the FID (filtering
identifier) is of 802.1D type.
Extend the 802.1D FID family with an attribute indicating whether this
field should be set and based on its value set the field or leave it
blank.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop switchdev_ops.switchdev_port_obj_add and _del. Drop the uses of
this field from all clients, which were migrated to use switchdev
notification in the previous patches.
Add a new function switchdev_port_obj_notify() that sends the switchdev
notifications SWITCHDEV_PORT_OBJ_ADD and _DEL.
Update switchdev_port_obj_del_now() to dispatch to this new function.
Drop __switchdev_port_obj_add() and update switchdev_port_obj_add()
likewise.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following patches will change the way of distributing port object
changes from a switchdev operation to a switchdev notifier. The
switchdev code currently recursively descends through layers of lower
devices, eventually calling the op on a front-panel port device. The
notifier will instead be sent referencing the bridge port device, which
may be a stacking device that's one of front-panel ports uppers, or a
completely unrelated device.
To handle SWITCHDEV_PORT_OBJ_ADD and _DEL, subscribe to the blocking
notifier chain. Dispatch to mlxsw_sp_port_obj_add() resp. _del() to
maintain the behavior that the switchdev operation based code currently
has. Defer to switchdev_handle_port_obj_add() / _del() to handle the
recursive descend, because mlxsw supports a number of upper types.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Perform CQ initialization in the driver when the capability is supported
by the FW. When passing the CQ to HW indicate that the CQ buffer has
been pre-initialized.
Doing so decreases CQ creation time. Testing on P8 showed a single 2048
entry CQ creation time was reduced from ~395us to ~170us, which is
2.3x faster.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Up until now the driver returned an error when learning was enabled on a
VxLAN device enslaved to an offloaded bridge.
Previous patches added VxLAN learning support, so remove the check.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow users to delete learned FDB entries from the bridge's FDB before
enabling VxLAN learning.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Start processing two new entry types in addition to current ones:
* Learned unicast tunnel entry
* Aged-out unicast tunnel entry
In both cases the device reports on a new {MAC, FID, IP address} tuple
that was learned / aged-out. Based on this notification, the driver
instructs the device to add / delete the entry to / from its database.
The driver also makes sure to notify the bridge and VxLAN drivers about
the new entry.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FDB notifications for entries learned from an NVE tunnel contain the IP
address of the remote VTEP. In the case of IPv4 underlay, the IP address
is specified as-is. IPv6 addresses on the other hand, are specified as
handles which then need to be used to query the actual address from the
device.
Only IPv4 underlay is currently supported, so we cannot receive
notifications for IPv6 addresses and therefore an error is returned when
one tries to resolve such an address.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When processing a notification about a new FDB entry learned from a
VxLAN tunnel, the driver is provided with the FID index among other
parameters.
The driver potentially needs to update the bridge and VxLAN drivers
about the new entry using a pointer to the VxLAN device and the
corresponding VNI.
These two parameters are stored in the FID, so add a new function that
allows looking up a FID based on its index.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver periodically polls for new FDB entries learned by the device.
In the case of an FDB entry learned from a VxLAN tunnel, the
notification includes the IP of the remote VTEP, the filtering
identifier (FID) and the source MAC address of the overlay packet.
Assuming learning is enabled in the VxLAN and bridge drivers, the driver
needs to generate a notification and update them about the new FDB
entry.
Store the ifindex of the NVE device in the FID so that the driver will
be able to update the VxLAN and bridge drivers using it.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Will be used to process learned FDB records from an NVE tunnel.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend cooling device with cooling levels vector to allow more
flexibility of PWM setting.
Thermal zone algorithm operates with the numerical states for PWM
setting. Each state is the index, defined in range from 0 to 10 and it's
mapped to the relevant duty cycle value, which is written to PWM
controller. With the current definition fan speed is set to 0% for state
0, 10% for state 1, and so on up to 100% for the maximum state 10.
Some systems have limitation for the PWM speed minimum. For such systems
PWM setting speed to 0% will just disable the ability to increase speed
anymore and such device will be stall on zero speed. Cooling levels
allow to configure state vector according to the particular system
requirements. For example, if PWM speed is not allowed to be below 30%,
cooling levels could be configured as 30%, 30%, 30%, 30%, 40%, 50% and
so on.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJb80jFAAoJEEg/ir3gV/o+eUEH/0/bUhiaxaN88VFZXA24WlU2
v5jaeMg1P3ZuWYNS/AxGWedYWYr7oOgin79/4LIBKJMFdTdLGS2Oa0cTi0ycSNpx
0WMhVchClX3/UNTn6u/G23VsHeYav/AP+uaqnI+Tbtv4o8tc9ecpFgCXo4pMJaW/
osUOJC3MmU68yRAo91uwhhSHWMsl+8tceMdoS9N1Q6faM2v9XcJrMvKHMNpDvPpf
VXvC18fTX/qi/loJ6oKepTXn6lNVKDIgKmCjjudgwfRHPiw9R5uMlunaOoHO1wnH
T8ttdg/Dn94BLOOmHlOoW9TT7LVFLk7w2y5NoefUY0wUKSjByMxzyzqmdI+qXZU=
=SSaw
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2018-11-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-11-19
The following fixes are for mlx5 core and netdev driver.
For -stable v4.16
bc7fda7d4637 ('net/mlx5e: IPoIB, Reset QP after channels are closed')
For -stable v4.17
36917a270395 ('net/mlx5: IPSec, Fix the SA context hash key')
For -stable v4.18
6492a432be3a ('net/mlx5e: Always use the match level enum when parsing TC rule match')
c3f81be236b1 ('net/mlx5e: Removed unnecessary warnings in FEC caps query')
c5ce2e736b64 ('net/mlx5e: Fix selftest for small MTUs')
For -stable v4.19
effcd896b25e ('net/mlx5e: Adjust to max number of channles when re-attaching')
394cbc5acd68 ('net/mlx5e: RX, verify received packet size in Linear Striding RQ')
447cbb3613c8 ('net/mlx5e: Don't match on vlan non-existence if ethertype is wildcarded')
c223c1574612 ('net/mlx5e: Claim TC hw offloads support only under a proper build config')
Please pull and let me know if there's any problem.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Querying interface FEC caps with 'ethtool [int]' after link reset
throws warning regading link speed.
This warning is not needed as there is already an indication in
user space that the link is not up.
Fixes: 0696d60853 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This bug would result in reading wrong FEC capabilities for 10G/40G.
Fixes: 2095b26414 ("net/mlx5e: Add port FEC get/set functions")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Some speeds don't support turning FEC policy off. In case a requested
FEC policy is not supported for a speed (including current speed), its new
FEC policy would be:
no FEC - if disabling FEC is supported for that speed
unchanged - else
Fixes: 2095b26414 ("net/mlx5e: Add port FEC get/set functions")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Loopback test had fixed packet size, which can be bigger than configured
MTU. Shorten the loopback packet size to be bigger than minimal MTU
allowed by the device. Text field removed from struct 'mlx5ehdr'
as redundant to allow send small packets as minimal allowed MTU.
Fixes: d605d66 ("net/mlx5e: Add support for ethtool self diagnostics test")
Signed-off-by: Valentine Fatiev <valentinef@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In case of striding RQ, we use MPWRQ (Multi Packet WQE RQ), which means
that WQE (RX descriptor) can be used for many packets and so the WQE is
much bigger than MTU. In virtualization setups where the port mtu can
be larger than the vf mtu, if received packet is bigger than MTU, it
won't be dropped by HW on too small receive WQE. If we use linear SKB in
striding RQ, since each stride has room for mtu size payload and skb
info, an oversized packet can lead to crash for crossing allocated page
boundary upon the call to build_skb. So driver needs to check packet
size and drop it.
Introduce new SW rx counter, rx_oversize_pkts_sw_drop, which counts the
number of packets dropped by the driver for being too large.
As a new field is added to the RQ struct, re-open the channels whenever
this field is being used in datapath (i.e., in the case of linear
Striding RQ).
Fixes: 619a8f2a42 ("net/mlx5e: Use linear SKB in Striding RQ")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The mirror and not the output count is the one denoting a split.
Fix to condition the offload attempt on the mirror count being > 0
along the firmware to have the related capability.
Fixes: 592d365159 ("net/mlx5e: Parse mirroring action for offloaded TC eswitch flows")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Yossi Kuperman <yossiku@mellanox.com>
Reviewed-by: Chris Mi <chrism@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When core driver enters deattach/attach flow after pci reset,
Number of logical CPUs may have changed.
As a result we need to update the cpu affiliated resource tables.
1. indirect rqt list
2. eq table
Reproduction (PowerPC):
echo 1000 > /sys/kernel/debug/powerpc/eeh_max_freezes
ppc64_cpu --smt=on
# Restart driver
modprobe -r ... ; modprobe ...
# Link up
ifconfig ...
# Only physical CPUs
ppc64_cpu --smt=off
# Inject PCI errors so PCI will reset - calling the pci error handler
echo 0x8000000000000000 > /sys/kernel/debug/powerpc/<PCI BUS>/err_injct_inboundA
Call trace when trying to add non-existing rqs to an indirect rqt:
mlx5e_redirect_rqt+0x84/0x260 [mlx5_core] (unreliable)
mlx5e_redirect_rqts+0x188/0x190 [mlx5_core]
mlx5e_activate_priv_channels+0x488/0x570 [mlx5_core]
mlx5e_open_locked+0xbc/0x140 [mlx5_core]
mlx5e_open+0x50/0x130 [mlx5_core]
mlx5e_nic_enable+0x174/0x1b0 [mlx5_core]
mlx5e_attach_netdev+0x154/0x290 [mlx5_core]
mlx5e_attach+0x88/0xd0 [mlx5_core]
mlx5_attach_device+0x168/0x1e0 [mlx5_core]
mlx5_load_one+0x1140/0x1210 [mlx5_core]
mlx5_pci_resume+0x6c/0xf0 [mlx5_core]
Create cq will fail when trying to use non-existing EQ.
Fixes: 89d44f0a6c ("net/mlx5_core: Add pci error handlers to mlx5_core driver")
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We get the match level (none, l2, l3, l4) while going over the match
dissectors of an offloaded tc rule. When doing this, the match level
enum and the not min inline enum values should be used, fix that.
This worked accidentally b/c both enums have the same numerical values.
Fixes: d708f90298 ('net/mlx5e: Get the required HW match level while parsing TC flow matches')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently, we are only supporting tc hw offloads when the eswitch
support is compiled in, but we are not gating the adevertizment
of the NETIF_F_HW_TC feature on this config being set.
Fix it, and while doing that, also avoid dealing with the feature
on ethtool when the config is not set.
Fixes: e8f887ac6a ('net/mlx5e: Introduce tc offload support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
For the "all" ethertype we should not care whether the packet has
vlans. Besides being wrong, the way we did it caused FW error
for rules such as:
tc filter add dev eth0 protocol all parent ffff: \
prio 1 flower skip_sw action drop
b/c the matching meta-data (outer headers bit in struct mlx5_flow_spec)
wasn't set. Fix that by matching on vlan non-existence only if we were
also told to match on the ethertype.
Fixes: cee2648762 ('net/mlx5e: Set vlan masks for all offloaded TC rules')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Slava Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The mlx5e channels should be closed before mlx5i_uninit_underlay_qp
puts the QP into RST (reset) state during mlx5i_close. Currently QP
state incorrectly set to RST before channels got deactivated and closed,
since mlx5_post_send request expects QP in RTS (Ready To Send) state.
The fix is to keep QP in RTS state until mlx5e channels get closed
and to reset QP afterwards.
Also this fix is simply correct in order to keep the open/close flow
symmetric, i.e mlx5i_init_underlay_qp() is called first thing at open,
the correct thing to do is to call mlx5i_uninit_underlay_qp() last thing
at close, which is exactly what this patch is doing.
Fixes: dae37456c8 ("net/mlx5: Support for attaching multiple underlay QPs to root flow table")
Signed-off-by: Denis Drozdov <denisd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The commit "net/mlx5: Refactor accel IPSec code" introduced a
bug where asynchronous short time change in hash key value
by create/release SA context might happen during an asynchronous
hash resize operation this could cause a subsequent remove SA
context operation to fail as the key value used during resize is
not the same key value used when remove SA context operation is
invoked.
This commit fixes the bug by defining the SA context hash key
such that it includes only fields that never change during the
lifetime of the SA context object.
Fixes: d6c4f0298c ("net/mlx5: Refactor accel IPSec code")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Expose packets discard counters via ethtool to help with debugging.
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
UBSAN: Undefined behavior in
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:626:29
signed integer overflow: 1802201963 + 1802201963 cannot be represented
in type 'int'
The union of res_reserved and res_port_rsvd[MLX4_MAX_PORTS] monitors
granting of reserved resources. The grant operation is calculated and
protected, thus both members of the union cannot be negative. Changed
type of res_reserved and of res_port_rsvd[MLX4_MAX_PORTS] from signed
int to unsigned int, allowing large value.
Fixes: 5a0d0a6161 ("mlx4: Structures and init/teardown for VF resource quotas")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initialize the uid variable to zero to avoid the compilation warning.
Fixes: 7a89399ffa ("net/mlx4: Add mlx4_bitmap zone allocator")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When re-registering a user mr, the mpt information for the
existing mr when running SRIOV is obtained via the QUERY_MPT
fw command. The returned information includes the mpt's lkey.
This retrieved mpt information is used to move the mpt back
to hardware ownership in the rereg flow (via the SW2HW_MPT
fw command when running SRIOV).
The fw API spec states that for SW2HW_MPT, the lkey field
must be zero. Any ConnectX-3 PF driver which checks for strict spec
adherence will return failure for SW2HW_MPT if the lkey field is not
zero (although the fw in practice ignores this field for SW2HW_MPT).
Thus, in order to conform to the fw API spec, set the lkey field to zero
before invoking SW2HW_MPT when running SRIOV.
Fixes: e630664c83 ("mlx4_core: Add helper functions to support MR re-registration")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow ERP sharing for multiple mask. Do it by properly implementing
delta_create() objagg object. Use the computed delta info for inserting
rules in A-TCAM.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Later on the same code is going to be needed for deltas as well. So push
the procedures related to increment and decrement of num_ctcam_erps
into a separate helpers.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since two remaining users of mlxsw_afk_encode() do not specify
block ranges to work on, remove the args. Also, key/mask is always
non-NULL now, so skip the checks.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to do key encoding again in
mlxsw_sp_acl_atcam_12kb_lkey_id_get(). Instead of that, introduce
a new helper that would just clear unused blocks.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change order so it is aligned with the usual case where the "write_to"
buffer comes as the first arg.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device requires that the master mask of each region will be
composed from a logical OR between all the unmasked bits in the region.
Currently, this is just a logical OR between all the eRPs used in the
region, but the next patch is going to introduce delta bits support
which need to be taken into account as well.
Since the eRP does not include the delta bits, pass the key pointer to
mlxsw_sp_acl_erp_master_mask_set/clear instead. Convert key->mask to
the bitmap on fly.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the ERPs are tracked internally in a hashtable. Benefit from
the newly introduced objagg library and use it to track ERPs. At this
point, there is no nesting of objects done, as the delta_create callback
always returns -EOPNOTSUPP. On the way, add "mask" into ERP mask get and
set functions and struct names.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The CPU policer used to police packets being trapped via a local route
(IP2ME) was incorrectly configured to police based on bytes per second
instead of packets per second.
Change the policer to police based on packets per second and avoid
packet loss under certain circumstances.
Fixes: 9148e7cf73 ("mlxsw: spectrum: Add policers for trap groups")
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>