Add a trap for packets that the device decided to drop because they are
not supposed to be routed. For example, IGMP queries can be flooded by
the device in layer 2 and reach the router. Such packets should not be
routed and instead dropped.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IRIF_DISABLED and ERIF_DISABLED are driver specific traps. Packets are
dropped for these reasons when they need to be routed through/from
existing router interfaces (RIF) which are disabled.
Add devlink driver-specific traps and mlxsw trap IDs used to report
these traps.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox, mlx5 E-Switch chains and prios
This series has two parts,
1) A merge commit with mlx5-next branch that include updates for mlx5
HW layouts needed for this and upcoming submissions.
2) From Paul, Increase the number of chains and prios
Currently the Mellanox driver supports offloading tc rules that
are defined on the first 4 chains and the first 16 priorities.
The restriction stems from the firmware flow level enforcement
requiring a flow table of a certain level to point to a flow
table of a higher level. This limitation may be ignored by setting
the ignore_flow_level bit when creating flow table entries.
Use unmanaged tables and ignore flow level to create more tables than
declared by fs_core steering. Manually manage the connections between the
tables themselves.
HW table is instantiated for every tc <chain,prio> tuple. The miss rule
of every table either jumps to the next <chain,prio> table, or continues
to slow_fdb. This logic is realized by following this sequence:
1. Create an auto-grouped flow table for the specified priority with
reserved entries
Reserved entries are allocated at the end of the flow table.
Flow groups are evaluated in sequence and therefore it is guaranteed
that the flow group defined on the last FTEs will be the last to evaluate.
Define a "match all" flow group on the reserved entries, providing
the platform to add table miss actions.
2. Set the miss rule action to jump to the next <chain,prio> table
or the slow_fdb.
3. Link the previous priority table to point to the new table by
updating its miss rule.
Please pull and let me know if there's any problem.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The variable xmit_done is being initialized with a value that is never
read and it is being updated later with a new value. The initialization
is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The scope of function .ndo_tx_timeout was changed to include the hang
queue when a TX timeout event occurs. See commit 0290bd291c
("netdev: pass the stuck queue to the timeout handler") for more
details. Now, drivers don't need to identify which queue is stopped.
Drivers can simply use the queue index provided by dev_watchdog and
execute all actions needed to restore network traffic. This commit do
some cleanups into Intel ice driver to remove a redundant loop to find
stopped queue.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The scope of function .ndo_tx_timeout was changed to include the hang
queue when a TX timeout event occurs. See commit 0290bd291c
("netdev: pass the stuck queue to the timeout handler") for more
details. Now, drivers don't need to identify which queue is stopped.
Drivers can simply use the queue index provided by dev_watchdog and
execute all actions needed to restore network traffic. This commit do
some cleanups into Intel i40e driver to remove a redundant loop to find
stopped queue.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Make use of the new txqueue parameter to the .ndo_tx_timeout function.
In fm10k_tx_timeout, remove the now unnecessary loop to determine which
Tx queue is stuck. Instead, just double check the specified queue
This could be improved further to attempt resetting only the specific
queue that got stuck. However, that is a much larger refactor and has
been left as a future improvement.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
PHY power management control should provide a reliable and accurate
indication of PHY reset completion and decrease the delay time
after a PHY reset
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
TCP segmentation offload allows a device to segment a single frame
into multiple frames with a data payload size specified in socket buffer.
As a result we can now send data approximately up to seven percents fast
than was previously possible on my system.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for blank NVM SKU
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove the unused IGC_FUNC_0 definition and make the code cleaner
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix typo in a context descriptor comment
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It is not an error to unplug a cable from the ENETC port even with TSN
offloads, so don't spam the log with link-related messages from the
tc-taprio offload subsystem, a single notification is sufficient:
[10972.351859] fsl_enetc 0000:00:00.0 eno0: Qbv PSPEED set speed link down.
[10972.360241] fsl_enetc 0000:00:00.0 eno0: Link is Down
Fixes: 2e47cb415f ("enetc: update TSN Qbv PSPEED set according to adjust link speed")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Increase the number of chains and priorities to support
the whole range available in tc.
We use unmanaged tables and ignore flow level to create more
tables than what we declared to fs_core steering, and we manage
the connections between the tables themselves.
To support that we need FW with ignore_flow_level capability.
Otherwise the old behaviour will be used, where we are limited
by the number of levels we declared (4 chains, 16 prios).
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
To support the entire chain and prio range (32bit + 16bit),
instead of a using a static array of chains/prios of limited size, create
them dynamically, and use a rhashtable to search for existing chains/prio
combinations.
This will be used in next patch to actually increase the number using
unamanged tables support and ignore flow level capability.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Before changing the chain from original chain to ft offload chain,
make sure user doesn't actually use chains.
While here, normalize the prio range to that which we support.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
FT chain is defined as the next chain after tc.
To prepare for next patches that will increase the number of tc
chains available at runtime, use a getter function to get this
value.
The define is still used in static fs_core allocation,
to calculate the number of chains. This static allocation
will be used if the relevant capabilities won't be available
to support dynamic chains.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Exclude the last n entries for an autogrouped flow table.
Reserving entries at the end of the FT will ensure that this FG will be
the last to be evaluated. This will be used in the next patch to create
a miss group enabling custom actions on FT miss.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
If user sets ignore flow level flag on a rule, that rule can point to
a flow table of any level, including those with levels equal or less
than the level of the flow table it is added on.
This with unamanged tables will be used to create a FDB chain/prio
hierarchy much larger than currently supported level range.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently, Most of the steering tree is statically declared ahead of time,
with steering prios instances allocated for each fdb chain to assign max
number of levels for each of them. This allows fs_core to manage the
connections and levels of the flow tables hierarcy to prevent loops, but
restricts us with the number of supported chains and priorities.
Introduce unmananged flow tables, allowing the user to manage the flow
table connections. A unamanged table is detached from the fs_core flow
table hierarcy, and is only connected back to the hierarchy by explicit
FTEs forward actions.
This will be used together with firmware that supports ignoring the flow
table levels to increase the number of supported chains and prios.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This merge syncs with mlx5-next latest HW bits and layout updates for next
features, in addition one patch that improves
mlx5_create_auto_grouped_flow_table() API across all mlx5 users.
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: Refactor mlx5_create_auto_grouped_flow_table
net/mlx5e: Add discard counters per priority
net/mlx5e: Expose FEC feilds and related capability bit
net/mlx5: Add mlx5_ifc definitions for connection tracking support
net/mlx5: Add copy header action struct layout
net/mlx5: Expose resource dump register mapping
net/mlx5: Add structures and defines for MIRC register
net/mlx5: Read MCAM register groups 1 and 2
net/mlx5: Add structures layout for new MCAM access reg groups
net/mlx5: Expose vDPA emulation device capabilities
net/mlx5: Add Virtio Emulation related device capabilities
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Refactor mlx5_create_auto_grouped_flow_table() to use ft_attr param
which already carries the max_fte, prio and flags memebers, and is
used the same in similar mlx5_create_flow_table() function.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add counters that count (per priority) the number of received
packets that dropped due to lack of buffers on a physical port. If
this counter is increasing, it implies that the adapter is
congested and cannot absorb the traffic coming from the network.
Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
On load, Driver caches MCAM (Management Capabilities Mask Register)
registers. in addition to the only MCAM register group (0) the driver
already reads, here we add support for reading groups 1 and 2.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Previous patches added support for two hardware flags for IPv4 and IPv6
routes: 'RTM_F_OFFLOAD' and 'RTM_F_TRAP'. Both indicate the presence of
the route in hardware. The first indicates that traffic is actually
offloaded from the kernel, whereas the second indicates that packets
hitting such routes are trapped to the kernel for processing (e.g., host
routes).
Use these two flags in mlxsw. The flags are modified in two places.
Firstly, whenever a route is updated in the device's table. This
includes the addition, deletion or update of a route. For example, when
a host route is promoted to perform NVE decapsulation, its action in the
device is updated, the 'RTM_F_OFFLOAD' flag set and the 'RTM_F_TRAP'
flag cleared.
Secondly, when a route is replaced and overwritten by another route, its
flags are cleared.
v2:
* Convert to new fib_alias_hw_flags_set() interface
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver currently uses the 'RTNH_F_OFFLOAD' flag for both routes and
nexthops, which is cumbersome and unnecessary now that we have separate
flag for the route itself.
Separate the offload indication for nexthops from routes and call it
whenever the offload state within the nexthop group changes.
Note that IPv6 (unlike IPv4) does not share the same nexthop group
between different routes, whereas mlxsw does. Therefore, whenever the
offload indication within an IPv6 nexthop group changes, all the linked
routes need to be updated.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Socionext driver can run on dma coherent and non-coherent devices.
Get rid of huge dma_sync_single_for_device in netsec_alloc_rx_data since
now the driver can let page_pool API to managed needed DMA sync
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new test for TBS feature which is used in ETF scheduler. In this
test, we send a packet with a launch time specified as now + 500ms and
check if the packet was transmitted on that time frame.
Changes from v2:
- Use the TBS bitfield
- Remove debug message
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the upcoming commit for TBS selftest we will need to send a packet on
a specific Queue. As stmmac fallsback to netdev_pick_tx() on the select
Queue callback, we need to switch all selftests logic to
dev_direct_xmit() so that we can send the given SKB on a specific Queue.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Adds more information regarding HW Capabilities in the corresponding
DebugFS file.
Changes from v2:
- Remove the TX/RX queues in use (Jakub)
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Enable TBS support on GMAC5 PCI entry for all Queues except Queue 0.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Adds all the necessary HW hooks to support TBS feature in QoS cores.
Changes from v1:
- Remove unneeded LT shift as the IP already does this.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Adds all the necessary HW hooks to support TBS feature in XGMAC cores.
Changes from v1:
- Remove unneeded LT shift as the IP already does this.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Adds the support for ETF scheduler using TBS feature which is available
in XGMAC and QoS IPs.
Changes from v2:
- Fix checkpatch issues (Jakub)
- Use the TBS bitfield
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Adds the initial hooks for TBS support. This needs a 32 byte descriptor
in order for it to work with current HW. Adds all the logic for Enhanced
Descriptors in main core but no HW related logic for now.
Changes from v2:
- Use bitfield for TBS status / support (Jakub)
- Remove unneeded cache alignment (Jakub)
- Fix checkpatch issues
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With 'commit 44768decb7 ("page_pool: handle page recycle for NUMA_NO_NODE
condition")' we can safely change nid to NUMA_NO_NODE and accommodate
future NUMA aware hardware using mvneta network interface
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix sparse warning:
drivers/net/ethernet/sfc/ethtool_common.c
warning: symbol 'efx_fill_test' was not declared. Should it be static?
warning: symbol 'efx_fill_loopback_test' was not declared.
Should it be static?
warning: symbol 'efx_describe_per_queue_stats' was not declared.
Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use the print_hex_dump_debug() helper, instead of open-coding the same
operations.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remove unused fields, copied from the Sun LANCE driver eons ago.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use the netdevice struct device .parent field when calling
dma_pool_create(): the .dma_coherent_mask and .dma_mask
pertains to the bus device on the hardware (platform)
bus in this case, not the struct device inside the network
device. This makes the pool allocation work.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In order to probe this ethernet interface from the device tree
all physical MMIO regions must be passed as resources. Begin
this rewrite by first passing the port base address as a
resource for all platforms using this driver, remap it in
the driver and avoid using any reference of the statically
mapped virtual address in the driver.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Simplify and correct a bunch of messages using printk
directly to use the netdev_* macros. I have not changed
all of them, just the low-hanging fruit.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use "ndev" for the struct net_device and "dev" for the
struct device in probe() and remove(). Add the local
"dev" pointer for later use in refactoring.
Take this opportunity to fix inverse christmas tree
coding style.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The IXP4xx driver was initializing the MDIO bus before even
probing, in the callbacks supposed to be used for setting up
the module itself, and with the side effect of trying to
register the MDIO bus as soon as this module was loaded or
compiled into the kernel whether the device was discovered
or not.
This does not work with multiplatform environments.
To get rid of this: set up the MDIO bus from the probe()
callback and remove it in the remove() callback. Rename
the probe() and remove() calls to reflect the most common
conventions.
Since there is a bit of checking for the ethernet feature
to be present in the MDIO registering function, making the
whole module not even be registered if we can't find an
MDIO bus, we need something similar: register the MDIO
bus when the corresponding ethernet is probed, and
return -EPROBE_DEFER on the other interfaces until this
happens. If no MDIO bus is present on any of the
registered interfaces we will eventually bail out.
None of the platforms I've seen has e.g. MDIO on EthB
and only uses EthC, there is always a Ethernet hardware
on the NPE (B, C) that has the MDIO bus, we just might
have to wait for it.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The platform data is needed to compile the driver as standalone,
so move it to a global location along with similar files.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The ixp46x ptp driver has a somewhat unusual setup, where the ptp
driver and the ethernet driver are in different directories but
access the same registers that are defined a platform specific
header file.
Moving everything into drivers/net/ makes it look more like most
other ptp drivers and allows compile-testing this driver on
other targets.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
hclge_reset_prepare_down() is only used to inform VF that PF is
going to do function reset, then using hclge_func_reset_sync_vf()
in hclge_reset_prepare_wait() to query whether VF is ready before
asserting PF function reset. To make the code more readable,
this patch uses a new function hclge_function_reset_notify_vf()
to do this job.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When synchronizes with VFs fail before PF function reset,
PF driver should go on its function reset, otherwise it
can not run normally anymore. So, hclge_func_reset_sync_vf()
should not affect the processing of PF reset, this patch
modifies its return type to void.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>