Having a separate "slow event" handler isn't needed because all
interrupt events trigger asynchronous activity. And in case of SYSErr
we have bigger problems than performance anyway.
This patch also allows us to get rid of acking interrupt events in the
NAPI poll callback.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial non-functional change added to simplify getting multiple
references to the device pointer in lpc_eth_drv_probe().
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A trivial change which removes an unused local variable; the issue
is reported as a compile-time warning:
drivers/net/ethernet/nxp/lpc_eth.c: In function 'lpc_eth_drv_probe':
drivers/net/ethernet/nxp/lpc_eth.c:1250:21: warning: variable 'phydev' set but not used [-Wunused-but-set-variable]
struct phy_device *phydev;
^~~~~~
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MAC controller device is available on NXP LPC32xx platform only,
and the LPC32xx platform supports OF builds only, so additional
checks in the device driver are not needed.
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The change removes all unnecessary included headers from the driver
source code, the remaining list is sorted in alphabetical order.
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/sched/cls_api.c has overlapping changes to a call to
nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL
to the 5th argument, and another (from 'net-next') added cb->extack
instead of NULL to the 6th argument.
net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to
code which moved (to mr_table_dump) in 'net-next'. Thanks to David
Ahern for the heads up.
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to make the 'struct rocker_desc_info *desc_info'
variable static, since a new value is always assigned to it before use.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix to spelling mistake in DP_INFO message.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'mlx5-updates-2018-10-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2018-10-18
This series provides misc updates to mlx5 core and netdevice driver.
1) From Tariq Toukan: Refactor fragmented buffer struct fields and init flow.
2) From Vlad Buslov: flow counters cache improvements and fixes follow-up.
As a follow-up to the previous mlx5 flow counters series,
Vlad provides two fixes:
2.1) Take fs_counters dellist before addlist
Fixes: 6e5e228391 ("net/mlx5: Add new list to store deleted flow counters")
2.2) Remove counter from idr after removing it from list
Fixes: 12d6066c3b ("net/mlx5: Add flow counters idr")
From Shay Agroskin,
3) Add FEC set/get FW commands and FEC ethtool callbacks support
4) Add new ethtool statistics to cover errors on rx, such as FEC errors.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds checksum offload and TSO support for the HiNIC
driver. A performance test (iperf) shows more than 100% improvement
in TCP streams.
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On a multi-adapter setup, if the ULD registration fails even on
one adapter, the resources allocated for the ULD on all the
adapters are freed, rendering the functioning adapters unusable.
This commit fixes the issue by freeing the allocated resources
only for the failed adapter.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The struct type was copied from the line before but it should be "tx"
instead of "rx". I have reviewed the code and I can't immediately see
that this bug causes a runtime issue.
Fixes: 36e53349b6 ("bnxt_en: Add additional extended port statistics.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These are counters for errors received on rx side, such as
FEC errors.
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Driver callback function for 'ethtool --show-fec',
'ethtool --set-fec' commands.
The query function returns active and configured FEC policy
for current link speed.
The set function sets FEC policy for all supported link
speeds.
1) If the current link speed doesn't support the requested FEC policy,
the function fails.
2) If a different link speed doesn't support the requested FEC policy,
FEC capabilities for this speed are turned off.
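For orientation, this is roughly the shape of such a pair of callbacks in
the ethtool API. struct ethtool_fecparam, the ETHTOOL_FEC_* flags and the
.get_fecparam/.set_fecparam hooks are the real kernel interface; the
function and helper names below are made up for the sketch, not the
mlx5e ones.

#include <linux/errno.h>
#include <linux/ethtool.h>
#include <linux/netdevice.h>

/* Placeholder helpers standing in for the driver's FW queries. */
static u32 sketch_fec_caps_for_current_speed(struct net_device *dev)
{
        return ETHTOOL_FEC_OFF | ETHTOOL_FEC_RS;
}

static int sketch_fec_apply_policy(struct net_device *dev, u32 fec)
{
        return 0;       /* would program the new policy for all speeds here */
}

static int sketch_get_fecparam(struct net_device *dev,
                               struct ethtool_fecparam *fec)
{
        /* Report the configured policy and the one active on the link. */
        fec->fec = ETHTOOL_FEC_AUTO;
        fec->active_fec = ETHTOOL_FEC_RS;
        return 0;
}

static int sketch_set_fecparam(struct net_device *dev,
                               struct ethtool_fecparam *fec)
{
        /* Fail if the current link speed can't do the requested policy. */
        if (!(sketch_fec_caps_for_current_speed(dev) & fec->fec))
                return -EOPNOTSUPP;

        /* Otherwise apply it to all supported link speeds. */
        return sketch_fec_apply_policy(dev, fec->fec);
}

static const struct ethtool_ops sketch_ethtool_ops = {
        .get_fecparam   = sketch_get_fecparam,
        .set_fecparam   = sketch_set_fecparam,
};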
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Added functions to query and set link FEC policy.
To get/set FEC capabilities in PPLM reg we need to query
current link speed.
'mlx5_get_fec_speed_field' queries current link speed and returns
correct field offset.
FEC Query's return value is divided into 'active FEC policy', which is
the FEC policy used by the link, and 'configured FEC policy', which
is the FEC policy requested by the user.
The two values may differ if:
1) FEC policy was configured to 'auto',
in which case the active FEC policy would be the default FEC policy
for current link speed.
2) FEC policy was changed, but no link reset is performed. In which case,
the active FEC policy would become the configured one after a link
reset.
FEC set function sets FEC policy for all link speeds and performs a link
reset.
1) If the current link speed doesn't support the requested FEC policy,
the function fails.
2) If a different link speed doesn't support the requested FEC policy,
FEC capabilities for this speed are turned off and a warning message
is printed.
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The fs_counters list can temporarily become unsorted when new counters are
created/deleted concurrently. An idr is used to look up, in logarithmic time,
the position at which to insert a new counter. However, if new flows are
concurrently inserted during the time window when flows with adjacent ids have
already been removed from the idr but are still present in the counters list,
mlx5_fc_stats_work() observes the counters list in an inconsistent state, which
results in the following warning:
[ 1839.561955] mlx5_core 0000:81:00.0: mlx5_cmd_fc_bulk_get:587:(pid 729): Flow counter id (0x102d5) out of range (0x1c0a8..0x1c10b). Counter ignored.
Move idr_remove() call to be executed synchronously with counter deletion
from list. Extract this code to mlx5_fc_stats_remove() helper function that
is called by workqueue job handler mlx5_fc_stats_work().
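A minimal sketch of that ordering, using placeholder struct and field
names rather than the actual mlx5 ones:

#include <linux/idr.h>
#include <linux/list.h>
#include <linux/spinlock.h>

struct fc_counter {
        struct list_head list;
        u32 id;
};

struct fc_stats {
        struct list_head counters;      /* kept sorted by id */
        struct idr idr;                 /* id -> position lookup for inserts */
        spinlock_t idr_lock;
};

/* Called from the workqueue job handler: drop the counter from the sorted
 * list and from the idr in one step, so a concurrent insert can never look
 * up an id whose counter is already gone from the list. */
static void fc_stats_remove(struct fc_stats *stats, struct fc_counter *counter)
{
        list_del(&counter->list);

        spin_lock(&stats->idr_lock);
        idr_remove(&stats->idr, counter->id);
        spin_unlock(&stats->idr_lock);
}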
Fixes: 12d6066c3b ("net/mlx5: Add flow counters idr")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
In fs_counters, elements from both addlist and dellist are removed by
mlx5_fc_stats_work() without any locking. This introduces a race condition
when a batch of new rules is created and then immediately deleted (for
example, when an error occurred during flow creation). In such a case some
of the rules might be in dellist, but not in addlist, when
mlx5_fc_stats_work() is executed concurrently with tc, which results in
rule deletion and a use-after-free on the next iteration because the
deleted rules are still in addlist.
Always take dellist first to guarantee that rules can only be deleted after
they were removed from addlist.
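Sketched with the kernel's llist API (struct and field names are
placeholders, not the mlx5 ones):

#include <linux/llist.h>

struct fc_stats {
        struct llist_head addlist;      /* counters pending addition */
        struct llist_head dellist;      /* counters pending deletion */
};

static void fc_stats_work(struct fc_stats *stats)
{
        struct llist_node *dels, *adds;

        /* Snapshot dellist before addlist: any counter captured in dels had
         * its addition published to addlist earlier, so that addition is
         * guaranteed to be part of adds (or of an earlier iteration) and is
         * therefore processed before the counter is freed below. */
        dels = llist_del_all(&stats->dellist);
        adds = llist_del_all(&stats->addlist);

        /* ... add everything in 'adds' to the main counters list ... */
        /* ... then tear down and free everything in 'dels' ... */
}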
Fixes: 6e5e228391 ("net/mlx5: Add new list to store deleted flow counters")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reported-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Take struct mlx5_frag_buf out of mlx5_frag_buf_ctrl, as it is not
needed to manage and control the datapath of the fragmented buffers API.
struct mlx5_frag_buf contains control info to manage the allocation
and de-allocation of the fragmented buffer.
Its fields are not relevant for datapath, so here I take them out of the
struct mlx5_frag_buf_ctrl, except for the fragments array itself.
In addition, mlx5_fill_fbc was modified to initialise the frags pointers
as well. This implies that the buffer must be allocated before the
function is called.
A set of type-specific *_get_byte_size() functions are replaced by
a generic one.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Now that the documents have been updated to conform to the reStructured Text
guidelines, we can change the file extensions and update the other
related references.
This converts all of the Intel wired LAN driver documentation to *.rst.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Added the fm10k kernel documentation, which apparently was missing.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
rtl_rx() and rtl_tx() are called only if the respective bits are set
in the interrupt status register. Under high load NAPI may not be
able to process all data (work_done == budget) and it will schedule
subsequent calls to the poll callback.
rtl_ack_events() however resets the bits in the interrupt status
register, therefore subsequent calls to rtl8169_poll() won't call
rtl_rx() and rtl_tx() - chip interrupts are still disabled.
Fix this by calling rtl_rx() and rtl_tx() independent of the bits
set in the interrupt status register. Both functions will detect
if there's nothing to do for them.
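A condensed sketch of a poll callback following this approach (placeholder
names, not the literal r8169 code):

#include <linux/netdevice.h>

struct sketch_priv {
        struct napi_struct napi;
        /* ... device state ... */
};

/* Hypothetical helpers: both return early if there is nothing to do. */
static int sketch_rx(struct sketch_priv *tp, int budget) { return 0; }
static void sketch_tx(struct sketch_priv *tp) { }
static void sketch_irq_enable(struct sketch_priv *tp) { }

static int sketch_poll(struct napi_struct *napi, int budget)
{
        struct sketch_priv *tp = container_of(napi, struct sketch_priv, napi);
        int work_done;

        /* Run rx/tx processing unconditionally; each helper detects on its
         * own whether there is work, so an already-acked status register can
         * no longer starve a rescheduled poll. */
        work_done = sketch_rx(tp, budget);
        sketch_tx(tp);

        if (work_done < budget && napi_complete_done(napi, work_done))
                sketch_irq_enable(tp);

        return work_done;
}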
Fixes: da78dbff2e ("r8169: remove work from irq handler.")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
1GbE Intel Wired LAN Driver Updates 2018-10-17
This series adds support for the new igc driver.
The igc driver is the new client driver supporting the Intel I225
Ethernet Controller, which supports 2.5GbE speeds. The reason for
creating a new client driver, instead of adding support for the new
device in e1000e, is that the silicon behaves more like the devices
supported by the igb driver. It also did not make sense to add a client
part to the igb driver, which supports only 1GbE server parts.
This initial set of patches is designed for basic support (i.e. link and
pass traffic). Follow-on patch series will add more advanced support
like VLAN, Wake-on-LAN, etc..
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'mlx5-updates-2018-10-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
mlx5-updates-2018-10-17
========================================================================
From Or Gerlitz <ogerlitz@mellanox.com>:
This series from Paul adds support to mlx5 e-switch tc offloading for multiple priorities and chains.
This is made of four building blocks (along with a few minor driver refactors):
[1] Split FDB fast path prio to multiple namespaces
Currently the FDB name-space contains two priorities, fast path (p0) and slow path (p1).
The slow path contains the per representor SQ send-to-vport TX rule and the match-all
RX miss rule. As a pre-step to support multi-chains and priorities, we split the FDB fast path
to multiple namespaces (sub namespaces), each with multiple priorities.
[2] E-Switch chains and priorities
A chain is a group of priorities. We use the fdb parallel sub-namespaces to implement chains,
and a flow table for each priority in them.
Because these namespaces are parallel and in series to the slow path
fdb, the chains aren't connected to each other (but to the slow path),
and one must use an explicit goto action to reach a different chain.
Flow tables for the priorities are created on demand and destroyed
once not used.
[3] Add a no-append flow insertion mode, use it for TC offloads
Enhance the driver fs core, such that if a no-append flag is set by the caller,
we add a new FTE, instead of appending the actions of the inserted rule when
the same match already exists.
For encap rules, we defer the HW offloading till we have a valid neighbor. This can
result in the packet hitting a lower priority rule in the HW DP. Use the no-append API
to push these packets to the slow path FDB table, so they go to the TC kernel DP as done
before priorities were supported.
[4] Offloading tc priorities and chains for eswitch flows
Using [1], [2] and [3] above we add the support for offloading both chains
and priorities. To get to a new chain, use the tc goto action. We support
a fixed prio range 1-16, and chains 0-3.
=============================================================================
Signed-off-by: David S. Miller <davem@davemloft.net>
The ocelot_vlant_wait_for_completion() function is very similar to
ocelot_mact_wait_for_completion(). It seems to have been copied from it,
but the comment was not updated, so let's fix it.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Kconfig limitation to X86 is too wide.
The ENA driver only requires a little-endian dependency.
Change the dependency to be on little-endian CPUs.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the switch driver (e.g., mlxsw_spectrum) determines it needs to
flash a new firmware version it resets the ASIC after the flashing
process. The bus driver (e.g., mlxsw_pci) then registers itself again
with mlxsw_core, which means (among other things) that the device
registers itself again with the hwmon subsystem.
Since the device was registered with the hwmon subsystem using
devm_hwmon_device_register_with_groups(), the old hwmon device
(registered before the flashing) was never unregistered and was
referencing stale data, resulting in a use-after-free.
Fix by removing reliance on device managed APIs in mlxsw_hwmon_init().
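One way to drop that reliance is to pair the plain registration with an
explicit unregister in the fini path; a sketch with placeholder struct
names (hwmon_device_register_with_groups() and hwmon_device_unregister()
are the real hwmon API):

#include <linux/err.h>
#include <linux/hwmon.h>

struct sketch_hwmon {
        struct device *hwmon_dev;
        const struct attribute_group *groups[2];
};

static int sketch_hwmon_init(struct device *bus_dev, struct sketch_hwmon *mh)
{
        /* Plain (non-devm) registration: the hwmon device now lives exactly
         * from init to fini, so re-registering the bus after a FW flash
         * creates a fresh hwmon device instead of leaving a stale one. */
        mh->hwmon_dev = hwmon_device_register_with_groups(bus_dev, "sketch",
                                                          mh, mh->groups);
        return PTR_ERR_OR_ZERO(mh->hwmon_dev);
}

static void sketch_hwmon_fini(struct sketch_hwmon *mh)
{
        hwmon_device_unregister(mh->hwmon_dev);
}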
Fixes: c86d62cc41 ("mlxsw: spectrum: Reset FW after flash")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
Tested-by: Alexander Petrovskiy <alexpe@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to d49c88d767 ("r8169: Enable MSI-X on RTL8106e") after
e9d0ba506ea8 ("PCI: Reprogram bridge prefetch registers on resume")
we can safely assume that this also fixes the root cause of
the issue worked around by 7c53a72245 ("r8169: don't use MSI-X on
RTL8168g"). So let's revert it.
Fixes: 7c53a72245 ("r8169: don't use MSI-X on RTL8168g")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang currently warns:
drivers/net/ethernet/qlogic/qla3xxx.c:384:24: warning: signed shift
result (0xF00000000) requires 37 bits to represent, but 'int' only has
32 bits [-Wshift-overflow]
((ISP_NVRAM_MASK << 16) | qdev->eeprom_cmd_data));
~~~~~~~~~~~~~~ ^ ~~
1 warning generated.
The warning is certainly accurate since ISP_NVRAM_MASK is defined as
(0x000F << 16) which is then shifted by 16, resulting in 64424509440,
well above UINT_MAX.
Given that this is the only location in this driver where ISP_NVRAM_MASK
is shifted again, it seems likely that ISP_NVRAM_MASK was originally
defined without a shift and during the move of the shift to the
definition, this statement wasn't properly removed (since ISP_NVRAM_MASK
is used in the statement right above this). Only the maintainers can
confirm this since this statement has been here since the driver was
first added to the kernel.
Link: https://github.com/ClangBuiltLinux/linux/issues/127
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for an RVU PF/VF to disable all RQ/SQ/CQ
contexts of a NIX LF via mbox. This will be used by PF/VF drivers
upon teardown or while freeing up HW resources.
A HW context which is not INIT'ed cannot be modified and an
RVU PF/VF driver may or may not INIT all the RQ/SQ/CQ contexts.
So a bitmap is introduced to keep track of enabled NIX RQ/SQ/CQ
contexts, so that only enabled HW contexts are disabled upon LF
teardown.
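A rough sketch of such bookkeeping using the generic kernel bitmap
helpers; the structure, names and the elided AQ call are placeholders:

#include <linux/bitmap.h>
#include <linux/bitops.h>

struct nix_lf_sketch {
        unsigned long *rq_bmap;         /* one bit per RQ context */
        int rq_cnt;
};

/* Set when a context is INIT'ed via the AQ ... */
static void sketch_mark_rq_enabled(struct nix_lf_sketch *lf, int qidx)
{
        set_bit(qidx, lf->rq_bmap);
}

/* ... so that LF teardown only disables contexts that were enabled. */
static void sketch_disable_rq_ctxs(struct nix_lf_sketch *lf)
{
        int qidx;

        for_each_set_bit(qidx, lf->rq_bmap, lf->rq_cnt) {
                /* submit a "disable RQ context" AQ instruction for qidx */
        }

        bitmap_zero(lf->rq_bmap, lf->rq_cnt);
}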
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Stanislaw Kardach <skardach@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for a RVU PF/VF to submit instructions to NIX AQ
via mbox. Instructions can be to init/write/read RQ/SQ/CQ/RSS
contexts. In case of read, context will be returned as part of
response to the mbox msg received.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allocate bitmaps and memory for PFVF mapping info for
maintaining the NIX transmit scheduler queues.
PF/VF drivers will request alloc, free, etc. of
Tx schedulers via mailbox.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Config LSO formats for TSOv4 and TSOv6 offloads.
These formats tell HW which fields in the TCP packet's
headers have to be updated while performing segmentation
offload.
Also report PF/VF drivers the LSO format indices as part
of response to NIX_LF_ALLOC mbox msg. These indices are
used in SQE extension headers while framing SQE for pkt
transmission with TSO offload.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upon receiving the NIX_LF_ALLOC mbox message, allocate memory for the
NIXLF's CQ, SQ, RQ, CINT, QINT and RSS HW contexts and configure the
respective base IOVAs in HW. Enable caching of contexts into NIX NDC.
Return the SQ buffer (SQB) size, this PF/VF's MAC address and other
info to the mbox msg sender.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initialize the NIX admin queue (AQ), i.e. alloc memory for
AQ instructions and for the results. All NIX LFs will submit
instructions to the AQ to init/write/read RQ/SQ/CQ/RSS contexts
and, in case of read, get the context from the result memory.
Also, before configuring/using the NIX block, calibrate the X2P bus
and check if NIX interfaces like CGX and LBK are in an active
and working state.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for an RVU PF/VF to disable all Aura/Pool
contexts of a NPA LF via mbox. This will be used by PF/VF drivers
upon teardown or while freeing up HW resources.
A HW context which is not INIT'ed cannot be modified and an
RVU PF/VF driver may or may not INIT all the Aura/Pool contexts.
So a bitmap is introduced to keep track of enabled NPA Aura/Pool
contexts, so that only enabled HW contexts are disabled upon LF
teardown.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Stanislaw Kardach <skardach@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for a RVU PF/VF to submit instructions to NPA AQ
via mbox. Instructions can be to init/write/read Aura/Pool/Qint
contexts. In case of read, context will be returned as part of
response to the mbox msg received.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upon receiving the NPA_LF_ALLOC mbox message, allocate memory for the
NPALF's aura, pool and qint contexts and configure the same
in HW. Enable caching of contexts into NPA NDC.
Return pool related info like stack size, number of pointers per
stack page, etc. to the mbox msg sender.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initialize the NPA admin queue (AQ), i.e. alloc memory for
AQ instructions and for the results. All NPA LFs will submit
instructions to the AQ to init/write/read Aura/Pool contexts
and, in case of read, get the context from the result memory.
Also added some common APIs for allocating memory for a queue
and getting the IOVA in return; these APIs will be used by
the NIX AQ and for other purposes.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support to enable or disable internal loopback mode in CGX.
New mbox IDs CGX_INTLBK_ENABLE/DISABLE added for this.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upon receiving a notification from firmware, the CGX event handler
in the AF driver gets the current link info, such as status, speed,
duplex etc., from the CGX driver and sends it across to the PFs that
have registered to receive such notifications.
To support above
- Mbox messaging support for sending msgs from AF to PF has been added.
- Added mbox msgs so that PFs can register/unregister for link events.
- Link notifications are sent to a PF under two scenarios.
1. When an asynchronous link change notification is received from
firmware with the notification flag turned on for that PF.
2. Upon a notification turn-on request, the current link status is
sent to the PF.
Also added a new mailbox msg using which RVU PF/VF can retrieve
their mapped CGX LMAC's current link info. Link info includes
status, speed, duplex and lmac type.
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for setting MAC address filters in CGX
for PF interfaces. Also PF interfaces can be put in promiscuous
mode. Dataplane PFs access this functionality using mailbox
messages to the AF driver.
Signed-off-by: Vidhya Raman <vraman@marvell.com>
Signed-off-by: Stanislaw Kardach <skardach@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for an RVU PF/VF driver to retrieve
its mapped CGX LMAC Rx and Tx stats from the AF via mbox.
A new mailbox msg is added for this.
Signed-off-by: Christina Jacob <cjacob@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added new mailbox msgs for RVU PF/VFs to request AF
to enable/disable their mapped CGX::LMAC Rx & Tx.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of looping on an integer timeout, use time_before(jiffies, timeout),
so that the maximum poll time is capped.
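The pattern, roughly; the register, ready bit and the 10 ms budget below
are made up for the sketch:

#include <linux/bits.h>
#include <linux/delay.h>
#include <linux/errno.h>
#include <linux/io.h>
#include <linux/jiffies.h>

#define SKETCH_READY_BIT        BIT(0)          /* hypothetical status bit */

static int sketch_wait_ready(void __iomem *reg)
{
        /* Cap the total poll time at 10 ms of wall-clock time instead of
         * counting loop iterations. */
        unsigned long timeout = jiffies + msecs_to_jiffies(10);

        while (time_before(jiffies, timeout)) {
                if (readl(reg) & SKETCH_READY_BIT)
                        return 0;
                usleep_range(10, 20);
        }
        return -EBUSY;
}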
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the device, VxLAN encapsulation takes place in the FDB table where
certain {MAC, FID} entries are programmed with an underlay unicast IP.
MAC addresses that are not programmed in the FDB are flooded to the
relevant local ports and also to a list of underlay unicast IPs that are
programmed using the all zeros MAC address in the VxLAN driver.
One difference between the hardware and software data paths is the fact
that in the software data path there are two FDB lookups prior to the
encapsulation of the packet. First in the bridge's FDB table using {MAC,
VID} and another in the VxLAN's FDB table using {MAC, VNI}.
Therefore, when a new VxLAN FDB entry is notified, it is only programmed
to the device if there is a corresponding entry in the bridge's FDB
table. Similarly, when a new bridge FDB entry pointing to the VxLAN
device is notified, it is only programmed to the device if there is a
corresponding entry in the VxLAN's FDB table.
Note that the above scheme will result in a discrepancy between both
data paths if only one FDB table is populated in the software data path.
For example, if only the bridge's FDB is populated with an entry
pointing to a VxLAN device, then a packet hitting the entry will only be
flooded by the kernel to remote VTEPs, whereas the device will also flood
the packets to other local ports that are members of the VLAN.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enslavement of VxLAN devices to offloaded bridges was never forbidden by
mlxsw, but this patch makes sure the required configuration is performed
in order to allow VxLAN encapsulation and decapsulation to take place in
the device.
The patch handles both the case where a VxLAN device is enslaved to an
already offloaded bridge and the case where the first mlxsw port is
enslaved to a bridge that already has VxLAN device configured.
Invalid configurations are sanitized and an error string is returned via
extack.
Since encapsulation and decapsulation do not occur when the VxLAN device
is down, the driver makes sure to enable / disable these functionalities
based on NETDEV_PRE_UP and NETDEV_DOWN events.
Note that NETDEV_PRE_UP is used in favor of NETDEV_UP, as the former
allows vetoing the operation, if necessary.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, an FDB entry only ceases being offloaded when it is deleted.
This changes with VxLAN encapsulation.
Devices capable of performing VxLAN encapsulation usually have only one
FDB table, unlike the software data path which has two - one in the
bridge driver and another in the VxLAN driver.
Therefore, bridge FDB entries pointing to a VxLAN device are only
offloaded if there is a corresponding entry in the VxLAN FDB.
Allow clearing the offload indication in case the corresponding entry
was deleted from the VxLAN FDB.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the ability to determine whether a netdev is a VxLAN netdev by
calling the above-mentioned function that checks the netdev's
rtnl_link_ops.
This will allow modules to identify netdev events involving a VxLAN
netdev and act accordingly. For example, drivers capable of VxLAN
offload will need to configure the underlying device when a VxLAN netdev
is being enslaved to an offloaded bridge.
Convert nfp to use the newly introduced helper.
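The check itself can be as small as inspecting the netdev's rtnl_link_ops
kind string; a sketch of the idea (the helper actually added by the patch
may differ in name and placement):

#include <linux/netdevice.h>
#include <linux/string.h>

static inline bool sketch_netif_is_vxlan(const struct net_device *dev)
{
        /* VxLAN netdevs register rtnl_link_ops with kind "vxlan". */
        return dev->rtnl_link_ops &&
               !strcmp(dev->rtnl_link_ops->kind, "vxlan");
}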
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a local route that matches the source IP of an offloaded NVE tunnel
is notified, the driver needs to program it to perform NVE decapsulation
instead of merely trapping packets to the CPU.
This patch complements "mlxsw: spectrum_router: Enable local routes
promotion to perform NVE decap" where existing local routes were
promoted to perform NVE decapsulation.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
802.1D FIDs are used to represent VLAN-unaware bridges and currently
this is the only type of FID that supports NVE configuration.
Since the NVE tunnel device does not take a reference on the FID, it is
possible for the FID to be destroyed when it still has NVE
configuration.
Therefore, when destroying the FID make sure to disable its NVE
configuration.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The common NVE core expects each encapsulation type to implement a
certain set of operations that are specific to this type and the
currently used ASIC. These operations include things such as the ability
to determine whether a certain NVE configuration can be offloaded and
ASIC-specific initialization for this type.
Implement these operations for VxLAN on the Spectrum ASIC. Spectrum-2
support will be added by a future patchset.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Spectrum ASIC supports different types of NVE encapsulations (e.g.,
VxLAN, NVGRE) with more types to be supported by future ASICs.
Despite being different, all these encapsulations share some common
functionality such as the enablement of NVE encapsulation on a given
filtering identifier (FID) and the addition of remote VTEPs to the
linked-list of VTEPs that traffic should be flooded to.
Implement this common core and allow different ASICs to register
different operations for different encapsulation types.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the device, different VRFs (routing tables) are represented using
different virtual routers (VRs) and thus the kernel's table IDs are
mapped to VR IDs.
Allow internal users of the IP router to query the VR ID based on a
kernel table ID.
This is needed - for example - when configuring the underlay VR where
VxLAN encapsulated packets will undergo an L3 lookup. In this case, the
kernel's table ID is derived from the VxLAN device's configuration.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an NVE tunnel with an IP underlay (e.g., VxLAN) is configured the
local route to the tunnel's source IP needs to be promoted to perform
NVE decapsulation.
Expose an API in the unicast IP router to promote / demote local routes.
The case where a local route is configured after the creation of the NVE
tunnel will be handled in a subsequent patch in the set.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current APIs only allow looking for a FID and creating it in case it
does not exist.
With VxLAN, if the bridge to which the VxLAN device was enslaved
does not already have a corresponding FID, then it means that something
went wrong that we need to be aware of.
Add an API to look up a FID without creating it, in order to catch the
above-mentioned situation.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the device, the VNI and the list of remote VTEPs a packet should be
flooded to is a property of the filtering identifier (FID).
During encapsulation, the VNI is taken from the FID the packet was
classified to. During decapsulation, the overlay packet is injected into
a bridge and classified to a FID based on the VNI it came with.
Allow NVE configuration for a FID. Currently, this is only supported
with 802.1D FIDs which are used for VLAN-unaware bridges. However, NVE
configuration is going to be supported with 802.1Q FIDs which is why the
related fields are placed in the common FID struct.
Since the device requires a 1:1 mapping between FID and VNI, the driver
maintains a hashtable keyed by VNI and checks if the VNI is already
associated with an existing FID.
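A condensed sketch of such a VNI-keyed lookup with the kernel rhashtable
API; struct and field names are illustrative, not the mlxsw definitions:

#include <linux/errno.h>
#include <linux/rhashtable.h>
#include <linux/stddef.h>

struct sketch_fid {
        struct rhash_head ht_node;
        u32 vni;
        /* ... rest of the FID state ... */
};

static const struct rhashtable_params sketch_vni_ht_params = {
        .key_len        = sizeof(u32),
        .key_offset     = offsetof(struct sketch_fid, vni),
        .head_offset    = offsetof(struct sketch_fid, ht_node),
};

/* Enforce the 1:1 FID <-> VNI mapping the device requires. */
static int sketch_fid_vni_set(struct rhashtable *vni_ht,
                              struct sketch_fid *fid, u32 vni)
{
        if (rhashtable_lookup_fast(vni_ht, &vni, sketch_vni_ht_params))
                return -EEXIST;         /* VNI already bound to another FID */

        fid->vni = vni;
        return rhashtable_insert_fast(vni_ht, &fid->ht_node,
                                      sketch_vni_ht_params);
}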
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we fail when the user specifies a non-zero chain; this patch adds
support for it and for tc priorities. To get to a new chain, use the tc
goto action.
Currently we support a fixed prio range 1-16, and chains 0-3.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When adding a vxlan tc rule, and a neighbour isn't available, we
don't insert any rule to hardware. Once we enable offloading flows
with multiple priorities, a packet that should have matched this rule
will continue in the hardware pipeline and might match a wrong one.
This is unlike the tc software path, where it will be matched and
forwarded to the vxlan device (which will eventually cause an ARP
lookup) and stop processing further tc filters.
To address that, when a neighbour isn't available (EAGAIN from
attach_encap), or gets deleted, change the original action to be a
forward to slow path instead. A neighbour update will restore the original
action once the neighbour becomes available. This will be done atomically
so at any given time we will have the correct match.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
A pre-step for the tc offloads code to use this when a neigh is
not available for encap rules.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The code for adding/deleting fdb flow is repeated when
user-space does flow add/del and when we add/del from
the neigh update path - unify them to avoid the duplication.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When replacing a tc flower rule, flower first requests to add the
new rule (new action), then deletes the old one.
But currently when asked to add a new tc flower flow, we append the
actions (and counters to it).
This can result in an fte with two flow counters or with conflicting
actions (drop and encap action), which firmware complains/errs
about and which isn't what the user aimed for.
Instead, insert the flow using the new no-append flag, which will add a
new HW rule; the old flow and rule will be deleted later by flower.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanmox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
If no-append flag is set, we will add a new FTE, instead of appending
the actions of the inserted rule when the same match already exists.
While here, move the has_flow_tag boolean indicator to be a flag too.
This patch doesn't change any functionality.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanmox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
A chain is a group of priorities, so use the fdb parallel
sub namespaces to implement chains, and a flow table for each
priority in them.
Because these namespaces are parallel and in series to the slow path
fdb, the chains aren't connected to one another (but to the slow path),
and one must use an explicit goto action to reach a different chain.
Flow tables for the priorities will be created on demand and destroyed
once not used.
The Firmware has four pools of tables for sizes S/XS/M/L (4k, 64k, 1m, 4m).
We maintain ghost copies of the pools occupancy.
When a new table is to be created, we scan the pools from large to small
and find the first table size which can now be created. When a table is
destroyed, we update the relevant pool.
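In outline, the ghost-copy bookkeeping amounts to something like the
following sketch (sizes taken from the text above; names and layout are
illustrative):

#include <linux/errno.h>
#include <linux/kernel.h>

/* Firmware table pool sizes in flow entries, scanned from large to small. */
static const int sketch_pool_sizes[] = { 4 * 1024 * 1024, 1024 * 1024,
                                         64 * 1024, 4 * 1024 };

struct sketch_pools {
        int left[ARRAY_SIZE(sketch_pool_sizes)];   /* ghost occupancy copy */
};

/* Pick the first (largest) size that can still be created right now. */
static int sketch_get_table_size(struct sketch_pools *pools)
{
        int i;

        for (i = 0; i < ARRAY_SIZE(sketch_pool_sizes); i++) {
                if (pools->left[i] > 0) {
                        pools->left[i]--;
                        return sketch_pool_sizes[i];
                }
        }
        return -ENOSPC;
}

/* On table destroy, give the size back to the relevant pool. */
static void sketch_put_table_size(struct sketch_pools *pools, int size)
{
        int i;

        for (i = 0; i < ARRAY_SIZE(sketch_pool_sizes); i++)
                if (sketch_pool_sizes[i] == size)
                        pools->left[i]++;
}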
Multi chain/prio isn't enabled yet by this patch; for now all flows
will use the default chain 0, and prio 1.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Be symmetric with the e-switch API for adding rules, which has a
specific function for adding fwd rules that are used as part of
vport mirroring.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Towards supporting multi-chains and priorities, split the FDB fast path
to multiple namespaces (sub namespaces), each with multiple priorities.
This patch adds a new flow steering type, FS_TYPE_PRIO_CHAINS, which is
like current FS_TYPE_PRIO, but may contain only namespaces, and those
will be in parallel to one another in terms of managing of the flow
tables connections inside them. Meaning, while searching for the next
or previous flow table to connect for a new table inside such namespace
we skip the parallel namespaces in the same level under the
FS_TYPE_PRIO_CHAINS prio we originated from.
We use this new type for splitting the fast path prio into multiple
parallel namespaces, each containing normal prios.
The prios inside them (and their tables) will be connected to one
another, but not from one parallel namespace to another, instead the
last prio in each namespace will be connected to the next prio in
the containing FDB namespace, which is the slow path prio.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Move to have a clear separation in the code path for adding nic vs e-switch
flows. While here, we break the code that deals with adding offloaded
TC flows into a few smaller stages, each in a helper function.
Besides giving us simpler and more readable code, these are pre-steps
for being able to have two HW flows serving one SW TC flow for some
e-switch use cases.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Refactor the flow add utility functions to return err code instead of rule
pointers. This will allow for simpler logic when one tc rule is
duplicated to two HW rules in downstream patches.
Signed-off-by: Rabie Loulou <rabiel@mellanox.com>
Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently, when a flow rule is created using the FS core layer, the caller
has to pass the entire flow counter object and not just the counter HW
handle (ID). This requires both the FS core and the caller to have
knowledge about the inner implementation of the FS layer flow counters
cache and limits the possible users.
Move to use the counter ID throughout the code when dealing with flows.
With this decoupling in place, we can now privatize the inner
implementation of the flow counters.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
There's no real reason for the e-switch logic to manage the creation of
counters for offloaded flows. The API already has the directive for the
caller to denote they want to attach a counter to the created flow.
As such, we go and move the management of flow counters to the mlx5e
tc offload logic. This also lets us remove an inelegant interface where
the FS layer had to provide a way to retrieve a counter from a flow rule.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
mlx5 updates for both net-next and rdma-next
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: (21 commits)
net/mlx5: Expose DC scatter to CQE capability bit
net/mlx5: Update mlx5_ifc with DEVX UID bits
net/mlx5: Set uid as part of DCT commands
net/mlx5: Set uid as part of SRQ commands
net/mlx5: Set uid as part of SQ commands
net/mlx5: Set uid as part of RQ commands
net/mlx5: Set uid as part of QP commands
net/mlx5: Set uid as part of CQ commands
net/mlx5: Rename incorrect naming in IFC file
net/mlx5: Export packet reformat alloc/dealloc functions
net/mlx5: Pass a namespace for packet reformat ID allocation
net/mlx5: Expose new packet reformat capabilities
{net, RDMA}/mlx5: Rename encap to reformat packet
net/mlx5: Move header encap type to IFC header file
net/mlx5: Break encap/decap into two separated flow table creation flags
net/mlx5: Add support for more namespaces when allocating modify header
net/mlx5: Export modify header alloc/dealloc functions
net/mlx5: Add proper NIC TX steering flow tables support
net/mlx5: Cleanup flow namespace getter switch logic
net/mlx5: Add memic command opcode to command checker
...
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add link establishment methods
Add auto negotiation methods
Add read MAC address method
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add PHY's ID support
Add support for initialization, acquire and release of PHY
Enable register access
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add code for NVM support and get MAC address, complete probe
method.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add code for hardware initialization and reset
Add code for semaphore handling
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for allocating, configuring, and freeing Tx/Rx ring
resources. With these changes in place the descriptor queues are in a
state where they are ready to transmit or receive if provided buffers.
This also adds the transmit and receive fastpath and interrupt handlers.
With this code in place the network device is now able to send and receive
frames over the network interface using a single queue.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change adds the defines and structures necessary to support both Tx
and Rx descriptor rings.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch set adds interrupt support for the igc interfaces.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Now that we have the ability to configure the basic settings on the device
we can start allocating and configuring a netdev for the interface.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds the basic defines and structures needed by the PF for
operation. With this it is possible to bring up the interface,
but without being able to configure any of the filters on
the interface itself.
Add a skeleton for function pointers.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds the beginning framework onto which I am going to add
the igc driver which supports the Intel(R) I225-LM/I225-V 2.5G
Ethernet Controller.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update cxgb4 to send the number of Tx Queues created in the lldinfo
struct and use the same ntxq in the chcr driver.
This patch depends on the following commit:
commit add92a817e
"Fix memory corruption in DMA Mapped buffers"
v2:
Free txq_info in the error case, as pointed out by Lino Sanfilippo.
Signed-off-by: Harsh Jain <harsh@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The program translation stage checks that a program can be offloaded to
the netdev which was passed during the load (bpf_attr->prog_ifindex).
After program sharing was introduced, however, the netdev on which the
program is loaded can theoretically be different, and therefore we
should recheck the program size and max stack size at load time.
This was found by code inspection, AFAIK today all vNICs have
identical caps.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Atomic operations on the NFP are currently always in big endian.
The driver keeps track of regions of memory storing atomic values
and byte swaps them accordingly. There are corner cases where
the map values may be initialized before the driver knows they
are used as atomic counters. This can happen either when the
datapath is performing the update and the stack contents are
unknown or when map is updated before the program which will
use it for atomic values is loaded.
To avoid the situation where the user initializes the value to 0 1 2 3
and then, after loading a program which uses the word as an atomic
counter, starts reading 3 2 1 0 - only allow atomic counters to be
initialized to endian-neutral values.
For updates from the datapath the stack information may not be
as precise, so just allow initializing such values to 0.
Example code which would break:
struct bpf_map_def SEC("maps") rxcnt = {
        .type = BPF_MAP_TYPE_HASH,
        .key_size = sizeof(__u32),
        .value_size = sizeof(__u64),
        .max_entries = 1,
};

int xdp_prog1()
{
        __u64 nonzeroval = 3;
        __u32 key = 0;
        __u64 *value;

        value = bpf_map_lookup_elem(&rxcnt, &key);
        if (!value)
                bpf_map_update_elem(&rxcnt, &key, &nonzeroval, BPF_ANY);
        else
                __sync_fetch_and_add(value, 1);

        return XDP_PASS;
}
$ offload bpftool map dump
key: 00 00 00 00 value: 00 00 00 03 00 00 00 00
should be:
$ offload bpftool map dump
key: 00 00 00 00 value: 03 00 00 00 00 00 00 00
Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When sending a big fragment using multiple buffer descriptors, hns3
does one mapping, but does multiple unmappings when tx is done, which
may cause an unmapping problem.
To fix it, this patch makes sure the value of desc_cb.length of the
non-first bd is zero. If desc_cb.length is zero, we do not unmap the
buffer.
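Schematically, in the tx-done cleanup path (placeholder structure and
names, not the literal hns3 code):

#include <linux/dma-mapping.h>

struct sketch_desc_cb {
        dma_addr_t dma;
        u32 length;             /* zero on the non-first bd of a big fragment */
};

static void sketch_unmap_desc(struct device *dev, struct sketch_desc_cb *cb)
{
        /* The whole fragment was mapped once, for the first bd only, so
         * only a descriptor with a non-zero length owns a mapping. */
        if (!cb->length)
                return;

        dma_unmap_page(dev, cb->dma, cb->length, DMA_TO_DEVICE);
}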
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To keep symmetrical, this patch renames hns_nic_dma_unmap to
hns3_clear_desc.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch unifies big tx fragment handling for tso and non-tso
case.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To solve the L3 checksum error problem, which happens when the driver
does not clear the L3 checksum, the DMA map should be done after calling
skb_cow_head.
This patch moves DMA map into hns3_fill_desc to ensure that DMA
map is done after calling skb_cow_head.
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes hns3_fill_desc_tso in preparation for
fixing some desc filling bug, because for tso or non-tso
case, we will use the unified hns3_fill_desc.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newly added link modes are required to be added
during the setting of link modes. If the new link mode
is not available during qed_set_link, it may cause
the link to go down due to an empty supported capability
being passed to the MFW, after setting autoneg off/on
with the current/supported speed.
Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set link mode after checking available "supported" link caps
of the port.
Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added transceiver type, speed capability and board types in the HSI,
which are utilized to display accurate link
information in ethtool.
Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added transceiver modes with different speed and media types,
speed capability and supported board types in the HSI, which
will be utilized to display the correct specification of link
modes and speed type.
Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Align the use of local PTT to propagate through the qed_mcp* API's.
Global ptt should not be used.
Register access should be done through layers. Register address is
mapped into a PTT, PF translation table. Several interface functions
require a PTT to direct read/write into register. There is a pool of
PTT maintained, and several PTT are used simultaneously to access
device registers in different flows. Same PTT should not be used in
flows that can run concurrently.
To avoid running out of PTT resources, too many PTT should not be
acquired without releasing them. Every PF has a global PTT, which is
used throughout the life of PF, in most important flows for register
access. Generic functions acquire the PTT locally and release after
the use. This patch aligns the use of Global PTT and Local PTT
accordingly.
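The usage pattern being enforced, as a sketch: qed_ptt_acquire() and
qed_ptt_release() are the driver's existing pool helpers, while the
qed_mcp-style command in the middle is a stand-in:

#include "qed.h"
#include "qed_hw.h"

/* Stand-in for any qed_mcp_* interface function that takes a PTT. */
static int sketch_mcp_cmd(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt)
{
        return 0;
}

static int sketch_mcp_operation(struct qed_hwfn *p_hwfn)
{
        struct qed_ptt *p_ptt;
        int rc;

        /* Take a PTT from the pool for this flow only, never the global one. */
        p_ptt = qed_ptt_acquire(p_hwfn);
        if (!p_ptt)
                return -EBUSY;

        rc = sketch_mcp_cmd(p_hwfn, p_ptt);

        /* Release promptly so concurrent flows don't exhaust the pool. */
        qed_ptt_release(p_hwfn, p_ptt);

        return rc;
}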
Signed-off-by: Rahul Verma <rahul.verma@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warning:
drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils_fw2x.c:282:5: warning:
symbol 'aq_fw2x_update_stats' was not declared. Should it be static?
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously when populating the set ipv6 address action, we incorrectly
made use of pedit's key index to determine which 32bit word should be
set. We now calculate which word has been selected based on the offset
provided by the pedit action.
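The gist of the calculation, as a standalone illustration rather than the
nfp function itself:

#include <linux/ipv6.h>
#include <linux/stddef.h>

/* Select which of the four 32-bit words of the ipv6 source address is
 * being rewritten, based on the byte offset the pedit action supplies
 * into the ipv6 header, instead of on the pedit key index. */
static u8 sketch_ipv6_saddr_word(u32 pedit_offset)
{
        return (pedit_offset - offsetof(struct ipv6hdr, saddr)) / sizeof(u32);
}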
Fixes: 354b82bb32 ("nfp: add set ipv6 source and destination address")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously we only allowed a single header key per pedit action to
change the header. This used to result in the last header key in the
pedit action to overwrite previous headers. We now keep track of them
and allow multiple header keys per pedit action.
Fixes: c0b1bd9a8b ("nfp: add set ipv4 header action flower offload")
Fixes: 354b82bb32 ("nfp: add set ipv6 source and destination address")
Fixes: f8b7b0a6b1 ("nfp: add set tcp and udp header action flower offload")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously we did not correctly change headers when using multiple
pedit actions with partial masks. We now take this into account and
no longer just commit the last pedit action.
Fixes: c0b1bd9a8b ("nfp: add set ipv4 header action flower offload")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit db65f35f50 ("net: fec: add support of ethtool get_regs") introduced
the ethtool "--register-dump" interface to dump all FEC registers.
But not all silicon implementations of the Freescale FEC hardware module
have the FRBR (FIFO Receive Bound Register) and FRSR (FIFO Receive Start
Register) registers, so we should not be trying to dump them on those that
don't.
To fix it we create a quirk flag, FEC_QUIRK_HAS_RFREG, and check it before
dumping those RX FIFO registers.
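Conceptually the guard is a quirk-flag check; a sketch assuming the driver
keeps its quirk bits in fep->quirks (the helper name is made up):

#include "fec.h"        /* driver-local header with fec_enet_private/quirks */

/* Dump FRBR/FRSR only on silicon whose FEC block implements the RX FIFO
 * registers; other SoCs skip these offsets in the get_regs dump. */
static bool sketch_has_rx_fifo_regs(struct fec_enet_private *fep)
{
        return !!(fep->quirks & FEC_QUIRK_HAS_RFREG);
}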
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new poll function that polls for NQ events. If the NQ event is
a CQ notification, we locate the CP ring from the cq_handle and call
__bnxt_poll_work() to handle RX/TX events on the CP ring.
Add a new has_more_work field in struct bnxt_cp_ring_info to indicate
that the budget has been reached. __bnxt_poll_cqs_done() is then called
to update or ARM the CP rings depending on whether the budget has been
reached. If it has, the next bnxt_poll_p5() call will continue to poll
from the CQ rings directly. Otherwise, the NQ will be ARMed for the
next IRQ.
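A rough, hedged sketch of the poll flow described above (helper names
and the exact event decoding are assumptions, not the driver's code):

    /* simplified illustration of the new NQ poll loop */
    while (nq_has_events(cpr)) {
            struct nqe_cn *nqe = nq_next_event(cpr);

            if (NQE_TYPE(nqe) == NQ_CN_TYPE_CQ_NOTIFICATION) {
                    /* cq_handle stored at CQ creation points to the CP ring */
                    struct bnxt_cp_ring_info *cpr2 = cq_handle_to_cpr(nqe);

                    work_done += __bnxt_poll_work(bp, cpr2,
                                                  budget - work_done);
            }
    }
    /* update or ARM the CP rings only if budget was not exhausted */
    __bnxt_poll_cqs_done(bp, bnapi, work_done < budget);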
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Separate the CP ring polling logic in bnxt_poll_work() into 2 separate
functions __bnxt_poll_work() and __bnxt_poll_work_done(). Since the logic
is separated, we need to add tx_pkts and events fields to struct bnxt_napi
to keep track of the events to handle between the 2 functions. We also
add had_work_done field to struct bnxt_cp_ring_info to indicate whether
some work was performed on the CP ring.
This is needed to better support the 57500 chips. We need to poll up to
2 separate CP rings before we update or ARM the CP rings on the 57500 chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On legacy chips, the CP ring may be shared between RX and TX and so only
setup the RX coalescing parameters in such a case. On 57500 chips, we
always have a dedicated CP ring for TX so we can always set up the
TX coalescing parameters in bnxt_hwrm_set_coal().
Also, the min_timer coalescing parameter applies to the NQ on the new
chips and a separate firmware call needs to be made to set it up.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the RX code path, we currently use the bnxt_napi struct pointer to
identify the associated RX/CP rings. Change it to use the struct
bnxt_cp_ring_info pointer instead since there are now up to 2
CP rings per MSIX.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RSS context allocation and RSS indirection table setup are very different
on the new chip. Refactor bnxt_setup_vnic() to call 2 different functions
to set up RSS for the vnic based on chip type. On the new chip, the
number of RSS contexts and the indirection table size depends on the
number of RX rings. Each indirection table entry is also different
on the new chip since ring groups are no longer used.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On the new 57500 chips, we need to allocate one RSS context for every
64 RX rings. In previous chips, only one RSS context per vnic is
required regardless of the number of RX rings. So increase the max
RSS context array count to 8.
Hardware ring groups are not used on the new chips. Note that the
software ring group structure is still maintained in the driver to
keep track of the rings associated with the vnic.
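For example (illustrative arithmetic only, not the driver's exact code),
the RSS context count on the new chips follows directly from the RX
ring count:

    /* one RSS context per 64 RX rings, capped by the array size of 8 */
    int nr_ctxs = DIV_ROUND_UP(bp->rx_nr_rings, 64);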
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On the new 57500 chips, we allocate/free one CP ring for each RX ring or
TX ring separately. Using separate CP rings for RX/TX is an improvement
as TX events will no longer be stuck behind RX events.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Firmware ring allocation semantics are slightly different for most
ring types on 57500 chips. Allocation/deallocation for NQ rings are
also added for the new chips.
A CP ring handle is also added so that from the NQ interrupt event,
we can locate the CP ring.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On the new 57500 chips, getting the CP ring ID associated with
an RX ring or TX ring is different than before. On the legacy chips,
we find the associated ring group and look up the CP ring ID. On the
57500 chips, each RX ring and TX ring has a dedicated CP ring even if
they share the MSIX. Use these helper functions at appropriate places
to get the CP ring ID.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On 57500 chips, the original bnxt_cp_ring_info struct now refers to the
NQ. bp->cp_nr_rings refers to the number of NQs on 57500 chips. There
are now 2 pointers for the CP rings associated with RX and TX rings.
Modify bnxt_alloc_cp_rings() and bnxt_free_cp_rings() accordingly.
With multiple CP rings per NAPI, we need to add a pointer in
bnxt_cp_ring_info struct to point back to the bnxt_napi struct.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ring reservation functions have to be modified for P5 chips in the
following ways:
- bnxt_cp_ring_info structs map to internal NQs as well as CP rings.
- Ring groups are not used.
- 1 CP ring must be available for each RX or TX ring.
- The number of RSS contexts to reserve is one per 64 RX rings.
- RFS currently not supported.
Also, RX AGG rings are only used for jumbo frames, so we need to
unconditionally call bnxt_reserve_rings() in __bnxt_open_nic()
to see if we need to reserve AGG rings in case MTU has changed.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Store the maximum MSIX capability read from PCIe config space earlier. When
we call firmware to query capability, we need to compare the PCIe
MSIX max count with the firmware count and use the smaller one as
the MSIX count for 57500 (P5) chips.
The new chips don't use ring groups. But previous chips do and
the existing logic limits the available rings based on resource
calculations including ring groups. Setting the max ring groups to
the max rx rings will work on the new chips without changing the
existing logic.
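Conceptually (a sketch with assumed variable names), the MSIX count on
the new chips becomes:

    /* use the smaller of the PCIe MSI-X table size and the firmware count */
    bp->total_irqs = min(pcie_msix_vec_count, fw_max_msix);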
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 57500 series chips have a new 64-bit doorbell format. Use a new
bnxt_db_info structure to unify the new and the old 32-bit doorbells.
Add a new bnxt_set_db() function to set up the doorbell addresses and
doorbell keys ahead of time. Modify and introduce new doorbell
helpers to help abstract and unify the old and new doorbells.
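A hedged sketch of the kind of abstraction described (the actual struct
layout, flag and helper names in the driver may differ):

    struct bnxt_db_info {
            void __iomem    *doorbell;      /* mapped doorbell address */
            union {
                    u64     db_key64;       /* 57500: 64-bit doorbell key */
                    u32     db_key32;       /* legacy 32-bit doorbell key */
            };
    };

    static void bnxt_db_write(struct bnxt *bp, struct bnxt_db_info *db,
                              u32 idx)
    {
            if (bp->flags & BNXT_FLAG_CHIP_P5)
                    writeq(db->db_key64 | idx, db->doorbell);
            else
                    writel(db->db_key32 | idx, db->doorbell);
    }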
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
57500 series is a new chip class (P5) that requires some driver changes
in the next several patches. This adds basic chip ID, doorbells, and
the notification queue (NQ) structures. Each MSIX is associated with an
NQ instead of a CP ring in legacy chips. Each NQ has up to 2 associated
CP rings for RX and TX. The same bnxt_cp_ring_info struct will be used
for the NQ.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call firmware to configure the DMA addresses of all context memory
pages on new devices requiring context memory.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New device requires host context memory as a backing store. Call
firmware to check for context memory requirements and store the
parameters. Allocate host pages accordingly.
We also need to move the call bnxt_hwrm_queue_qportcfg() earlier
so that all the supported hardware queues and the IDs are known
before checking and allocating context memory.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer chips require the PTU_PTE_VALID bit to be set for every page
table entry for context memory and rings. Additional bits are also
required for page table entries for all rings. Add a flags field to
bnxt_ring_mem_info struct to specify these additional bits to be used
when setting up the page tables as needed.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the DMA page table and vmem fields in bnxt_ring_struct to a new
bnxt_ring_mem_info struct. This will allow context memory management
for a new device to re-use some of the existing infrastructure.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New firmware spec. allows interrupt coalescing parameters, such as
maximums, timer units, and supported features, to be queried. Update
the driver to make use of the new call to query these parameters
and provide the legacy defaults if the call is not available.
Replace the hard-coded values with these parameters.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support the max_ext_req_len field from the HWRM_VER_GET_RESPONSE.
If this field is valid and greater than the mailbox size, use the
short command format to send firmware messages greater than the
mailbox size. Newer devices use this method to send larger messages
to the firmware.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Latest firmware spec. has some additional rx extended port stats and new
tx extended port stats added. We now need to check the size of the
returned rx and tx extended stats and determine how many counters are
valid. New counters added include CoS byte and packet counts for rx
and tx.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Among the new changes are trusted VF support, 200Gbps support, and new
API to dump ring information on the new chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix to spelling mistake in DP_INFO message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
netif_device_detach() stops all tx queues already, so we don't need
this call.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simplify this function, no functional change intended.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The newly added driver causes a warning about a function that is
not used anywhere:
drivers/net/ethernet/marvell/octeontx2/af/cgx.c:320:12: error: 'cgx_fwi_link_change' defined but not used [-Werror=unused-function]
Remove it for now, until a user gets added. If we want to use this
function from another module, we also need a declaration in a header
file, which is currently missing, so it would have to change anyway.
Fixes: 1463f382f5 ("octeontx2-af: Add support for CGX link management")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit makes it possible to use devlink to split the 100G CXP
Netronome into two 40G interfaces. Currently when you ask for 2
interfaces, the math in src/nfp_devlink.c:nfp_devlink_port_split
calculates that you want 5 lanes per port because for some reason
eth_port.port_lanes=10 (shouldn't this be 12 for CXP?). What we really
want when asking for 2 breakout interfaces is 4 lanes per port. This
commit makes that happen by calculating based on 8 lanes if 10 are
present.
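In other words (illustrative calculation only, variable names assumed):

    /* treat a 10-lane CXP as 8 usable lanes for the breakout math */
    unsigned int total_lanes = eth_port.port_lanes == 10 ?
                               8 : eth_port.port_lanes;
    unsigned int lanes_per_port = total_lanes / count;  /* count=2 -> 4 */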
Signed-off-by: Ryan C Goodfellow <rgoodfel@isi.edu>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Greg Weeks <greg.weeks@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to the hardware ArchDef, the PTV1 field in FD[CTRL]
is ignored by WRIOP, so setting it for Tx FDs is pointless.
Remove all references to it from the code.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ch parameter is never used in the dpaa2_eth_tx_conf function but
since its prototype must match the type defined in the consume field of
struct dpaa2_eth_fq, just mark it as __always_unused.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The priv parameter is never used in the build_linear_skb and
drain_channel functions. Remove it from the function definitions.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All 3 cases of possible uninitialized variables are false
positives since they are used only as output parameters.
Nonetheless, fix the warnings.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dpaa2_eth_set_dist_key function is only used in a single file.
Make it static.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both ARCH_LAYERSCAPE and COMPILE_TEST dependencies are already implied
through the FSL_MC_BUS dep, so there's no need to state it explicitly.
Also, the fsl-mc bus depends on COMPILE_TEST only for some
architectures (arm, arm64, ppc, x86), so it's not correct to
claim build support unconditionally.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In dual-emac mode the cpsw driver sends directed packets, which means
that packets go to the directed port, but an ALE lookup is performed
only to determine untagged egress. This means that on the tx side there
is no need to add the port bit to the ALE mcast entry mask; an ALE
entry for port identification is basically needed only on the rx side.
So, add only the host port in dual_emac mode, since directed
transmission is used and no additional port is needed. For single-port
boards and switch mode all ports are used as before, so nothing changes
for them. This also simplifies further changes.
In other words, mcast entries for dual-emac should behave exactly
like unicast ones. This also helps avoid leaking packets between ports
with the same vlan at the h/w level if the ports become members of the
same vid.
So now, for instance, if mcast address 33:33:00:00:00:01 is added, the
entries in the ALE table are:
vid = 1, addr = 33:33:00:00:00:01, port_mask = 0x1
vid = 2, addr = 33:33:00:00:00:01, port_mask = 0x1
Instead of:
vid = 1, addr = 33:33:00:00:00:01, port_mask = 0x3
vid = 2, addr = 33:33:00:00:00:01, port_mask = 0x5
With the same considerations, set only the host port for unregistered
mcast in dual-emac mode when IFF_ALLMULTI is set, exactly as is done
in cpsw_ale_set_allmulti().
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Whenever the kernel or user triggers an rx mode update, the driver
clears every multicast entry from the forwarding table and re-adds them
some time later. This window can be long enough to drop incoming
multicast packets.
Therefore, clear only stale multicast entries and update or add new
ones afterwards.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows the function to be used in callbacks that take a const
qualified mac address, which is needed for further changes.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
On GENETv5, there is a hardware issue which prevents the GENET hardware
from generating a link UP interrupt when the link is operating at
10Mbits/sec. Since we do not have any way to configure the link
detection logic, fall back to polling in that case.
Fixes: 421380856d ("net: bcmgenet: add support for the GENETv5 hardware")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbvqcMAAoJEEg/ir3gV/o+eFsH/2TbJH+i1BuGMVwCB8o+U1Rz
C01pJmR7Lb7WwQZ8ZKTOqQkS7BkGX1hNGyIlc4i6ZnP+4gsVJAbP6LKPjTvyD7e6
TNb8bvxTUCOovknrevKkGba8tzoTTsC4wwwbHLGHd1hkKSY1P5hXg8R7vpear+n6
/PFJwzpIXDAa8AHqeORCNYj7MneUm3kaahcmSOxOhvDbRx3UG9cgy7tEhPjZbRn5
jPFsxFCSPcGedtI+g8bzodmpneTcu1KF6QCunrl2bGt5EzgDrbaw1UUoctxD2CJR
Ch45W807EvBJoFiJXXCNf9N+p5020F/Q+mTmK7khPirUjdtoLdcT9Goswpjfbtk=
=vJIA
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2018-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-10-10
This pull request includes some fixes to mlx5 driver,
Please pull and let me know if there's any problem.
For -stable v4.11:
('net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type')
For -stable v4.17:
('net/mlx5: Fix memory leak when setting fpga ipsec caps')
For -stable v4.18:
('net/mlx5: WQ, fixes for fragmented WQ buffers API')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
IPoIB netlink support and mlx5e pre-allocated netdevice initialization
IP link was broken due to the changes in IPoIB for the rdma_netdev
support after commit cd565b4b51
("IB/IPoIB: Support acceleration options callbacks").
This patchset fixes IPoIB pkey creation and removal using rtnetlink by
adding support in both IPoIB ULP layer and mlx5 layer:
From Jason and Denis:
1) Introduces changes in the RDMA netdev code in order to
allow allocation of the netdev to be done by the rtnl netdev code.
2) Reworks IPoIB initialization to use the two step rdma_netdev
creation.
From Feras and Saeed, mlx5e netdev layer refactoring to allow accepting
pre-allocated netdevs:
3) Adds support to initialize/cleanup netdevs that are not created
by mlx5 driver.
4) Change mlx5e netdevice layer to accept the pre-allocated netdevice
queue number.
5) Initialize mlx5e generic structures in one place to be used for all
netdevs types NIC/representors/IPoIB (both mlx5 allocated and
pre-allocted).
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbvqHhAAoJEEg/ir3gV/o+TRQH+wYWgvi5AbqYlGpT8xSho8Dg
5tuFE9AbYDGLLWptZgtYXsBsTaltGVqKnwWy//dAuuSI7LvqtjeKqq0G1JKau70L
/49+vv8NTQ6vov2yY95yzzEo5B1/zVcYEl9dmJl4y6Xlwc+YIteewReVqRj+tucR
xO7ghLWc1o4Pl7gWBAcOULyOsZc+CtG8ElwEaBROXTtYRpsR7FKn7vN+WdWZ2VOG
MjtCUaG0vZlK1vaF76J3P9bL2V3y6o6i6S8RZwg1kPxO4jnrzyGrkxWMOX6f32KB
JAUcLVlJW5wckcRAKfTwLxsFMrc9vLBPHyj4XneXVWGKh8/dGUGY5gdGIJV5Jlc=
=m0tM
-----END PGP SIGNATURE-----
Merge tag 'mlx5e-updates-2018-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5e-updates-2018-10-10
IPoIB netlink support and mlx5e pre-allocated netdevice initialization
IP link was broken due to the changes in IPoIB for the rdma_netdev
support after commit cd565b4b51
("IB/IPoIB: Support acceleration options callbacks").
This patchset fixes IPoIB pkey creation and removal using rtnetlink by
adding support in both IPoIB ULP layer and mlx5 layer:
From Jason and Denis:
1) Introduces changes in the RDMA netdev code in order to
allow allocation of the netdev to be done by the rtnl netdev code.
2) Reworks IPoIB initialization to use the two step rdma_netdev
creation.
From Feras and Saeed, mlx5e netdev layer refactoring to allow accepting
pre-allocated netdevs:
3) Adds support to initialize/cleanup netdevs that are not created
by mlx5 driver.
4) Change mlx5e netdevice layer to accept the pre-allocated netdevice
queue number.
5) Initialize mlx5e generic structures in one place to be used for all
netdevs types NIC/representors/IPoIB (both mlx5 allocated and
pre-allocted).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Originally, we have an issue where r8169 MSI-X interrupt is broken after
S3 suspend/resume on RTL8106e of ASUS X441UAR.
02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller [10ec:8136]
(rev 07)
Subsystem: ASUSTeK Computer Inc. RTL810xE PCI Express Fast
Ethernet controller [1043:200f]
Flags: bus master, fast devsel, latency 0, IRQ 16
I/O ports at e000 [size=256]
Memory at ef100000 (64-bit, non-prefetchable) [size=4K]
Memory at e0000000 (64-bit, prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [70] Express Endpoint, MSI 01
Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
Capabilities: [d0] Vital Product Data
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Virtual Channel
Capabilities: [160] Device Serial Number 01-00-00-00-36-4c-e0-00
Capabilities: [170] Latency Tolerance Reporting
Kernel driver in use: r8169
Kernel modules: r8169
We found that all of the values in PCI BAR=4 of the ethernet adapter
become 0xFF after the system resumes. That breaks the MSI-X interrupt.
Therefore, at the time we could only fall back to the MSI interrupt to
work around the issue.
However, there is a commit which resolves drivers getting nothing from
PCI BAR=4 after the system resumes: 04cb3ae895d7 "PCI: Reprogram
bridge prefetch registers on resume" by Daniel Drake.
After applying that patch, the ethernet adapter works fine before
suspend and after resume. So, we can revert the workaround now that the
commit "PCI: Reprogram bridge prefetch registers on resume" has been
merged into the main tree.
This patch reverts commit 7bb05b85bc
"r8169: don't use MSI-X on RTL8106e".
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=201181
Fixes: 7bb05b85bc ("r8169: don't use MSI-X on RTL8106e")
Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts were mostly easy to resolve using immediate context,
except the cls_u32.c one where I simply took the entire HEAD
chunk.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch resumes promisc mode and vlan filter status after
loopback test.
Fixes: 3b75c3df59 ("net: hns3: net: hns3: Add support for IFF_ALLMULTI flag")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch resumes promisc mode and vlan filter status after reset.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the driver does nothing when the mac vlan table is full.
In this case, packets with new mac addresses will be dropped by the
hardware. This patch adds a check for the result of syncing the mac
address, and enables promisc mode when the mac vlan table is full.
Additionally, disable the vlan filter when promisc mode is enabled by
user command.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the repeated license text with SPDX identifiers.
While at it bump the Copyright dates for files we touched
this year.
Signed-off-by: Edwin Peer <edwin.peer@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Nic Viljoen <nick.viljoen@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It has been reported that since
commit 05212ba813 ("r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices")
at least RTL_GIGA_MAC_VER_38 NICs work erratically after a resume from
suspend.
The problem has been traced to a missing RX_MULTI_EN bit in the RxConfig
register.
We already set this bit for RTL_GIGA_MAC_VER_35 NICs of the same 8168F
chip family so let's do it also for its other siblings: RTL_GIGA_MAC_VER_36
and RTL_GIGA_MAC_VER_38.
Curiously, the NIC seems to work fine after a system boot without having
this bit set as long as the system isn't suspended and resumed.
Fixes: 05212ba813 ("r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices")
Reported-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 63ae7949e9 ("net: socionext: Use descriptor info instead of MMIO reads on Rx")
removed constant mmio reads from the driver and started using a descriptor
field to check if a packet should be processed.
This led to the napi rx handler being constantly called while no packets
needed processing and to ksoftirqd getting 100% cpu usage. Issue one mmio
read to clear the irq correctly after processing packets.
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
During boot, mlx4_core sets the driverinit configuration parameters and
updates the devlink module with the initial values by calling
devlink_param_driverinit_value_set().
If devlink_param_driverinit_value_set() returns an error, mlx4_core
reports a kernel module warning.
This caused a false alarm during boot when the kernel was compiled with
CONFIG_NET_DEVLINK off.
Fix by removing the warning reported when
devlink_param_driverinit_value_set() fails.
This makes the function mlx4_devlink_set_init_value() a redundant
wrapper around devlink_param_driverinit_value_set(), so it is removed.
It fixes the following kernel trace:
mlx4_core 0000:00:06.0: devlink set parameter 0 value failed (err = -95)
mlx4_core 0000:00:06.0: devlink set parameter 1 value failed (err = -95)
mlx4_core 0000:00:06.0: devlink set parameter 4 value failed (err = -95)
mlx4_core 0000:00:06.0: devlink set parameter 5 value failed (err = -95)
mlx4_core 0000:00:06.0: devlink set parameter 3 value failed (err = -95)
Fixes: bd1b51dc66 ("mlx4: Add mlx4 initial parameters table and register it")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With CONFIG_THERMAL=m, we get a build error:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.c: In function 'cxgb4_thermal_get_trip_type':
drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.c:48:11: error: 'struct adapter' has no member named 'ch_thermal'
Once that is fixed by using IS_ENABLED() checks, we get a link error
against the thermal subsystem when cxgb4 is built-in:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.o: In function `cxgb4_thermal_init':
cxgb4_thermal.c:(.text+0x180): undefined reference to `thermal_zone_device_register'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.o: In function `cxgb4_thermal_remove':
cxgb4_thermal.c:(.text+0x1e0): undefined reference to `thermal_zone_device_unregister'
Finally, since CONFIG_THERMAL can be =m, the Makefile fails to pick up the
extra file into built-in.a, and we get another link failure against the
cxgb4_thermal_init/cxgb4_thermal_remove functions, so the Makefile has to
be adapted as well to work for both CONFIG_THERMAL=y and =m.
Fixes: b187191577 ("cxgb4: Add thermal zone support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove redundant spinlock acquire parameter from ena_com_admin_init()
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Improves socket memory utilization when receiving packets larger
than 128 bytes (the previous rx copybreak) and smaller than 256 bytes.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently Rx refill is done when the number of required descriptors is
above 1/8 queue size. With a default of 1024 entries per queue the
threshold is 128 descriptors.
There is an intention to increase the queue size to 8196 entries.
In that case a threshold of 1024 descriptors is too large and can hurt
latency.
Add another limitation to Rx threshold to be at most 256 descriptors.
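The resulting threshold is simply (sketch, names assumed):

    /* refill once missing descriptors exceed min(ring_size / 8, 256) */
    refill_threshold = min_t(u32, rx_ring->ring_size / 8, 256);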
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set skb->ip_summed to the correct value as reported by the device.
Add a counter for the case where rx csum offload is enabled but the
device didn't check it.
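Conceptually (a simplified sketch; the flag and counter names are
assumptions):

    if (likely(rx_csum_verified)) {
            skb->ip_summed = CHECKSUM_UNNECESSARY;
    } else {
            /* offload is enabled but the device did not check this packet */
            rx_ring->rx_stats.csum_unchecked++;
            skb->ip_summed = CHECKSUM_NONE;
    }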
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch includes all code changes necessary in ena_netdev to enable
packet sending via the LLQ placement mode.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces APIs for detection, initialization, configuration
and actual usage of low latency queues (LLQ). It extends the transmit
API with creation of LLQ descriptors in device memory (which include
host buffer descriptors as well as the packet header).
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Low Latency Queues (LLQ) allow usage of the device's memory for descriptors
and headers. Such queues decrease processing time since data is already
located on the device when driver rings the doorbell.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new fields and definitions to host info and fill them
according to the latest ENA spec version.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reduce fastpath overhead by making ena_com_tx_comp_req_id_get() inline.
Also move it to ena_eth_com.h file with its dependency function
ena_com_cq_inc_head().
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The DECAP_ECN0 trap will be used to trap packets where the overlay
packet is marked with Non-ECT, but the underlay packet is marked with
either ECT(0), ECT(1) or CE. When trapped, such packets will be counted
as errors by the VxLAN driver and thus provide better visibility.
The NVE_ENCAP_ARP trap will be used to trap ARP packets undergoing NVE
encapsulation. This is needed in order to support E-VPN ARP suppression,
where the Linux bridge does not flood ARP packets through tunnel ports
in case it can answer the ARP request itself.
Note that all the packets trapped via these traps are marked with
'offload_fwd_mark', so as to not be re-flooded by the Linux bridge
through the ASIC ports.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the following resources to be used by the NVE code:
* Number of IPv4 underlay destination IPs in a single TNUMT record
* Number of IPv6 underlay destination IPs in a single TNUMT record
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This register is used for setting up the parsing for hash, policy-engine
and routing.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Will be used to program the device with FDB records pointing to a NVE
tunnel.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TNQDR register configures the default QoS settings for NVE
encapsulation.
It will be used to set the default DSCP of each port to 0, so that when
DSCP is set to inherit and the overlay packet does not have an IP header
the outer DSCP will be set to 0, in accordance with the software data
path.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The register configures how QoS is set during encapsulation into the
underlay network.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This register configures the actions that are done during NVE
decapsulation based on the ECN bits.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This register performs mapping from overlay ECN to underlay ECN during
NVE encapsulation.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This register builds the linked list of underlay destination IPs used
for BUM traffic on the overlay.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This register enables / disables learning on different types of tunnel
ports (e.g., NVE, VPLS).
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This register configures global NVE configuration such as source IP of
the NVE tunnel and UDP source port calculation.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the seed of the LAG hash function is always set to 0, which
means it is identical across all switches. Instead, use a random number.
This is especially important now that VxLAN is supported, as the LAG
hash function is used to calculate the UDP source port of the
encapsulated packet.
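For illustration, the change amounts to seeding the hash with a random
value instead of a constant (a sketch; the register pack helper and its
arguments are assumptions):

    u32 seed = get_random_u32();    /* previously hard-coded to 0 */

    mlxsw_reg_slcr_pack(slcr_pl, lag_hash_fields, seed);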
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device has the ability to flush all the FDB records that perform NVE
encapsulation or only a subset of these with a specific filtering
identifier (FID).
Expose these types so that they could be used by subsequent patches
where we need to flush the FDB records when an NVE device is unlinked
from a bridge (FID).
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the device needs to flood an overlay packet to remote VTEPs it
retrieves a pointer to the head of a linked-list of records that store
the IP addresses of these VTEPs.
These records are stored in the KVD linear memory and configured via the
Tunneling NVE Underlay Multicast Table (TNUMT) register.
Add a new KVD linear entry type for these records, so that we will be
able to allocate and free them.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The L3 protocol and address definitions are going to be used by the NVE
code, so move them to the global header file from the one private to the
router.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VxLAN notifications are going to use a different notifier information
type, so cast to the correct type based on the received event.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VxLAN FDB updates are sent with the VxLAN device which is not our upper
and will therefore be ignored by current code.
Solve this by checking whether the upper device (bridge) is our upper.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VxLAN FDB notifications need to be handled differently than bridge FDB
notifications, so initialize the work item based on the received
notification and rename the invoked function accordingly.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The spectrum_router.h header file is private to the router block and
should only be included by direct consumers of it, such as dpipe and the
multicast routing code.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/marvell/octeontx2/af/cgx.c: In function 'cgx_fwi_event_handler':
drivers/net/ethernet/marvell/octeontx2/af/cgx.c:257:17: warning:
variable 'dev' set but not used [-Wunused-but-set-variable]
It has never been used since its introduction in
commit 1463f382f5 ("octeontx2-af: Add support for CGX link management")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for the NETIF_F_RXFCS feature in the Mscc
Ethernet driver. This feature is disabled by default and allows a user
to request the driver not to drop the FCS and to extract it into the skb
for debugging purposes.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers should call skb_set_hash to set the hash and its type
in an skbuff.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds RSS tuple support for VF in revision
0x21.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds RSS key, hash algorithm configuration support
for VF in revision 0x21.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds ETH_RSS_HASH_XOR hash algorithm support, which
is available on hw revision 0x21.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Read the host context count symbols provided by firmware and use
them to determine the number of allocated stats ids. Previously it
wasn't possible to offload more than 2^17 filters even if the FW was
able to do so.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of an array of stats instead of storing stats per flow, which
would require a hash lookup at critical times.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of relativistic hash tables for tracking flows instead
of fixed sized hash tables.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are several variables being initialized that are set later
and hence the initialization is redundant and can be removed. Remove
them.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx5e netdevice used to calculate fragment edges by a call to
mlx5_wq_cyc_get_frag_size(). This calculation did not give the correct
indication for queues smaller than a PAGE_SIZE (broken by default on
PowerPC, where PAGE_SIZE == 64KB). Here it is replaced by the correct new
calls/API.
Since (TX/RX) Work Queues buffers are fragmented, here we introduce
changes to the API in core driver, so that it gets a stride index and
returns the index of last stride on same fragment, and an additional
wrapping function that returns the number of physically contiguous
strides that can be written contiguously to the work queue.
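A hedged sketch of the kind of calculation the new API performs,
assuming a power-of-two number of strides per fragment (names are
illustrative, not the mlx5 core API):

    /* index of the last stride living on the same fragment as ix */
    static u16 wq_last_contig_stride(u16 ix, u16 strides_per_frag)
    {
            return ix | (strides_per_frag - 1);
    }

    /* number of strides writable contiguously starting at ix */
    static u16 wq_contig_strides(u16 ix, u16 strides_per_frag)
    {
            return wq_last_contig_stride(ix, strides_per_frag) - ix + 1;
    }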
This obsoletes the following API functions, and their buggy
usage in EN driver:
* mlx5_wq_cyc_get_frag_size()
* mlx5_wq_cyc_ctr2fragix()
The new API improves modularity and hides the details of such
calculation for mlx5e netdevice and mlx5_ib rdma drivers.
New calculation is also more efficient, and improves performance
as follows:
Packet rate test: pktgen, UDP / IPv4, 64byte, single ring, 8K ring size.
Before: 16,477,619 pps
After: 17,085,793 pps
3.7% improvement
Fixes: 3a2f703312 ("net/mlx5: Use order-0 allocations for all WQ types")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The HW spec defines only bits 24-26 of pftype_wq as the page fault type,
use the required mask to ensure that.
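In code terms, the essence of the fix is (sketch; variable names other
than wqe.pftype_wq are assumed):

    /* the page fault type occupies only bits 24-26 of pftype_wq */
    pfault->type = (be32_to_cpu(pf_eqe->wqe.pftype_wq) >> 24) & 0x7;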
Fixes: d9aaed8387 ("{net,IB}/mlx5: Refactor page fault handling")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Memory allocated for the context should be freed once we are
finished working with it.
Fixes: d6c4f0298c ("net/mlx5: Refactor accel IPSec code")
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The current design of mlx5e driver ignores the netdevice TX/RX queues
number for netdevices that RDMA IPoIB ULP creates. Instead, the queue
number is initialized to the maximum number that mlx5 thinks best for
performance. As a result, ULP drivers that choose to create a netdevice
with a queue number less than the maximum number of channels mlx5
creates will get memory corruption.
This fix changes the mlx5e netdev logic to respect ULP netdevices TX/RX
queue number and use it when creating resources instead of the maximum
channel number.
Fixes: cd565b4b51 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Convert the mlx5e update stats work to a normal work structure, since
it is never used as delayed work.
Add a helper function to queue the update stats work on demand, which
checks some conditions and reduces code duplication for a better
abstraction.
Fixes: ed56c5193a ("net/mlx5e: Update NIC HW stats on demand only")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Move all mlx5e generic structures initializations to mlx5e_netdev_init.
The common structure new initializer function will be used to initialize
mlx5 context for netlink created netdevs such as IPoIB mlx5 accelerated
child netdevs.
Fixes: cd565b4b51 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
mlx5e_detach_netdev cancels update_stats work which was not initialized
in ipoib netdevice profile, as a result, the following assert occurs:
ODEBUG: assert_init not available (active state 0) object type:
timer_list hint:(null)
This change moves the update stats work to be initialized for all
mlx5e netdevices.
Fixes: cd565b4b51 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Introduce a helper init/cleanup function that initializes mlx5e generic
netdev private structure, and use them from all profiles init/cleanup
callbacks.
This patch will also be helpful to initialize/cleanup netdevs that are
not created by mlx5 driver, e.g: accelerated ipoib child netdevs.
Fixes: 26e59d8077 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>