The 4th and 5th parameters of hclge_cmd_query_error() are unused, so this
patch removes them.
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When hclge_tm_schd_info_update calls hclge_tm_schd_info_init to
initialize the schedule info, hdev->tm_info.num_pg and
hdev->tx_sch_mode are not changed, which makes the checking in
hclge_tm_schd_info_init unnecessary.
So this patch moves the hdev->tm_info.num_pg and hdev->tx_sch_mode
checking into hclge_tm_schd_init and changes the return type of
hclge_tm_schd_info_init from int to void.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
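A rough sketch of the resulting split described above; the check shown is illustrative only (the real driver validates both num_pg and tx_sch_mode here):

/* Sketch only: the one-time checks move into hclge_tm_schd_init(), so
 * hclge_tm_schd_info_init() loses its failure path and returns void.
 */
static int hclge_tm_schd_init(struct hclge_dev *hdev)
{
        /* fields that never change between updates are validated once here */
        if (hdev->tm_info.num_pg != 1)          /* assumed check, illustrative */
                return -EINVAL;

        hclge_tm_schd_info_init(hdev);          /* now returns void */

        return 0;
}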
The unused_count variable indicates how many RX BDs need a new buffer
attached in hns3_clean_rx_ring, and the clean_count variable has a
similar meaning.
This patch removes the clean_count variable and uses unused_count
uniformly to indicate the RX BDs that need a new buffer attached.
This patch also cleans up some coding style issues related to
variable assignment in hns3_clean_rx_ring.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The local variable return_status in hclge_get_mac_val_cmd_status()
is unnecessary. So this patch returns the error code directly, instead of
using this variable. Also, replace some '%d' with '%u' in
hclge_get_mac_val_cmd_status().
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, when changing the ring parameters, we freed the old
ring resources first and then set up the new ring resources. If a
memory allocation failed, there would be no resources left to use.
This patch refines this by setting up the new ring resources before
freeing the old ones.
Also reduce the maximum ring BD number to 32760 according to the UM.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
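A minimal sketch of the reordering described above; the helper names are hypothetical, the point is only that the old rings are released after the new ones exist:

static int hns3_change_ring_param_sketch(struct net_device *ndev,
                                         u32 new_desc_num)
{
        int ret;

        /* build the new ring resources first ... */
        ret = alloc_new_rings(ndev, new_desc_num);      /* hypothetical helper */
        if (ret)
                return ret;     /* the old rings are still intact on failure */

        /* ... and only then release the old ones */
        free_old_rings(ndev);                           /* hypothetical helper */

        return 0;
}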
Some commands are not supported on ports that do not support DCB.
This patch distinguishes these commands and does not query
unsupported commands in debugfs.
This patch also fixes an error in the dump "qos buf cfg"
command in debugfs.
Fixes: 2849d4e7a1 ("net: hns3: Add "tc config" info query function")
Fixes: 7d9d7f8864 ("net: hns3: Add "qos buffer" config info query function")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When NCSI has a HW error, the IMP reports this error to the driver
by sending a mailbox message. After receiving this message, the driver
should assert a global reset to fix this kind of HW error.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, the PF updated the link status once per second. Some
scenarios require the link down event to be reported more quickly.
To solve this, the firmware now pushes the link change event to the PF
with a CMDQ message, and the driver updates the link status directly.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
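For reference, this is the pattern the helper collapses; the probe function around it is just a sketch:

#include <linux/err.h>
#include <linux/platform_device.h>

static int example_probe(struct platform_device *pdev)
{
        void __iomem *base;

        /* before:
         *      struct resource *res;
         *
         *      res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
         *      base = devm_ioremap_resource(&pdev->dev, res);
         */

        /* after: one call looks up the MEM resource and maps it */
        base = devm_platform_ioremap_resource(pdev, 0);

        return PTR_ERR_OR_ZERO(base);
}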
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'mlx5-updates-2019-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2019-07-29
This series includes updates to the mlx5 driver:
1) Simplifications, cleanups and warning print improvements
2) From Vlad Buslov:
Refactor mlx5 tc flow handling for unlocked execution (Part 1)
Currently, all cls API hardware offload driver callbacks require the
caller to hold the rtnl lock when calling them. Cls API has already been
updated to update software filters in parallel (on classifiers that
support unlocked execution), however hardware offload code still obtains
the rtnl lock before calling driver tc callbacks. This set implements
partial support for unlocked execution, which is leveraged by follow-up
refactorings in specific mlx5 tc subsystems and by a patch to cls API
that allows drivers to register their callbacks as rtnl-unlocked.
In mlx5 tc code, mlx5e_tc_flow is the main structure used to represent
a tc filter. Currently, the structure itself and its handlers in both
the tc and eswitch layers do not implement any kind of synchronization
and instead assume external global synchronization provided by the rtnl
lock. Implement the following changes to remove the dependency on the
rtnl lock in the flow handling code, intended as groundwork for
follow-up changes that provide a fully rtnl-independent mlx5 tc:
- Extend struct mlx5e_tc_flow with atomic reference counter and rcu to
allow concurrent access from multiple tc and neigh update workqueue
instances without introducing any additional locks specific to the
structure. Its 'flags' field type is changed to atomic bitmask ops which
is necessary for tc to interact with other concurrent tc instances or
concurrent neigh update that need to skip flows that are not fully
initialized (new INIT_DONE flow flag) and can change the flags
according to neighbor state (flipping OFFLOADED flag).
- Protect unready flows list by new uplink_priv->unready_flows_lock
mutex.
- Convert calls to netdev APIs that require rtnl lock in flow handling
code to their rcu counterparts.
- Modify eswitch code that is called from tc layer and assume implicit
external synchronization to be concurrency safe: change
esw->offloads.num_flows type to atomic integer and re-arrange
esw->state_lock usage to protect additional data.
Some of the approaches to synchronization presented in this patch set
are quite complicated (lockless concurrent usage of data structures with
rcu and reference counting, fine-grained locking where necessary, retry
mechanisms to handle concurrent insertion of another instance of a data
structure with the same key, etc.). This is necessary to allow calling
the firmware in parallel in most cases, which is the main motivation of
this change, since firmware calls are a much heavier operation than
atomic operations, a multitude of locks and potentially multiple retries
during concurrent accesses to the same elements.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'mac80211-next-for-davem-2019-07-31' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
We have a reasonably large number of changes:
* lots more HE (802.11ax) support, particularly things
relevant for the AP side, but also mesh support
* debugfs cleanups from Greg
* some more work on extended key ID
* start using genl parallel_ops, as preparation for
weaning ourselves off RTNL and getting parallelism
* various other changes all over
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
A few lines above, we have:
tx_size = BIT(tx->order);
So use 'tx_size' directly to be consistent with the way 'rx->descs_cpu' and
'rx->descs_dma' are computed below.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add Microchip KSZ8795 DSA driver.
Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't need dev_err() messages when platform_get_irq() fails now that
platform_get_irq() prints an error message itself when something goes
wrong. Let's remove these prints with a simple semantic patch.
// <smpl>
@@
expression ret;
struct platform_device *E;
@@
ret =
(
platform_get_irq(E, ...)
|
platform_get_irq_byname(E, ...)
);
if ( \( ret < 0 \| ret <= 0 \) )
{
(
-if (ret != -EPROBE_DEFER)
-{ ...
-dev_err(...);
-... }
|
...
-dev_err(...);
)
...
}
// </smpl>
While we're here, remove braces on if statements that only have one
statement (manually).
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
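In plain C, the transformation performed by the semantic patch above looks roughly like this (the probe function is a sketch):

#include <linux/platform_device.h>

static int example_probe(struct platform_device *pdev)
{
        int irq = platform_get_irq(pdev, 0);

        /* removed by the cleanup:
         *      if (irq < 0) {
         *              if (irq != -EPROBE_DEFER)
         *                      dev_err(&pdev->dev, "failed to get IRQ\n");
         *              return irq;
         *      }
         */

        /* kept: platform_get_irq() already logs the failure itself */
        if (irq < 0)
                return irq;

        return 0;
}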
Use accessor functions for skb fragment's page_offset instead
of direct references, in preparation for bvec conversion.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
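The accessor in question is skb_frag_off(); a minimal before/after sketch:

#include <linux/skbuff.h>

static unsigned int frag_offset_sketch(const skb_frag_t *frag)
{
        /* before: direct field access
         *      return frag->page_offset;
         */

        /* after: the accessor hides the underlying representation, which
         * makes the later bvec conversion transparent to callers
         */
        return skb_frag_off(frag);
}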
TC flow table is created when first flow is added, and destroyed when last
flow is removed. This assumes that all accesses to the table are externally
synchronized with rtnl lock. To remove dependency on rtnl lock, add new
mutex mlx5e_tc_table->t_lock and use it to protect the flow table.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
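A sketch of the locking pattern, assuming the field names from the description; tc_table_create() is a hypothetical helper, not the driver's:

static int tc_table_get_sketch(struct mlx5e_tc_table *tc)
{
        int err = 0;

        /* serialize creation of the shared flow table without rtnl */
        mutex_lock(&tc->t_lock);
        if (!tc->t)
                err = tc_table_create(tc);      /* hypothetical helper */
        mutex_unlock(&tc->t_lock);

        return err;
}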
Function netdev_master_upper_dev_get() generates a warning if the caller
doesn't hold the rtnl lock. Modify the rules update path to use the rcu
version of that function.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
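The rcu counterpart is netdev_master_upper_dev_get_rcu(), which must run under rcu_read_lock(); roughly:

#include <linux/netdevice.h>

static struct net_device *get_master_sketch(struct net_device *dev)
{
        struct net_device *master;

        rcu_read_lock();
        /* rcu variant: no rtnl lock (and no rtnl warning) required */
        master = netdev_master_upper_dev_get_rcu(dev);
        if (master)
                dev_hold(master);       /* caller must dev_put() it later */
        rcu_read_unlock();

        return master;
}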
esw->state_lock is already used to protect vlan vport configuration change.
However, all preparation and correctness checks, and code that sets vport
data are not protected by this lock and assume external synchronization by
rtnl lock. In order to remove dependency on rtnl lock, extend
esw->state_lock protection to whole eswitch vlan add/del functions.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Eswitch implements its own locking by means of the state_lock mutex and
multiple fine-grained locks in its data structures, and is supposed to
not rely on the rtnl lock. However, the eswitch offloads num_flows field
is a regular long long integer and cannot be modified concurrently. This
is an implicit assumption that mlx5 tc is serialized (by the rtnl lock or
any other means). In order to remove the implicit dependency on the rtnl
lock, change the num_flows type to atomic64 to allow concurrent
modifications.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
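The type change boils down to the standard atomic64 API, e.g.:

#include <linux/atomic.h>

/* before: a plain integer, implicitly serialized by rtnl
 * after:  atomic64_t, safe to touch from concurrent add/del paths
 */
static void offload_flow_added(atomic64_t *num_flows)
{
        atomic64_inc(num_flows);
}

static s64 offload_flows_in_use(atomic64_t *num_flows)
{
        return atomic64_read(num_flows);
}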
In order to remove dependency on rtnl lock for protecting unready_flows
list when reoffloading unready flows on workqueue, extend representor
uplink private structure with dedicated 'unready_flows_lock' mutex. Take
the lock in all users of unready_flows list before accessing it. Implement
helper functions to add and delete unready flow.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
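The helpers follow the usual list-under-mutex shape; a simplified sketch (structure layout reduced for illustration):

#include <linux/list.h>
#include <linux/mutex.h>

struct uplink_priv_sketch {
        struct mutex unready_flows_lock;
        struct list_head unready_flows;
};

static void unready_flow_add(struct uplink_priv_sketch *up,
                             struct list_head *flow_node)
{
        mutex_lock(&up->unready_flows_lock);
        list_add_tail(flow_node, &up->unready_flows);
        mutex_unlock(&up->unready_flows_lock);
}

static void unready_flow_del(struct uplink_priv_sketch *up,
                             struct list_head *flow_node)
{
        mutex_lock(&up->unready_flows_lock);
        list_del(flow_node);
        mutex_unlock(&up->unready_flows_lock);
}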
In order to remove dependency on rtnl lock, access to tc flows hashtable
must be explicitly protected from concurrent flows removal.
Extend tc flow structure with rcu to allow concurrent parallel access. Use
rcu read lock to safely lookup flow in tc flows hash table, and take
reference to it. Use rcu free for flow deletion to accommodate concurrent
stats requests.
Add a new DELETED flow flag. Implement a new flow_flag_test_and_set()
helper that sets a flag and returns its previous value. Use it to
atomically set the flag in mlx5e_delete_flower() to guarantee that a flow
can only be deleted once, even when the same flow is deleted concurrently
by multiple tasks.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
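flow_flag_test_and_set() can be built directly on test_and_set_bit(); a sketch of the delete-once guarantee (the bit index is illustrative):

#include <linux/bitops.h>

#define FLOW_FLAG_DELETED_SKETCH 0      /* illustrative bit index */

static bool flow_flag_test_and_set_deleted(unsigned long *flags)
{
        /* returns the previous value: true means deletion already started */
        return test_and_set_bit(FLOW_FLAG_DELETED_SKETCH, flags);
}

/* delete path:
 *      if (flow_flag_test_and_set_deleted(&flow->flags))
 *              return;         -- another task already deleted this flow
 */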
To remove dependency on rtnl lock and allow concurrent modification of
'flags' field of tc flow structure, change flow flag type to unsigned long
and use atomic bit ops for reading and changing the flags. Implement
auxiliary functions for setting, resetting and getting specific flag, and
for checking most often used flag values.
Always set flags with smp_mb__before_atomic() to ensure that all changes
to the mlx5e_tc_flow structure are visible before concurrent readers can
read the new flags value. Rearrange all code paths to actually set the
flow->rule[] pointers before setting the OFFLOADED flag. On the read
side, use smp_mb__after_atomic() when accessing flags to ensure that
offload-related flow fields are only read after the flags.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
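The ordering described above maps onto the standard atomic-bitops barrier pairing; a simplified sketch (struct layout and flag bit are illustrative):

#include <linux/atomic.h>
#include <linux/bitops.h>

struct flow_sketch {
        unsigned long flags;
        void *rule;
};

#define FLOW_FLAG_OFFLOADED_SKETCH 0

static void publish_offloaded(struct flow_sketch *flow, void *rule)
{
        flow->rule = rule;
        smp_mb__before_atomic();        /* rule visible before the flag is set */
        set_bit(FLOW_FLAG_OFFLOADED_SKETCH, &flow->flags);
}

static void *read_offloaded(struct flow_sketch *flow)
{
        if (!test_bit(FLOW_FLAG_OFFLOADED_SKETCH, &flow->flags))
                return NULL;
        smp_mb__after_atomic();         /* flag observed before reading the rule */
        return flow->rule;
}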
With the new classifier type that doesn't require the rtnl lock, the
following invariants hold:
- Filter with specified cookie created only once.
- Filter with specified cookie deleted only once.
- Stats updates can be performed in parallel to each other.
Extend tc flow with rcu and reference counter. To protect from concurrent
delete, get reference to tc flow when:
- Reading flow stats.
- Accessing flow in neigh update handler.
- Accessing flow in neigh update used value handler.
Only free the flow when the reference counter reaches zero. Modify flow
cleanup to account for flows that might not be fully initialized, by
checking whether the flow is actually in the list of the corresponding
mod_hdr, hairpin and encap entries. Don't clean up the flow directly in
case of error, to allow concurrent neigh update (neigh update will be
modified to always take a reference to the flow when using it).
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
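The lifetime rules above reduce to refcount_t plus an rcu-deferred free; simplified sketch:

#include <linux/rcupdate.h>
#include <linux/refcount.h>
#include <linux/slab.h>

struct flow_ref_sketch {
        struct rcu_head rcu;
        refcount_t refcnt;
};

static struct flow_ref_sketch *flow_get(struct flow_ref_sketch *flow)
{
        /* fails once the last reference is gone, e.g. during delete */
        return refcount_inc_not_zero(&flow->refcnt) ? flow : NULL;
}

static void flow_put(struct flow_ref_sketch *flow)
{
        if (refcount_dec_and_test(&flow->refcnt))
                kfree_rcu(flow, rcu);   /* wait out concurrent rcu readers */
}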
The helper function has "if" branches that do the same thing. Merge them to
simplify the code.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When flow_block_cb_is_busy() is called here, indr_priv is guaranteed to
be a NULL pointer, so there is no need to call flow_block_cb_is_busy().
Fixes: 0d4fd02e71 ("net: flow_offload: add flow_block_cb_is_busy() and use it")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Don't choose who implements the rxnfc "get/set" callbacks according to
CONFIG_MLX5_EN_RXNFC. Instead, have the callbacks always available and
delegate to a function of a different driver module when needed
(en_fs_ethtool.c). Have stubs in en/fs.h to fall back to when
en_fs_ethtool.c is compiled out, to avoid complications and ifdefs in
en_main.c.
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
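The fallback stubs follow the usual static-inline pattern; a sketch (the prototype here is approximated from the description, see en/fs.h for the real one):

#ifdef CONFIG_MLX5_EN_RXNFC
/* real implementation lives in en_fs_ethtool.c */
int mlx5e_ethtool_get_rxnfc(struct net_device *dev,
                            struct ethtool_rxnfc *info, u32 *rule_locs);
#else
static inline int mlx5e_ethtool_get_rxnfc(struct net_device *dev,
                                          struct ethtool_rxnfc *info,
                                          u32 *rule_locs)
{
        return -EOPNOTSUPP;     /* rxnfc support compiled out */
}
#endif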
When disabling CQE compression in favor of time-stamping, don't show a
warning when CQE compression is already disabled.
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When the user enables LRO via ethtool while the RQ mode is legacy,
mlx5e_fix_features drops the request without any explanation.
Add netdev_warn to cover this case.
Fixes: 6c3a823e1e ("net/mlx5e: RX, Remove HW LRO support in legacy RQ")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
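A sketch of the warning being added; the legacy-RQ check is hypothetical, the real logic lives in mlx5e_fix_features:

#include <linux/netdevice.h>

static netdev_features_t fix_features_sketch(struct net_device *netdev,
                                             netdev_features_t features)
{
        if ((features & NETIF_F_LRO) && rq_is_legacy(netdev)) {  /* hypothetical check */
                netdev_warn(netdev, "LRO is not supported with legacy RQ\n");
                features &= ~NETIF_F_LRO;       /* still dropped, but now explained */
        }

        return features;
}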
Define the 57508, 57504, and 57502 chip IDs that are all part of the
BNXT_CHIP_P5 family of chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the new TPA feature in the 57500 chips, we need to discover the
feature first before setting up the netdev features. Refactor the
firmware probe and init logic more cleanly into 2 functions and
make these calls before setting up the netdev features.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support the new expanded TPA v2 counters on 57500 B0 chips for
ethtool -S.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new TPA implementation has additional TPA counters that extend the
per-ring statistics block. Allocate the proper size accordingly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code assumes that the per ring statistics counters are
fixed. In newer chips that support a newer version of TPA, the
TPA counters are also changed. Refactor the code by defining these
counter names in arrays so that it is easy to add a new array for
a new set of counters supported by the newer chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
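The array-per-counter-set idea looks roughly like this (counter names are illustrative, not the driver's exact strings):

#include <linux/ethtool.h>
#include <linux/string.h>

static const char * const ring_tpa_stats_legacy[] = {
        "tpa_packets", "tpa_bytes", "tpa_events", "tpa_aborts",
};

static const char * const ring_tpa_stats_p5[] = {
        "rx_tpa_eligible_pkt", "rx_tpa_pkt", "rx_tpa_bytes", "rx_tpa_errors",
};

static void fill_stat_strings(char *buf, const char * const *names, int count)
{
        int i;

        /* the chip-specific array is chosen once, the loop stays generic */
        for (i = 0; i < count; i++, buf += ETH_GSTRING_LEN)
                strscpy(buf, names[i], ETH_GSTRING_LEN);
}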
Add a more optimized hardware GRO function to setup the SKB on 57500
chips. Some workaround code is no longer needed on 57500 chips and
the pseudo checksum is also calculated in hardware, so no need to
do the software pseudo checksum in the driver.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new TPA feature on 57500 supports a larger number of concurrent TPAs
(up to 1024) divided among the functions. We need to add some logic to
map the hardware TPA ID to a software index that keeps track of each TPA
in progress. A 1:1 direct mapping without translation would be too
wasteful as we would have to allocate 1024 TPA structures for each RX
ring on each PCI function.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
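One way to picture the mapping is a small per-ring lookup table keyed by the hardware agg ID; this is purely illustrative, not the driver's actual data structure:

#include <linux/bitops.h>
#include <linux/types.h>

#define TPA_SLOTS_SKETCH 64     /* per-ring software slots, illustrative */

struct tpa_map_sketch {
        u16 hw_id[TPA_SLOTS_SKETCH];            /* hardware agg ID per slot */
        DECLARE_BITMAP(used, TPA_SLOTS_SKETCH);
};

static int tpa_slot_lookup_or_alloc(struct tpa_map_sketch *map, u16 hw_id)
{
        unsigned int i;

        /* is a TPA with this hardware ID already in progress? */
        for_each_set_bit(i, map->used, TPA_SLOTS_SKETCH)
                if (map->hw_id[i] == hw_id)
                        return i;

        /* otherwise allocate a free software slot for it */
        i = find_first_zero_bit(map->used, TPA_SLOTS_SKETCH);
        if (i >= TPA_SLOTS_SKETCH)
                return -1;

        __set_bit(i, map->used);
        map->hw_id[i] = hw_id;
        return i;
}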
With all the previous refactoring, the TPA fast path can now be
modified slightly to support TPA on the new chips. The main
difference is that the agg completions are retrieved differently using
the bnxt_get_tpa_agg_p5() function on the new chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On 57500 chips, hardware GRO mode cannot be determined from the TPA
end, so we need to check bp->flags to determine if we are in hardware
GRO mode or not. Modify bnxt_set_features so that the TPA flags
in bp->flags don't change until the device is closed. This will ensure
that the fast path can safely rely on bp->flags to determine the
TPA mode.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 2 GRO functions to set up the hardware GRO SKB fields for 2
different hardware chips have practically identical logic for
tunneled packets. Refactor the logic into a separate bnxt_gro_tunnel()
function that can be used by both functions.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On the new 57500 chips, these new RX_AGG completions are not coalesced
at the TPA_END completion. Handle these by storing them in the
array in the bnxt_tpa_info struct, as they are seen when processing
the CMPL ring.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an aggregation array to bnxt_tpa_info struct to keep track of the
aggregation completions. The aggregation completions are not
completed at the TPA_END completion on 57500 chips so we need to
keep track of them. The array is only allocated on the new chips
when required. An agg_count field is also added to keep track of the
number of these completions.
The maximum concurrent TPA is now discovered from firmware instead of
the hardcoded 64. Add a new bp->max_tpa to keep track of maximum
configured TPA.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
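A simplified sketch of the bookkeeping added here (types and layout reduced for illustration):

#include <linux/types.h>

struct agg_cmp_sketch {
        __le32 data[4];                 /* stand-in for an RX_AGG completion */
};

struct tpa_info_sketch {
        struct agg_cmp_sketch *agg_arr; /* completions seen before TPA_END */
        u16 agg_count;                  /* how many are buffered so far */
        u16 agg_arr_size;               /* allocated entries */
};

static void store_agg_completion(struct tpa_info_sketch *tpa,
                                 const struct agg_cmp_sketch *agg)
{
        if (tpa->agg_count < tpa->agg_arr_size)
                tpa->agg_arr[tpa->agg_count++] = *agg;
}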