Commit Graph

814510 Commits

Author SHA1 Message Date
Victor Raj
f70b9d5f44 ice: check for a leaf node presence
Check for a leaf node presence for a given VSI. This check is required
before removing a VSI since VSIs can't be removed with enabled queues
(with leaf nodes) from the FW scheduler tree unless its a reset.

Signed-off-by: Victor Raj <victor.raj@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Victor Raj
6e9650d533 ice: flush Tx pipe on disable queue timeout
Set the flush Tx pipe flag instead of getting an EAGAIN error when FW
times out in processing the disable Tx queue command.

Signed-off-by: Victor Raj <victor.raj@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Mitch Williams
82ba01282c ice: clear VF ARQLEN register on reset
On older devices like X710 and X722, the VF's ARQLEN register is cleared
on reset, so the VF driver uses that register to detect an unannounced
reset. Unfortunately, on devices controlled by ice, this register is NOT
cleared on reset. This causes the VF to miss resets, and even on
properly-announced resets, the VF driver complains that it didn't see
the reset.

To fix this, we'll do it in software. When we handle a VF reset (whether
triggered by software or VFLR), clear this register after the HW reset
is complete.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Mitch Williams
4cf7bc0d27 ice: don't spam VFs with link messages
Don't send a link message to the VFs unless link actually changes state.
This avoids a small timing hole in some VF drivers that can cause an
apparent TX hang if they receive a link status message at the wrong time.

Although we have fixed the timing hole in the current VF driver, there
are still lots of drivers in the field that have this timing hole. Let's
not fall into it if we can avoid it.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Brett Creeley
b751930c6c ice: only use the VF for ICE_VSI_VF in ice_vsi_release
In ice_vsi_release we are always assigning a value to the local VF
variable. Change this to only be assigned if the VSI is a VF VSI.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Bruce Allan
32a64994db ice: fix numeric overflow warning
When compiling and analyzing the driver on newer kernels, a static
analyzer warns about the following "numeric overflow" issues:

  "The result of expression: 'budget-1' generates 4-byte type while casting
   to a bigger size of 8-byte".

  "The result of expression: '*words-words_read' generates 4-byte type
   while casting to a bigger size of 8-byte".

Fix them both.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Brett Creeley
0e04e8e14b ice: fix issue where host reboots on unload when iommu=on
Currently if the kernel has the intel_iommu=on parameter set, on some
platforms removing the driver causes a system reboot. In initialization
we associate the control queue interrupts with the pf->hw_oicr_idx and
enable the interrupts by setting the CAUSE_ENA bit. The problem comes
on teardown because we are not clearing the CAUSE_ENA bit for the
control queues, but the vector at pf->hw_oicr_idx (miscellaneous
interrupt vector) gets disabled.

Fix this by clearing the CAUSE_ENA bit in the appropriate control queue
registers on when freeing the miscellaneous interrupt vector. Also,
move the call to ice_free_irq_msix_misc() to after ice_deinit_sw() in
ice_remove() because ice_deinit_sw() makes an AQ call, but
ice_free_irq_msix_misc() disables the miscellaneous vector and it's
associated interrupts.

Also, create two small helper functions to enable and disable the
control queue interrupts respectively.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Jacob Keller
f9264dd687 ice: fix ice_remove_rule_internal vsi_list handling
When adding multiple VLANs to the same VSI, the ice_add_vlan code will
share the VSI list, so as not to create multiple unnecessary VSI lists.

Consider the following flow

  ice_add_vlan(hw, <VSI 0 VID 7, VSI 0 VID 8, VSI 0 VID 9>)

Where we add three VLAN filters for VIDs 7, 8, and 9, all for VSI 0.

The ice_add_vlan will create a single vsi_list and share it among all
the filters.

Later, if we try to remove a VLAN,

  ice_remove_vlan(hw, <VSI 0 VID 7>)

Then the removal code will update the vsi_list and remove VSI 0 from it.
But, since the vsi_list is shared, this breaks the list for the other
users who reference it. We actually even free the VSI list memory, and
may result in segmentation faults.

This is due to the way that VLAN rule share VSI lists with reference
counts, and is caused because we call ice_rem_update_vsi_list even when
the ref_cnt is greater than one.

To fix this, handle the case where ref_cnt is greater than one
separately. In this case, we need to remove the associated rule without
modifying the vsi_list, since it is currently being referenced by
another rule. Instead, we just need to decrement the VSI list ref_cnt.

The case for handling sharing of VSI lists with multiple VSIs is not
currently supported by this code. No such rules will be created today,
and this code will require changes if/when such code is added.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Bruce Allan
198a666a45 ice: fix stack hogs from struct ice_vsi_ctx structures
struct ice_vsi_ctx has gotten large enough that function local declarations
of it on the stack are causing stack hogs.  Fix that by allocating the
structs on heap.  Cleanup some formatting issues in the code around these
changes and fix incorrect data type uses of returned functions in a couple
places.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Bruce Allan
c6dfd690f1 ice: sizeof(<type>) should be avoided
With sizeof(), it is preferable to use the variable of type <type> instead
of sizeof(<type>).

There are multiple places where a temporary variable is used to hold a
'size' value which is then used for a subsequent alloc/memset. Get rid
of the temporary variable by calculating size as part of the alloc/memset
statement.

Also remove unnecessary type-cast.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Victor Raj
0e8fd74df2 ice: Fix added in VSI supported nodes calc
VSI supported nodes are calculated in order to add the VSI parent or
intermediate nodes to the scheduler tree. If one of the node in below
layers (from VSI layer) has space to add the new VSI or intermediate node
above that layer then it's not required to continue the calculation further
for below layers.

Signed-off-by: Victor Raj <victor.raj@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Maciej Fijalkowski
5ed5d316d9 ice: Fix the calculation of ICE_MAX_MTU
Currently ICE_MAX_MTU subtracts only ETH_HLEN from max frame size and
adds ETH_FCS_LEN and VLAN_HLEN, which is not what was intended.
The ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN expression should be surrounded
with parentheses.

Wrap mentioned expression and take into account VLAN double tagging.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Bruce Allan
99be37edeb ice: Mark extack argument as __always_unused
Commit 87b0984ebf ("net: Add extack argument to ndo_fdb_add()") in
net-next added an extended parameter to the .ndo_fdb_add op and changed
ice_fdb_add() accordingly. Update the function header and add the
__always_unused attribute to the new parameter to avoid -Wunused-parameter
warnings.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Peter Oskolkov
bd16693f35 net: fix double-free in bpf_lwt_xmit_reroute
dst_output() frees skb when it fails (see, for example,
ip_finish_output2), so it must not be freed in this case.

Fixes: 3bd0b15281 ("bpf: add handling of BPF_LWT_REROUTE to lwt_bpf.c")
Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 22:24:50 -08:00
wenxu
186d93669f ip_tunnel: Add ip tunnel tun_info type dst_cache in ip_tunnel_xmit
ip l add dev tun type gretap key 1000

Non-tunnel-dst ip tunnel device can send packet through lwtunnel
This patch provide the tun_inf dst cache support for this mode.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 22:22:33 -08:00
David S. Miller
169431ed16 Merge branch 'dsa-mv88e6xxx-lockdep'
Andrew Lunn says:

====================
mv88e6xxx: Avoid false positive Lockdep splats

When acquiring the GPIO interrupt line for the switch, it is possible
to trigger lockdep splats. These are false positives, the mutex is in
a different IRQ descriptor. But fix it anyway, since it could mask
real locking issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 22:21:23 -08:00
Andrew Lunn
342a0ee70a net: dsa: mv88e6xxx: Release lock while requesting IRQ
There is no need to hold the register lock while requesting the GPIO
interrupt. By not holding it we can also avoid a false positive
lockdep splat.

Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 22:21:23 -08:00
Andrew Lunn
f6d9758b12 net: dsa: mv88e6xxx: Add lockdep classes to fix false positive splat
The following false positive lockdep splat has been observed.

======================================================
WARNING: possible circular locking dependency detected
4.20.0+ #302 Not tainted
------------------------------------------------------
systemd-udevd/160 is trying to acquire lock:
edea6080 (&chip->reg_lock){+.+.}, at: __setup_irq+0x640/0x704

but task is already holding lock:
edff0340 (&desc->request_mutex){+.+.}, at: __setup_irq+0xa0/0x704

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&desc->request_mutex){+.+.}:
       mutex_lock_nested+0x1c/0x24
       __setup_irq+0xa0/0x704
       request_threaded_irq+0xd0/0x150
       mv88e6xxx_probe+0x41c/0x694 [mv88e6xxx]
       mdio_probe+0x2c/0x54
       really_probe+0x200/0x2c4
       driver_probe_device+0x5c/0x174
       __driver_attach+0xd8/0xdc
       bus_for_each_dev+0x58/0x7c
       bus_add_driver+0xe4/0x1f0
       driver_register+0x7c/0x110
       mdio_driver_register+0x24/0x58
       do_one_initcall+0x74/0x2e8
       do_init_module+0x60/0x1d0
       load_module+0x1968/0x1ff4
       sys_finit_module+0x8c/0x98
       ret_fast_syscall+0x0/0x28
       0xbedf2ae8

-> #0 (&chip->reg_lock){+.+.}:
       __mutex_lock+0x50/0x8b8
       mutex_lock_nested+0x1c/0x24
       __setup_irq+0x640/0x704
       request_threaded_irq+0xd0/0x150
       mv88e6xxx_g2_irq_setup+0xcc/0x1b4 [mv88e6xxx]
       mv88e6xxx_probe+0x44c/0x694 [mv88e6xxx]
       mdio_probe+0x2c/0x54
       really_probe+0x200/0x2c4
       driver_probe_device+0x5c/0x174
       __driver_attach+0xd8/0xdc
       bus_for_each_dev+0x58/0x7c
       bus_add_driver+0xe4/0x1f0
       driver_register+0x7c/0x110
       mdio_driver_register+0x24/0x58
       do_one_initcall+0x74/0x2e8
       do_init_module+0x60/0x1d0
       load_module+0x1968/0x1ff4
       sys_finit_module+0x8c/0x98
       ret_fast_syscall+0x0/0x28
       0xbedf2ae8

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&desc->request_mutex);
                               lock(&chip->reg_lock);
                               lock(&desc->request_mutex);
  lock(&chip->reg_lock);

&desc->request_mutex refer to two different mutex. #1 is the GPIO for
the chip interrupt. #2 is the chained interrupt between global 1 and
global 2.

Add lockdep classes to the GPIO interrupt to avoid this.

Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 22:21:23 -08:00
wenxu
3d25eabbbf ip_tunnel: Add dst_cache support in lwtunnel_state of ip tunnel
The lwtunnel_state is not init the dst_cache Which make the
ip_md_tunnel_xmit can't use the dst_cache. It will lookup
route table every packets.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 22:13:49 -08:00
Vakul Garg
2b794c4098 tls: Return type of non-data records retrieved using MSG_PEEK in recvmsg
The patch enables returning 'type' in msghdr for records that are
retrieved with MSG_PEEK in recvmsg. Further it prevents records peeked
from socket from getting clubbed with any other record of different
type when records are subsequently dequeued from strparser.

For each record, we now retain its type in sk_buff's control buffer
cb[]. Inside control buffer, record's full length and offset are already
stored by strparser in 'struct strp_msg'. We store record type after
'struct strp_msg' inside 'struct tls_msg'. For tls1.2, the type is
stored just after record dequeue. For tls1.3, the type is stored after
record has been decrypted.

Inside process_rx_list(), before processing a non-data record, we check
that we must be able to return back the record type to the user
application. If not, the decrypted records in tls context's rx_list is
left there without consuming any data.

Fixes: 692d7b5d1f ("tls: Fix recvmsg() to be able to peek across multiple records")
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 21:58:38 -08:00
David S. Miller
2bdeb8e5bd Merge branch 'ipv4-v6-icmp-small-cleanup-and-update'
Kefeng Wang says:

====================
ipv4/v6: icmp: small cleanup and update

v2:
- Add cover letter and user proper patch subject-prefix suggested-by Eric Dumazet

This patch series contains some small cleanup and update,
1) use icmp/v6_sk_exit when icmp_sk_init fails instead of open-code
2) use new percpu allocation interface for the ipv6.icmp_sk
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 21:57:26 -08:00
Kefeng Wang
75efc250d2 ipv6: icmp: use percpu allocation
Use percpu allocation for the ipv6.icmp_sk.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 21:57:26 -08:00
Kefeng Wang
3232a1ef0f ipv6: icmp: use icmpv6_sk_exit()
Simply use icmpv6_sk_exit() when inet_ctl_sock_create() fail
in icmpv6_sk_init().

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 21:57:26 -08:00
Kefeng Wang
e9128c14bf ipv4: icmp: use icmp_sk_exit()
Simply use icmp_sk_exit() when inet_ctl_sock_create() fail in icmp_sk_init().

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 21:57:26 -08:00
Herbert Xu
4ef595cbb3 ila: Fix uninitialised return value in ila_xlat_nl_cmd_flush
This patch fixes an uninitialised return value error in
ila_xlat_nl_cmd_flush.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 6c4128f658 ("rhashtable: Remove obsolete...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 21:54:44 -08:00
wenxu
41411e2fd6 net/sched: act_tunnel_key: Add dst_cache support
The metadata_dst is not init the dst_cache which make the
ip_md_tunnel_xmit can't use the dst_cache. It will lookup
route table every packets.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 21:51:55 -08:00
David S. Miller
caf337bdef Merge branch 'code-optimizations-and-bugfixes-for-HNS3-driver'
Huazhong Tan says:

====================
code optimizations & bugfixes for HNS3 driver

This patchset includes bugfixes and code optimizations for
the HNS3 ethernet controller driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:51 -08:00
Huazhong Tan
186551284e net: hns3: fix improper error handling for hns3_client_start
If hns3_client_start() failed in the hns3_client_init(),
register_dev() should be undo in its error handling.

Fixes: a6d818e31d ("net: hns3: Add vport alive state checking support")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:50 -08:00
Shiju Jose
eb4c2ccbad net: hns3: fix setting of the hns reset_type for rdma hw errors
Presently the hns reset_type for the roce errors is set
in the hclge_log_and_clear_rocee_ras_error function.
This function is also called to detect and clear roce errors
while enabling the rdma error interrupts. However there is no hns
reset requested for this case. This can cause issue of wrong
reset_type used with subsequent hns reset as the
reset_type set in the above case was not cleared.

This patch moves setting of hns reset_type for the roce errors from
hclge_log_and_clear_rocee_ras_error function
to hclge_handle_rocee_ras_error.

Fixes: 630ba007f4 ("net: hns3: add handling of RDMA RAS errors")
Reported-by: Huazhong Tan <tanhuazhong@huawei.com>
Reported-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:50 -08:00
Jian Shen
a638b1d8cc net: hns3: fix get VF RSS issue
For revision 0x20, VF shares the same RSS config with PF.
In original codes, it always return 0 when query RSS hash
key for VF. This patch fixes it by return the hash key
got from PF.

Fixes: 374ad29176 ("net: hns3: net: hns3: Add RSS general configuration support for VF")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:50 -08:00
Jian Shen
30ebc576d7 net: hns3: enable VF VLAN filter for each VF when initializing
For revision 0x21, the switch of VF VLAN filter is per function.
It's necessary to enable VF VLAN filter for each VF when initializing.
Otherwise, VF will be able to receive broadcast packets with unknown
VLAN when PF enters promisc mode.

Fixes: 64d114f0a7 ("net: hns3: Add egress/ingress vlan filter for revision 0x21")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:50 -08:00
Peng Li
c042594423 net: hns3: add support to config depth for tx|rx ring separately
This patch adds support to config depth for tx|rx ring separately
by ethtool command "-G".

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:50 -08:00
Yunsheng Lin
e8149933b1 net: hns3: remove hnae3_get_bit in data path
The hnae3_get_bit uses hnae3_get_field, and hnae3_get_field
masks the data, which is unnecessary in data path.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:50 -08:00
Yunsheng Lin
cde4ffada8 net: hns3: replace hnae3_set_bit and hnae3_set_field in data path
hnae3_set_bit and hnae3_set_field masks the data before setting
the field or bit, which is unnecessary because the data is already
zero initialized.

Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:50 -08:00
Yunsheng Lin
0cccebac71 net: hns3: add unlikely for error handling in data path
This patch adds unlikely hint for error handling in critical data
path.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:50 -08:00
Yunsheng Lin
d40fa7eeab net: hns3: remove some ops in struct hns3_nic_ops
The fill_desc ops has only one implementation, and
get_rxd_bnum has not been used, so this patch removes
them.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:50 -08:00
Yunsheng Lin
47e7b13b0a net: hns3: limit some variable scope in critical data path
This patch limits some variables' scope as much as possible in
hns3_fill_desc.

Also, only set l3_type and l4_type when necessary.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:50 -08:00
Yunsheng Lin
3fe13ed95d net: hns3: avoid mult + div op in critical data path
This patch uses shift offset to avoid doing mult and div operation.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:50 -08:00
Yunsheng Lin
2a73ac3e6c net: hns3: add xps setting support for hns3 driver
This patch adds xps setting support for hns3 driver based on
the interrupt affinity info.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:27:49 -08:00
David S. Miller
834f9b057d Merge branch 'mlxsw-spectrum_acl-Don-t-take-rtnl-mutex-for-region-rehash'
Ido Schimmel says:

====================
mlxsw: spectrum_acl: Don't take rtnl mutex for region rehash

Jiri says:

During region rehash, a new region is created with a more optimized set
of masks (ERPs). When transitioning to the new region, all the rules
from the old region are copied one-by-one to the new region. This
transition can be time consuming and currently done under RTNL lock.

In order to remove RTNL lock dependency during region rehash, introduce
multiple smaller locks guarding dedicated structures or parts of them.
That is the vast majority of this patchset. Only patch #1 is simple
cleanup and patches 12-15 are improving or introducing new selftests.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:30 -08:00
Jiri Pirko
81d56d8292 selftests: mlxsw: spectrum-2: Add massive delta rehash test
Do insertions and removal of filters during rehash in higher volumes.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00
Jiri Pirko
f6eaf1c3ac selftests: mlxsw: spectrum-2: Check migrate end trace
Add checking of newly added trace.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00
Jiri Pirko
6375da3dc0 mlxsw: spectrum_acl: Add vregion migration end tracepoint
Hit the new tracepoint once the vregion migration ends.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00
Jiri Pirko
d39ca90f59 selftests: mlxsw: spectrum-2: Add IPv6 variant of simple delta rehash test
Track the basic codepaths of delta rehash handling,
using mlxsw tracepoints. Use IPv6 addresses.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00
Jiri Pirko
2bffc5322f mlxsw: spectrum_acl: Don't take mutex in mlxsw_sp_acl_tcam_vregion_rehash_work()
Other mutexes are taking care of proper locking for this, no longer
needed to take RTNL mutex here.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00
Jiri Pirko
c70b13275b mlxsw: spectrum_acl: Remove RTNL lock assertions from ERP code
No longer require RTNL lock in this code. Newly introduced mutexes take
care of guarding objagg and bloom filter. There is no need to guard
gen_pool_alloc()/gen_pool_free() as they are fine to be called lockless.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00
Jiri Pirko
7b0f62eefc mlxsw: spectrum_acl: Don't take rtnl lock during vregion_rehash_intrvl_set()
Relax dependency on rtnl mutex during vregion_rehash_intrvl_set(). The
vregion list is protected with newly introduced mutex.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00
Jiri Pirko
ddaa2875da mlxsw: spectrum_acl: Introduce a mutex to guard objagg instance manipulation
Protect objagg structures by adding a mutex to ERP code and take it
during the structure manipulation.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00
Jiri Pirko
6b86168247 mlxsw: spectrum_acl: Enable vregion rehash per-profile
For MR ACL profile is does not make sense to do periodical rehashes, as
there is only one mask in use during the whole vregion lifetime.
Therefore periodical work is scheduled but the rehash never happens.
So allow to enable/disable rehash for the whole group, which is added
per-profile. Disable rehashing for MR profile.

Addition to the vregion list is done only in case the rehash is enable
on the particular vregion. Also, the addition is moved after delayed
work init to avoid schedule of uninitialized work
from vregion_rehash_intrvl_set(). Symmetrically, deletion from
the list is done before canceling the delayed work so it is
not scheduled by vregion_rehash_intrvl_set() again.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00
Jiri Pirko
65e1903560 mlxsw: spectrum_acl: Introduce mutex to guard Bloom Filter updates
Bloom filter is shared within multiple regions. For updates, it needs to
be guarded by a separate mutex. Do that in order to not rely on RTNL
mutex.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-24 20:25:29 -08:00