Commit Graph

904396 Commits

Author SHA1 Message Date
Parav Pandit
ae24432cbc net/mlx5: Split eswitch mode check to different helper function
In order to check eswitch state under a lock, prepare code to split
capability check and eswitch state check into two helper functions.

Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:19:21 -07:00
Parav Pandit
98fed6eb9b devlink: Rely on driver eswitch thread safety instead of devlink
devlink_nl_cmd_eswitch_set_doit() doesn't hold devlink->lock mutex while
invoking driver callback. This is likely due to eswitch mode setting
involves adding/remove devlink ports, health reporters or
other devlink objects for a devlink device.

So it is driver responsiblity to ensure thread safe eswitch state
transition happening via either sriov legacy enablement or via devlink
eswitch set callback.

Therefore, get() callback should also be invoked without holding
devlink->lock mutex.
Vendor driver can use same internal lock which it uses during eswitch
mode set() callback.
This makes get() and set() implimentation symmetric in devlink core and
in vendor drivers.

Hence, remove holding devlink->lock mutex during eswitch get() callback.

Failing to do so results into below deadlock scenario when mlx5_core
driver is improved to handle eswitch mode set critical section invoked
by devlink and sriov sysfs interface in subsequent patch.

devlink_nl_cmd_eswitch_set_doit()
   mlx5_eswitch_mode_set()
     mutex_lock(esw->mode_lock) <- Lock A
     [...]
     register_devlink_port()
       mutex_lock(&devlink->lock); <- lock B

mutex_lock(&devlink->lock); <- lock B
devlink_nl_cmd_eswitch_get_doit()
   mlx5_eswitch_mode_get()
   mutex_lock(esw->mode_lock) <- Lock A

In subsequent patch, mlx5_core driver uses its internal lock during
get() and set() eswitch callbacks.

Other drivers have been inspected which returns either constant during
get operations or reads the value from already allocated structure.
Hence it is safe to remove the lock in get( ) callback and let vendor
driver handle it.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:19:18 -07:00
Parav Pandit
f999b706b7 net/mlx5: Simplify mlx5_unload_one() and its callers
mlx5_unload_one() always returns 0.
Simplify callers of mlx5_unload_one() and remove the dead code.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:19:15 -07:00
Parav Pandit
ecd01db871 net/mlx5: Simplify mlx5_register_device to return void
mlx5_register_device() doesn't check for any error and always returns 0.
Simplify mlx5_register_device() to return void and its caller, remove
dead code related to it.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:19:13 -07:00
Eli Cohen
dc638d1122 net/mlx5: Avoid group version scan when not necessary
Group version is used when modifying a rule is allowed
(FLOW_ACT_NO_APPEND is clear) to detect a case where the rule was found
but while the groups where unlocked a new FTE was added. In this case,
the added FTE could be one with identical match value so we need to
attempt again with group lock held.

Change the code so version is retrieved only when FLOW_ACT_NO_APPEND is
cleared. As result, later compare can also be avoided if FLOW_ACT_NO_APPEND
is cleared.

Also improve comments text.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:19:11 -07:00
Eli Cohen
0aad2a0b42 net/mlx5: Avoid incrementing FTE version
FTE version is not used anywhere in the code so avoid incrementing it.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:19:09 -07:00
Eli Cohen
454401aeb2 net/mlx5: Fix group version management
When adding a rule to a flow group we need increment the version of the
group. Current code fails to do that and as a result, when trying to add
a rule, we will fail to discover a case where an FTE with the same match
value was added while we scanned the groups of the same match criteria,
thus we may try to add an FTE that was already added.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:19:06 -07:00
Eli Cohen
b820ce00e0 net/mlx5: Simplify matching group searches
Instead of using two different structs for searching groups with the
same match, use a single struct and thus simplify the code, make it more
readable and smaller size which means less code cache misses.

         text    data     bss     dec     hex
before: 35524    2744       0   38268    957c
after:  35038    2744       0   37782    9396

When testing add 70000 rules, delete all the rules, and repeat three
times taking the average, we get (time in seconds):

Before the change: insert 16.80, delete 11.02
After the change:  insert 16.55, delete 10.95

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:19:04 -07:00
Roi Dayan
d528d49705 net/mlx5: E-Switch, Use correct type for chain, prio and level values
The correct type is u32.

Fixes: d18296ffd9 ("net/mlx5: E-Switch, Introduce global tables")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:19:02 -07:00
Roi Dayan
c8508713c7 net/mlx5: E-Switch, free flow_group_in after creating the restore table
We allocate a temporary memory but forget to free it.

Fixes: 11b717d615 ("net/mlx5: E-Switch, Get reg_c0 value on CQE")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:18:59 -07:00
Paul Blakey
7983a675ba net/mlx5: E-Switch, Enable chains only if regs loopback is enabled
Register c0 loopback is needed to fully support chains and prios.

Enable chains and prio only if loopback (of reg c1 which came together
with c0), is enabled. To be able to check that, move enabling of loopback
before eswitch chains init.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:18:57 -07:00
Paul Blakey
60acc105cb net/mlx5: E-Switch, Enable restore table only if reg_c1 is supported
Reg c0/c1 matching, rewrite of regs c0/c1, and copy header of regs c1,B
is needed for the restore table to function, might not be supported by
firmware, and creation of the restore table or the copy header will
fail.

Check reg_c1 loopback support, as firmware which supports this,
should have all of the above.

Fixes: 11b717d615 ("net/mlx5: E-Switch, Get reg_c0 value on CQE")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:18:54 -07:00
wenxu
046826c878 net/mlx5e: remove duplicated check chain_index in mlx5e_rep_setup_ft_cb
The function mlx5e_rep_setup_ft_cb check chain_index is zero twice.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:18:52 -07:00
Dan Carpenter
49397b8012 net/mlx5e: Fix actions_match_supported() return
The actions_match_supported() function returns a bool, true for success
and false for failure.  This error path is returning a negative which
is cast to true but it should return false.

Fixes: 4c3844d9e9 ("net/mlx5e: CT: Introduce connection tracking")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-25 23:18:49 -07:00
David S. Miller
9fb16955fb Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Overlapping header include additions in macsec.c

A bug fix in 'net' overlapping with the removal of 'version'
string in ena_netdev.c

Overlapping test additions in selftests Makefile

Overlapping PCI ID table adjustments in iwlwifi driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 18:58:11 -07:00
Linus Torvalds
1b649e0bca Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix deadlock in bpf_send_signal() from Yonghong Song.

 2) Fix off by one in kTLS offload of mlx5, from Tariq Toukan.

 3) Add missing locking in iwlwifi mvm code, from Avraham Stern.

 4) Fix MSG_WAITALL handling in rxrpc, from David Howells.

 5) Need to hold RTNL mutex in tcindex_partial_destroy_work(), from Cong
    Wang.

 6) Fix producer race condition in AF_PACKET, from Willem de Bruijn.

 7) cls_route removes the wrong filter during change operations, from
    Cong Wang.

 8) Reject unrecognized request flags in ethtool netlink code, from
    Michal Kubecek.

 9) Need to keep MAC in reset until PHY is up in bcmgenet driver, from
    Doug Berger.

10) Don't leak ct zone template in act_ct during replace, from Paul
    Blakey.

11) Fix flushing of offloaded netfilter flowtable flows, also from Paul
    Blakey.

12) Fix throughput drop during tx backpressure in cxgb4, from Rahul
    Lakkireddy.

13) Don't let a non-NULL skb->dev leave the TCP stack, from Eric
    Dumazet.

14) TCP_QUEUE_SEQ socket option has to update tp->copied_seq as well,
    also from Eric Dumazet.

15) Restrict macsec to ethernet devices, from Willem de Bruijn.

16) Fix reference leak in some ethtool *_SET handlers, from Michal
    Kubecek.

17) Fix accidental disabling of MSI for some r8169 chips, from Heiner
    Kallweit.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits)
  net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build
  net: ena: Add PCI shutdown handler to allow safe kexec
  selftests/net/forwarding: define libs as TEST_PROGS_EXTENDED
  selftests/net: add missing tests to Makefile
  r8169: re-enable MSI on RTL8168c
  net: phy: mdio-bcm-unimac: Fix clock handling
  cxgb4/ptp: pass the sign of offset delta in FW CMD
  net: dsa: tag_8021q: replace dsa_8021q_remove_header with __skb_vlan_pop
  net: cbs: Fix software cbs to consider packet sending time
  net/mlx5e: Do not recover from a non-fatal syndrome
  net/mlx5e: Fix ICOSQ recovery flow with Striding RQ
  net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset
  net/mlx5e: Enhance ICOSQ WQE info fields
  net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failure
  selftests: netfilter: add nfqueue test case
  netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress
  netfilter: nft_fwd_netdev: validate family and chain type
  netfilter: nft_set_rbtree: Detect partial overlaps on insertion
  netfilter: nft_set_rbtree: Introduce and use nft_rbtree_interval_start()
  netfilter: nft_set_pipapo: Separate partial and complete overlap cases on insertion
  ...
2020-03-25 13:58:05 -07:00
Linus Torvalds
1dfb642b10 GPIO fixes for the v5.6 series:
- One core quirk by myself to fix the .irq_disable()
   semantics when the gpiolib core takes over this callback.
 
 - The rest is an elaborate series of 4 patches fixing Intel
   laptop ACPI wakeup quirks.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAl57qSIACgkQQRCzN7AZ
 XXMmqxAAiggbsjiMkufxHxDqKl0tLRkbQwlsnz5VQlH0XCPBYSRidbIfQdiDFTbD
 yoBENIYeW/9aVlb0Q7a2T5H4zy1Ye3MiqObZt97ECr7DyxwNrAbPS/IgOFmsjSmu
 C4HZgX6/fQpBlJnKVgMBwzdgKShX4Fb4l1gFnY7uRRjBGVkp5ZArqJALJ747zFR0
 yj/kpA+hGgO24AJspF+QaLFfKkjMXKv2zgaa3HEtsYqUBH/3yBZigLKJs3koXHYB
 3c7EbWxnpYLiPi/T51aN7ZbbNX7Ag6wW6rLm/6460MCYDRSOB+7fCJCZc2KZJomr
 4ntq1HHxSrjOiqce+MSvc+WLI6pqjN73tyWJwhFgBfwr32Ara5FR6CO0KJT9M5d+
 0i8Na0Dyp2LlZM530ygOOn48p+tiU6NVoR/BXUnXBBuYtFsOmfTtrQGh++Uw+rcJ
 wraFAsSzzr7JmXzh+tUEeGGdI4EYy1T6+DppEfSbyty9W6YnWvqHP9xcyuicUyi2
 g9Btl0hRhSqQZozDYjROPQox1qjUKHLnI66n/W9pVUN61GlFlSH/38yRAEpD0DZW
 GlzSEkfL5SN5DQjaAOC2z0z+kfFSGdtCYMOKkyl+cKGbLCRc43PrzHB3oH+wF1Mu
 0J+Hilrk9y4sGTX67jIQPV/t1qhoorRFzihuUkHBdGt//XCqtI4=
 =+dEG
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:

 - One core quirk by myself to fix the .irq_disable() semantics when the
   gpiolib core takes over this callback.

 - The rest is an elaborate series of four patches fixing Intel laptop
   ACPI wakeup quirks.

* tag 'gpio-v5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 CHT + AXP288 model
  gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 BYT + AXP288 model
  gpiolib: acpi: Rework honor_wakeup option into an ignore_wake option
  gpiolib: acpi: Correct comment for HP x2 10 honor_wakeup quirk
  gpiolib: Fix irq_disable() semantics
2020-03-25 13:52:36 -07:00
David S. Miller
2910594fd3 wireless-drivers fixes for v5.6
Fourth, and last, set of fixes for v5.6. Just two important fixes to
 iwlwifi regressions.
 
 iwlwifi
 
 * fix GEO_TX_POWER_LIMIT command on certain devices which caused
   firmware to crash during initialisation
 
 * add back device ids for three devices which were accidentally
   removed
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJee7S6AAoJEG4XJFUm622bcH4IAICV95Y79p9loKZVwXBrguiK
 5D9rBNZcSc0Yo9AwZkhatRAJYiLu19hw2qra3YsQnaWpjESmnZtV6/ZASDcpOGSP
 NaI4TMLt57dhnpqhpglvGeM9PlUbaoqX7Sl5LbhnSQoKkk7rppnk90KqbXl7SDba
 4eG7+i4iMc69uw4BttVh/lfgAH0YyYpgStWi9ccoQ2ip9SrljFZwmCF7q+3/25OR
 Hyty44U90BBrsi+LrRuhutR2LuR+TfXEWmtXWRzOpnwmicY8sCpeEVCMeeVy59TR
 y66sjlhM5w+N1mZbzBrqUZJLdMGInljx4G4DIvvi8jbD6rVLZc8zAeZMLjO24ZM=
 =CFbE
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-2020-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for v5.6

Fourth, and last, set of fixes for v5.6. Just two important fixes to
iwlwifi regressions.

iwlwifi

* fix GEO_TX_POWER_LIMIT command on certain devices which caused
  firmware to crash during initialisation

* add back device ids for three devices which were accidentally
  removed
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 13:12:26 -07:00
Pablo Neira Ayuso
2c64605b59 net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build
net/netfilter/nft_fwd_netdev.c: In function ‘nft_fwd_netdev_eval’:
    net/netfilter/nft_fwd_netdev.c:32:10: error: ‘struct sk_buff’ has no member named ‘tc_redirected’
      pkt->skb->tc_redirected = 1;
              ^~
    net/netfilter/nft_fwd_netdev.c:33:10: error: ‘struct sk_buff’ has no member named ‘tc_from_ingress’
      pkt->skb->tc_from_ingress = 1;
              ^~

To avoid a direct dependency with tc actions from netfilter, wrap the
redirect bits around CONFIG_NET_REDIRECT and move helpers to
include/linux/skbuff.h. Turn on this toggle from the ifb driver, the
only existing client of these bits in the tree.

This patch adds skb_set_redirected() that sets on the redirected bit
on the skbuff, it specifies if the packet was redirect from ingress
and resets the timestamp (timestamp reset was originally missing in the
netfilter bugfix).

Fixes: bcfabee1af ("netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress")
Reported-by: noreply@ellerman.id.au
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:24:33 -07:00
Rahul Kundu
1f074e677a cxgb4: Add support to catch bits set in INT_CAUSE5
This commit adds support to catch any bits set in SGE_INT_CAUSE5 for Parity Errors.
F_ERR_T_RXCRC flag is used to ignore that particular bit as it is not considered as fatal.
So, we clear out the bit before looking for error.
This patch now read and report separately all three registers(Cause1, Cause2, Cause5).
Also, checks for errors if any.

Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Rahul Kundu <rahul.kundu@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:22:33 -07:00
David S. Miller
6e22c60480 Merge branch 'octeontx2-pf-Miscellaneous-fixes'
Sunil Goutham says:

====================
octeontx2-pf: Miscellaneous fixes

This patchset fixes couple of issues related to missing
page refcount updation and taking a mutex lock in atomic
context.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:20:00 -07:00
Sunil Goutham
e99b7c84fd octeontx2-pf: Fix ndo_set_rx_mode
Since set_rx_mode takes a mutex lock for sending mailbox
message to admin function to set the mode, moved logic
to a workqueue.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:20:00 -07:00
Sunil Goutham
e88b288ec2 octeontx2-pf: Fix rx buffer page refcount
Fixed an issue wherein while refilling receive buffers
for the last page allocated, recount is not being updated.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:20:00 -07:00
David S. Miller
1455ea1d8a Merge branch 's390-next'
Julian Wiedmann says:

====================
s390/qeth: updates 2020-03-25

please apply the following patch series for qeth to netdev's net-next
tree.
Same series as yesterday, with one minor update to patch 1 as per
your review.

This adds
1) NAPI poll support for the async-Completion Queue (with one qdio layer
   patch acked by Heiko),
2) ethtool support for per-queue TX IRQ coalescing,
3) various cleanups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:07:16 -07:00
Julian Wiedmann
bb59c8a89a s390/qeth: modernize two list helpers
Replace list_for_each() with list_for_each_entry(), and
list_entry(head.next) with list_first_entry().

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:07:16 -07:00
Julian Wiedmann
c91a1fb7a4 s390/qeth: keep track of fixed prio-queue configuration
When a device is configured in prio-queue mode to pin all traffic onto
a specific HW queue, treat this as a distinct variant of prio-queueing
instead of QETH_NO_PRIO_QUEUEING.

This corrects an error message from qeth_osa_set_output_queues() for
devices configured in such a mode.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:07:15 -07:00
Julian Wiedmann
bdb0cc128b s390/qeth: fine-tune MAC Address-related errnos
Return the correct errnos when .ndo_set_mac_address fails to set a new
MAC address.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:07:15 -07:00
Julian Wiedmann
ee1e52d1e4 s390/qeth: add TX IRQ coalescing support for IQD devices
Since IQD devices complete (most of) their transmissions synchronously,
they don't offer TX completion IRQs and have no HW coalescing controls.
But we can fake the easy parts in SW, and give the user some control wrt
to how often the TX NAPI code should be triggered to process the TX
completions.

Having per-queue controls can in particular help the dedicated mcast
queue, as it likely benefits from different fine-tuning than what the
ucast queues need.

CC: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:07:15 -07:00
Julian Wiedmann
1ab2f8c699 s390/qeth: collect more TX statistics
Count the number of TX doorbells we issue to the qdio layer.

Also count the number of actual frames in a TX buffer, and then
use this data along with the byte count during TX completion.
We'll make additional use of the frame count in a subsequent patch.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:07:15 -07:00
Julian Wiedmann
9de15117f1 s390/qeth: clean up the mac_bits
We're down to a single bit flag for MAC-address related status, reflect
that in the info struct.
Also set up the flag during initialization instead of clearing it during
shutdown - one more little step towards unifying the shutdown code.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:07:15 -07:00
Julian Wiedmann
8ec1e247a2 s390/qeth: simplify L3 dev_id logic
The logic that deals with errors from qeth_l3_get_unique_id() is quite
complex: it sets card->unique_id to 0xfffe, additionally flags it as
UNIQUE_ID_NOT_BY_CARD and later takes this flag as cue to not propagate
card->unique_id to dev->dev_id. With dev->dev_id thus holding 0,
addrconf_ifid_eui48() applies its default behaviour.

Get rid of all the special bit masks, and just return the old uid in
case of an error. For the vast majority of cases this will be 0 (and so
we still get the desired default behaviour) - with the rare exception
where qeth_l3_get_unique_id() might have been called earlier but the
initialization then failed at a later point.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:07:15 -07:00
Julian Wiedmann
0a6e634535 s390/qdio: extend polling support to multiple queues
When the support for polling drivers was initially added, it only
considered Input Queue 0. But as QDIO interrupts are actually for the
full device and not a single queue, this doesn't really fit for
configurations where multiple Input Queues are used.

Rework the qdio code so that interrupts for a polling driver are not
split up into actions for each queue. Instead deliver the interrupt as
a single event, and let the driver decide which queue needs what action.

When re-enabling the QDIO interrupt via qdio_start_irq(), this means
that the qdio code needs to
(1) put _all_ eligible queues back into a state where they raise IRQs,
(2) and afterwards check _all_ eligible queues for new work to bridge
    the race window.

On the qeth side of things (as the only qdio polling driver), we can now
add CQ polling support to the main NAPI poll routine. It doesn't consume
NAPI budget, and to avoid hogging the CPU we yield control after
completing one full queue worth of buffers.
The subsequent qdio_start_irq() will check for any additional work, and
have us re-schedule the NAPI instance accordingly.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:07:15 -07:00
Julian Wiedmann
b439044b70 s390/qeth: remove redundant if-clause in RX poll code
Whenever all completed RX buffers have been processed
(ie. rx->b_count == 0), we call down to the HW layer to scan for
additional buffers. If no further buffers are available, the code
breaks out of the while-loop.

So we never reach the 'process an RX buffer' step with rx->b_count == 0,
eliminate that check and one level of indentation.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:07:15 -07:00
Julian Wiedmann
781b9a1820 s390/qeth: split out RX poll code
The main NAPI poll routine should eventually handle more types of work,
beyond just the RX ring.
Split off the RX poll logic into a separate function, and simplify the
nested while-loop.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:07:15 -07:00
Julian Wiedmann
ed13615dd3 s390/qeth: simplify RX buffer tracking
Since RX buffers may contain multiple packets, qeth's NAPI poll code can
exhaust its budget in the middle of an RX buffer. Thus we keep track of
our current position within the active RX buffer, so we can resume
processing here in the next NAPI poll period.

Clean up that code by tracking the index of the active buffer element,
instead of a pointer to it.
Also simplify the code that advances to the next RX buffer when the
current buffer has been fully processed.

v2: - remove QDIO_ELEMENT_NO() macro (davem)

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:07:15 -07:00
Guilherme G. Piccoli
428c491332 net: ena: Add PCI shutdown handler to allow safe kexec
Currently ENA only provides the PCI remove() handler, used during rmmod
for example. This is not called on shutdown/kexec path; we are potentially
creating a failure scenario on kexec:

(a) Kexec is triggered, no shutdown() / remove() handler is called for ENA;
instead pci_device_shutdown() clears the master bit of the PCI device,
stopping all DMA transactions;

(b) Kexec reboot happens and the device gets enabled again, likely having
its FW with that DMA transaction buffered; then it may trigger the (now
invalid) memory operation in the new kernel, corrupting kernel memory area.

This patch aims to prevent this, by implementing a shutdown() handler
quite similar to the remove() one - the difference being the handling
of the netdev, which is unregistered on remove(), but following the
convention observed in other drivers, it's only detached on shutdown().

This prevents an odd issue in AWS Nitro instances, in which after the 2nd
kexec the next one will fail with an initrd corruption, caused by a wild
DMA write to invalid kernel memory. The lspci output for the adapter
present in my instance is:

00:05.0 Ethernet controller [0200]: Amazon.com, Inc. Elastic Network
Adapter (ENA) [1d0f:ec20]

Suggested-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Acked-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:03:29 -07:00
Hangbin Liu
c085dbfb1c selftests/net/forwarding: define libs as TEST_PROGS_EXTENDED
The lib files should not be defined as TEST_PROGS, or we will run them
in run_kselftest.sh.

Also remove ethtool_lib.sh exec permission.

Fixes: 81573b18f2 ("selftests/net/forwarding: add Makefile to install tests")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 12:01:18 -07:00
Hangbin Liu
919a23e9d6 selftests/net: add missing tests to Makefile
Find some tests are missed in Makefile by running:
for file in $(ls *.sh); do grep -q $file Makefile || echo $file; done

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 11:33:02 -07:00
Eric Dumazet
29f3490ba9 net: use indirect call wrappers for skb_copy_datagram_iter()
TCP recvmsg() calls skb_copy_datagram_iter(), which
calls an indirect function (cb pointing to simple_copy_to_iter())
for every MSS (fragment) present in the skb.

CONFIG_RETPOLINE=y forces a very expensive operation
that we can avoid thanks to indirect call wrappers.

This patch gives a 13% increase of performance on
a single flow, if the bottleneck is the thread reading
the TCP socket.

Fixes: 950fcaecd5 ("datagram: consolidate datagram copy to iter helpers")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 11:30:40 -07:00
YueHaibing
fab90c8202 cxgb4: remove set but not used variable 'tab'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c: In function cxgb4_get_free_ftid:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c:547:23:
 warning: variable tab set but not used [-Wunused-but-set-variable]

commit 8d174351f2 ("cxgb4: rework TC filter rule insertion across regions")
involved this, remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-25 11:18:27 -07:00
Linus Torvalds
e2cf67f668 zonefs fixes for 5.6 final
A single fix in this pull request to correctly handle the size of
 read-only zone files (from me).
 
 Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCXnrJJAAKCRDdoc3SxdoY
 dgZXAQDK88T4sdtFq1Fl1PuP+oyzHml+xgNo0djZQOdicnD6qQD8CgMGDFQQG4dv
 Ral+67qEyvUABGt0Vkmy29wuN8El6wQ=
 =+1D9
 -----END PGP SIGNATURE-----

Merge tag 'zonefs-5.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs

Pull zonefs fix from Damien Le Moal:
 "A single fix from me to correctly handle the size of read-only zone
  files"

* tag 'zonefs-5.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
  zonfs: Fix handling of read-only zones
2020-03-25 10:34:02 -07:00
Damien Le Moal
ccf4ad7da0 zonfs: Fix handling of read-only zones
The write pointer of zones in the read-only consition is defined as
invalid by the SCSI ZBC and ATA ZAC specifications. It is thus not
possible to determine the correct size of a read-only zone file on
mount. Fix this by handling read-only zones in the same manner as
offline zones by disabling all accesses to the zone (read and write)
and initializing the inode size of the read-only zone to 0).

For zones found to be in the read-only condition at runtime, only
disable write access to the zone and keep the size of the zone file to
its last updated value to allow the user to recover previously written
data.

Also fix zonefs documentation file to reflect this change.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
2020-03-25 11:28:26 +09:00
David S. Miller
6f000f9878 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) A new selftest for nf_queue, from Florian Westphal. This test
   covers two recent fixes: 07f8e4d0fd ("tcp: also NULL skb->dev
   when copy was needed") and b738a185be ("tcp: ensure skb->dev is
   NULL before leaving TCP stack").

2) The fwd action breaks with ifb. For safety in next extensions,
   make sure the fwd action only runs from ingress until it is extended
   to be used from a different hook.

3) The pipapo set type now reports EEXIST in case of subrange overlaps.
   Update the rbtree set to validate range overlaps, so far this
   validation is only done only from userspace. From Stefano Brivio.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 17:30:40 -07:00
David S. Miller
7e566df652 mlx5-fixes-2020-03-24
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl56fu0ACgkQSD+KveBX
 +j4g1Qf/edWPTCMGK4eb0jBPUvnxkoGPYj4cq0tfeUaY7Q4r95RzNjuW3gTotzGV
 JZymoC2OoWQxUR2Ye0FkM1C/RQFIAHinEX/KFOMJ6PL+k4+micXeIGNfVo3aflO0
 kaTcBdgZKqFS5hpRtWZc/DVRWckqJYtaAJEFliQbYGwmfiZNoNr0/ZeU+/DX2dHn
 bQkRHQZ3Zq43P4FhVBSyrfmsxUI71k7GtCdJ5G4i80e8qCCARKZDx7q1FRC0k6fh
 a84+7NxpFRkl+kT+se/bxcQaFht49YSJVauGMKK8Ae+pz0XEaNrYsFz9zQY/s7W6
 4a62hzlHuVmUwteZfH+secZzCnOkUw==
 =rO1d
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2020-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2020-03-24

This series introduces some fixes to mlx5 driver.

From Aya, Fixes to the RX error recovery flows
From Leon, Fix IB capability mask

Please pull and let me know if there is any problem.

For -stable v5.5
 ('net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failure')

For -stable v5.4
 ('net/mlx5e: Fix ICOSQ recovery flow with Striding RQ')
 ('net/mlx5e: Do not recover from a non-fatal syndrome')
 ('net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset')
 ('net/mlx5e: Enhance ICOSQ WQE info fields')

The above patch ('net/mlx5e: Enhance ICOSQ WQE info fields')
will fail to apply cleanly on v5.4 due to a trivial contextual conflict,
but it is an important fix, do I need to do something about it or just
assume Greg will know how to handle this ?
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 17:25:54 -07:00
Heiner Kallweit
f13bc68131 r8169: re-enable MSI on RTL8168c
The original change fixed an issue on RTL8168b by mimicking the vendor
driver behavior to disable MSI on chip versions before RTL8168d.
This however now caused an issue on a system with RTL8168c, see [0].
Therefore leave MSI disabled on RTL8168b, but re-enable it on RTL8168c.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1792839

Fixes: 003bd5b4a7 ("r8169: don't use MSI before RTL8168d")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 17:20:10 -07:00
Jakub Kicinski
cd556e40fd devlink: expand the devlink-info documentation
We are having multiple review cycles with all vendors trying
to implement devlink-info. Let's expand the documentation with
more information about what's implemented and motivation behind
this interface in an attempt to make the implementations easier.

Describe what each info section is supposed to contain, and make
some references to other HW interfaces (PCI caps).

Document how firmware management is expected to look, to make
it clear how devlink-info and devlink-flash work in concert.

Name some future work.

v2: - improve wording
v3: - improve wording

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 16:47:33 -07:00
Andre Przywara
c312c7818b net: phy: mdio-bcm-unimac: Fix clock handling
The DT binding for this PHY describes an *optional* clock property.
Due to a bug in the error handling logic, we are actually ignoring this
clock *all* of the time so far.

Fix this by using devm_clk_get_optional() to handle this clock properly.

Fixes: b78ac6ecd1 ("net: phy: mdio-bcm-unimac: Allow configuring MDIO clock divider")
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 16:45:32 -07:00
Vladimir Oltean
2283a02b67 net: phy: mscc: consolidate a common RGMII delay implementation
It looks like the VSC8584 PHY driver is rolling its own RGMII delay
configuration code, despite the fact that the logic is mostly the same.

In fact only the register layout and position for the RGMII controls has
changed. So we need to adapt and parameterize the PHY-dependent bit
fields when calling the new generic function.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 16:36:37 -07:00
David S. Miller
148aa2a86c Merge branch 'axienet-Update-error-handling-and-add-64-bit-DMA-support'
Andre Przywara says:

====================
net: axienet: Update error handling and add 64-bit DMA support

a minor update, fixing the 32-bit build breakage, and brightening up
Dave's christmas tree. Rebased against latest net-next/master.

This series is based on net-next as of today (9970de8b01), which
includes Russell's fixes [1], solving the SGMII issues I have had.

[1] https://lore.kernel.org/netdev/E1j6trA-0003GY-N1@rmk-PC.armlinux.org.uk/

Changelog v2 .. v3:
- Use two "left-shifts by 16" to fix builds with 32-bit phys_addr_t
- reorder variable declarations

Changelog v1 .. v2:
- Add Reviewed-by: tags from Radhey
- Extend kerndoc documentation
- Convert DMA error handler tasklet to work queue
- log DMA mapping errors
- mark DMA mapping error checks as unlikely (in "hot" paths)
- return NETDEV_TX_OK on TX DMA mapping error (increasing TX drop counter)
- Request eth IRQ as an optional IRQ
- Remove no longer needed MDIO IRQ register names
- Drop DT propery check for address width, assume full 64 bit

This series updates the Xilinx Axienet driver to work on our board
here. One big issue was broken SGMII support, which Russell fixed already
(in net-next).
While debugging and understanding the driver, I found several problems
in the error handling and cleanup paths, which patches 2-7 address.
Patch 8 removes a annoying error message, patch 9 paves the way for newer
revisions of the IP. The next patch adds mii-tool support, just for good
measure.

The next four patches add support for 64-bit DMA. This is an integration
option on newer IP revisions (>= v7.1), and expects MSB bits in formerly
reserved registers. Without writing to those MSB registers, the state
machine won't trigger, so it's mandatory to access them, even if they
are zero. Patches 11 and 12 prepare the code by adding accessors, to
wrap this properly and keep it working on older IP revisions.
Patch 13 enables access to the MSB registers, by trying to write a
non-zero value to them and checking if that sticks. Older IP revisions
always read those registers as zero.
Patch 14 then adjusts the DMA mask, based on the autodetected MSB
feature. It uses the full 64 bits in this case, the rest of the system
(actual physical addresses in use) should provide a natural limit if the
chip has connected fewer address lines. If not, the parent DT node can
use a dma-range property.

The Xilinx PG138 and PG021 documents (in versions 7.1 in both cases)
were used for this series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 16:33:05 -07:00
Andre Przywara
5fff0151b3 net: axienet: Allow DMA to beyond 4GB
With all DMA address accesses wrapped, we can actually support 64-bit
DMA if this option was chosen at IP integration time.
If the IP has been configured for an address width greater than 32 bits,
we assume the full 64 bit DMA width is working. In practise this will be
limited by the actual system address bus width, which will ideally be the
same as the DMA IP address width.
If this is not the case, the actual width can still be configured using a
dma-ranges property in the parent of the MAC node.

This increases the DMA mask on those systems to let the kernel choose
buffers from memory at higher addresses.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 16:33:05 -07:00