Commit Graph

797261 Commits

Author SHA1 Message Date
David S. Miller
f0739e6517 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-11-13

This series contains updates to the ice driver only.

Brett cleans up debug print messages by removing useless or duplicate
messages, and make sure we assign the hardware head pointer to head
instead of the software head pointer.  Resolved an issue when disabling
SRIOV we were trying to stop queues multiple times, so make sure we
disable SRIOV before stopping transmit and receive queues for VF.

Tony fixes a potential NULL pointer dereference during a VF reset.

Anirudh resolves an issue where we were releasing the VSI before
removing the VSI scheduler node, which was resulting in an error "Failed
to set LAN Tx queue context, error: -1".  Also fixed the guaranteed
number of VSIs available and used by discovering the device
capabilities to determine the 'guar_num_vsi' per function, rather than
always using the theoretical max number of VSIs every time.

Dave avoids a deadlock by nesting RTNL locking, so added a boolean to
determine if the RTNL lock is already held.

Lev fixes bad mask values which would break compilation.

Piotr increases the receive queue disable timeout since it can take
additional time to finish all pending queue requests.

Usha resolves an issue of VLAN priority tagged traffic not appearing on
all traffic classes, which was causing ETS bandwidth shaping to not work
as expected.

Henry fixes the reset path to cleanup the old scheduler tree before
rebuilding it.

Md Fahad removes a unnecessary check which was causing a driver load
error on platforms with more than 128 cores.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-15 11:23:30 -08:00
David S. Miller
bd5196b686 Merge branch 'hns3-hwgro'
Salil Mehta says:

====================
net: hns3: Add support of hardware GRO to HNS3 Driver

This patch-set adds support of hardware assisted GRO feature to
HNS3 driver on Rev B(=0x21) platform. Current hardware only
supports TCP/IPv{4|6} flows.

Change Log:
V1->V2:
1. Remove redundant print reported by Leon Romanovsky.
   Link: https://lkml.org/lkml/2018/11/13/715
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-15 09:44:46 -08:00
Peng Li
a6d53b97a2 net: hns3: Adds GRO params to SKB for the stack
When HW GRO enable, protocol stack will not do GRO again,
driver should add gro param to the skb for the protocol
stack..

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-15 09:44:46 -08:00
Peng Li
81ae0e0491 net: hns3: Add skb chain when num of RX buf exceeds MAX_SKB_FRAGS
MAX_SKB_FRAGS in protocol stack is defined as:

MAX_SKB_FRAGS is 17 when PAGE_SIZE is 4K. If HW enable GRO, it may
merge small packets and the rx buffer may be more than
MAX_SKB_FRAGS. So driver will add skb chain when RX buffer num.
more than MAX_SKB_FRAGS.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-15 09:44:46 -08:00
Peng Li
5c9f6b3935 net: hns3: Add support for ethtool -K to enable/disable HW GRO
This patch adds support of ethtool -K to enable/disable
hardware GRO in HNS3 PF/VF driver.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-15 09:44:46 -08:00
Peng Li
e559709505 net: hns3: Add handling of GRO Pkts not fully RX'ed in NAPI poll
The "FE bit" in the description means the last description for
a packets. When HW GRO enable, HW write data to ring every
packet/buffer, there is greater probability that driver handle
with the describtion but HW still not set the "FE bit".

When drier handle the packet and HW still not set "FE bit",
driver stores skb and bd_num in rx ring, and continue to use the
skb and bd_num in next napi.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-15 09:44:46 -08:00
Peng Li
b26a6fea22 net: hns3: Enable HW GRO for Rev B(=0x21) HNS3 hardware
HNS3 hardware Revision B(=0x21) supports Hardware GRO feature. This
patch enables this feature in the HNS3 PF/VF driver.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-15 09:44:46 -08:00
Heiner Kallweit
ba2f55b068 net: phy: icplus: add config_intr callback
Move IRQ configuration for IP101A/G from config_init to config_intr
callback. Reasons:

1. This allows phylib to disable interrupts if needed.
2. Icplus was the only driver supporting interrupts w/o defining a
   config_intr callback. Now we can add a phylib plausibility check
   disabling interrupt mode if one of the two irq-related callbacks
   isn't defined.

I don't own hardware with this PHY, and the change is based on the
datasheet for IP101A LF (which is supposed to be register-compatible
with IP101A/G). Change is compile-tested only.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-15 09:30:00 -08:00
David S. Miller
6d5db6c379 Merge branch 'nfp-abm-track-all-Qdiscs'
Jakub Kicinski says:

====================
nfp: abm: track all Qdiscs

Our Qdisc offload so far has been very simplistic.  We held
and array of marking thresholds and statistics sized to the
number of PF queues.  This was sufficient since the only
configuration we supported was single layer of RED Qdiscs
(on top of MQ or not, but MQ isn't really about queuing).

As we move to add more Qdiscs it's time to actually try to
track the full Qdisc hierarchy.  This allows us to make sure
our offloaded configuration reflects the SW path better.
We add graft notifications to MQ and RED (PRIO already sends
them) to allow drivers offloading those to learn how Qdiscs
are linked.  MQ graft gives us the obvious advantage of being
able to track when Qdiscs are shared or moved.  It seems
unlikely HW would offload RED's child Qdiscs but since the
behaviour would change based on linked child we should
stop offloading REDs with modified child.  RED will also
handle the child differently during reconfig when limit
parameter is set - so we have to inform the drivers about
the limit, and have them reset the child state when
appropriate.

The NFP driver will now allocate a structure to track each
Qdisc and link it to its children.  We will also maintain
a shadow copy of threshold settings - to save device writes
and make it easier to apply defaults when config is
re-evaluated.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:28 -08:00
Jakub Kicinski
bd3b5d462a nfp: abm: restructure Qdisc handling
In preparation of handling more Qdisc types switch to a different
offload strategy.  We have now recreated the Qdisc hierarchy in
the driver.  Every time the hierarchy changes parse it, and update
the configuration of the HW accordingly.

While at it drop the support of pretending that we can instantiate
a single queue on a multi-queue device in HW/FW.  MQ is now required,
and each queue will have its own instance of RED.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:28 -08:00
Jakub Kicinski
52db4eaca5 nfp: abm: save RED's parameters
Use the new driver Qdisc structure to keep track of parameters
of RED Qdiscs.  This way as the Qdisc moves around in the hierarchy
we will be able to configure the HW appropriately.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:28 -08:00
Jakub Kicinski
6c5dbda0d4 nfp: abm: reset RED's child based on limit
RED qdisc will replace its child Qdisc with a new FIFO queue if
it is reconfigured and the limit parameter is not 0.

This means that when it's created with limit of 0 it will have no FIFO,
and all packets will be dropped.  If it's changed and limit is specified
it will loose its existing child (implicit graft).  Make sure we mark
RED Qdisc child as NFP_QDISC_UNTRACKED if its not the expected FIFO.

nfp_abm_qdisc_replace() will return 1 if Qdisc already existed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:28 -08:00
Jakub Kicinski
c0b7490b19 net: sched: red: notify drivers about RED's limit parameter
RED qdisc's limit parameter changes the behaviour of the qdisc,
for instance if it's set to 0 qdisc will drop all the packets.

When replace operation happens and parameter is set to non-0
a new fifo qdisc will be instantiated and replace the old child
qdisc which will be destroyed.

Drivers need to know the parameter, even if they don't impose
the actual limit to be able to reliably reconstruct the Qdisc
hierarchy.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:28 -08:00
Jakub Kicinski
6b8417b7e6 nfp: abm: build full Qdisc hierarchy based on graft notifications
Using graft notifications recreate in the driver the full Qdisc
hierarchy.  Keep track of how many times each Qdisc is attached
to the hierarchy to make sure we don't offload Qdiscs which are
attached multiple times (device queues can't be shared).  For
graft events of Qdiscs we don't know exist make the child as
invalid/untracked.

Note that MQ Qdisc doesn't send destruction events reliably when
device is dismantled, so we need to manually clean out the
children otherwise we'd think Qdiscs which are still in use
are getting freed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:28 -08:00
Jakub Kicinski
d577a3d279 net: sched: mq: offload a graft notification
Drivers offloading Qdiscs should have reasonable certainty
the offloaded behaviour matches the SW path.  This is impossible
if the driver does not know about all Qdiscs or when Qdiscs move
and are reused.  Send a graft notification from MQ.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:28 -08:00
Jakub Kicinski
bf2a752bea net: sched: red: offload a graft notification
Drivers offloading Qdiscs should have reasonable certainty
the offloaded behaviour matches the SW path.  This is impossible
if the driver does not know about all Qdiscs or when Qdiscs move
and are reused.  Send a graft notification from RED.  The drivers
are expected to simply stop offloading the Qdisc, if a non-standard
child is ever grafted onto it.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:28 -08:00
Jakub Kicinski
aee7539c58 nfp: abm: allocate Qdisc child table
To keep track of Qdisc hierarchy allocate a table for children
for each Qdisc.  RED Qdisc can only have one child.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:27 -08:00
Jakub Kicinski
1853125889 nfp: abm: remember which Qdisc is root
Keep track of which Qdisc is currently root.  We need to implement
TC_SETUP_ROOT_QDISC handling, and for completeness also clear the
root Qdisc pointer when it's freed.  TC_SETUP_ROOT_QDISC isn't always
sent when device is dismantled.

Remembering the root Qdisc will allow us to build the entire hierarchy
in following patches.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:27 -08:00
Jakub Kicinski
98b0e5f684 net: sched: provide notification for graft on root
Drivers are currently not notified when a Qdisc is grafted as root.
This requires special casing Qdiscs added with parent = TC_H_ROOT in
the driver.  Also there is no notification sent to the driver when
an existing Qdisc is grafted as root.

Add this very simple notifications, drivers should now be able to
track their Qdisc tree fully.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:27 -08:00
Jakub Kicinski
4f5681d088 nfp: abm: track all offload-enabled qdiscs
Allocate an object corresponding to any offloaded qdisc we are
informed about by the kernel.  Not only the qdiscs we have a
chance of offloading.

The count of created objects will be used to decide whether
the ethtool TC offload can be disabled, since otherwise we may
miss destroy commands.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:27 -08:00
Jakub Kicinski
6666f545e9 nfp: abm: keep track of all RED thresholds
Instead of writing the threshold out when Qdisc is configured
and not remembering it move to a scheme where we remember all
thresholds.  When configuration changes parse the offloaded
Qdiscs and set thresholds appropriately.

This will help future extensions.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:27 -08:00
Jakub Kicinski
08990494e5 nfp: abm: rename qdiscs -> red_qdiscs
Rename qdiscs member to red_qdiscs.  One of following patches will
use the name qdiscs for tracking all qdisc types.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:51:27 -08:00
David S. Miller
15cef30974 Merge branch 'aquantia-add-rx-flow-filter-support'
Igor Russkikh says:

====================
net: aquantia: add rx-flow filter support

In this patchset the rx-flow filters functionality and vlan filter offloads
are implemented.

The rules in NIC hardware have fixed order and priorities.
To support this, the locations of filters from ethtool perspective are also fixed:

* Locations 0 - 15 for VLAN ID filters
* Locations 16 - 31 for L2 EtherType and PCP filters
* Locations 32 - 39 for L3/L4 5-tuple filters (locations 32, 36 for IPv6)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:48:37 -08:00
Dmitry Bogdanov
7975d2aff5 net: aquantia: add support of rx-vlan-filter offload
Since it uses the same NIC table as rx flow vlan filter therefore
rx-flow vlan filter accepts only vlans that present on the interface
in case of rx-vlan-filter is on.

Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:48:37 -08:00
Dmitry Bogdanov
9a8cac4b4d net: aquantia: add ethertype and PCP to rx flow filters
L2 EtherType filters allows to filter packet by EtherType field or
both EtherType and User Priority (PCP) field of 802.1Q.
UserPriority (vlan) parameter must be accompanied by mask 0x1FFF. That
is to distinguish VLAN filter from L2 Ethertype filter with
UserPriority since both User Priority and VLAN ID are passed in the
same 'vlan' parameter.

Example:
To add a filter that directs IP4 packess of priority 3 to queue 3:
ethtool -N <ethX> flow-type ether proto 0x800 vlan 0x600 m 0x1FFF \
action 3 loc 16

Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:48:37 -08:00
Dmitry Bogdanov
54bcb3d162 net: aquantia: add vlan id to rx flow filters
The VLAN filter (VLAN id) is compared against 16 filters.
VLAN id must be accompanied by mask 0xF000. That is to distinguish
VLAN filter from L2 Ethertype filter with UserPriority since both
User Priority and VLAN ID are passed in the same 'vlan' parameter.
Flow type may be any as  it is not matched for VLAN filter.
Due to fixed order of the rules in the NIC, the location 0-15 are
reserved for vlan filters.

Example:
To add a rule that directs packets from VLAN 2001 to queue 5:
ethtool -N <ethX> flow-type ip4 vlan 2001 m 0xF000 action 5 loc 0

Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:48:37 -08:00
Dmitry Bogdanov
a6ed6f2269 net: aquantia: add support of L3/L4 ntuple filters
Add support of L3/L4 5-tuple {protocol, src-ip, dst-ip, src-port, dst-port}
filters. Mask is not supported. Src-port and dst-port are only compared for
TCP/UDP/SCTP packets. Both IPv4 and IPv6 are supported.
The supported actions are the drop and the queue assignment.
Due to fixed order of the rules in the NIC, the location 32-39 are
reserved for L3/L4 5-tuple filters. The locations 32 and 36 are
reserved for IPv6 filters.

Examples:
sudo ethtool -N eth0 flow-type ip6 src-ip 2001:db8:0:f101::2 \
dst-ip 2001:db8:0:f101::5 action -1 loc 36

sudo ethtool -N eth0 flow-type udp4 src-ip 10.0.0.4 \
dst-ip 10.0.0.7 src-port 2000 dst-port 2001 action 2 loc 32

Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:48:37 -08:00
Dmitry Bogdanov
8d0bcb012f net: aquantia: add infrastructure for ntuple rules
Add infrastructure to support ntuple filter configuration.
Add rule, remove rule, reapply on interface up.

Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:48:37 -08:00
Dmitry Bogdanov
23e7a718a4 net: aquantia: add rx-flow filter definitions
Add missing register definitions and the functions accessing them
related to rx-flow filters.

Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-14 08:48:37 -08:00
David S. Miller
4fd3e2ac18 Merge branch 'cpsw-allow-vlan-h-w-timestamping'
Ivan Khoronzhuk says:

====================
net: ethernet: ti: cpsw: allow vlan h/w timestamping

The patchset adds several improvements and allows vlan h/w ts.

Based on net-next/master
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-13 16:30:00 -08:00
Ivan Khoronzhuk
1ebb2446c3 net: ethernet: ti: cpsw: allow vlan tagged packets to be timestamped
Allow vlan tagged packets to be timestamped, as no any restrictions
for this.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-13 16:29:59 -08:00
Ivan Khoronzhuk
a942312034 net: ethernet: ti: cpts: move enable/disable flags outside of cpts module
Each slave has it's own receive timestamp filter. But cpts rx/tx
timestamp enable flags are used to allow ts retrieve only for one
user. This limitation causes data path redundancy and setting overlap
if cpsw module is in dual-mac mode for instance.

If rx ts is enabled only for one port - the second interface must expect
every incoming packet to be PTP packet w/o absolutely any reason, and if
it's PTP - do unneeded stuff, as rx filter for second port is not set
and cpts fifo is not supposed to contain appropriate ts event.
That's not correct.

So, to fix control overlap and avoid redundant CPU cycles, the patch
splits rx/tx ts enable flags between network devices. After the patch,
PTP timestamping still should be used for only one port (or PTP id
counter has to be different for both ports as cpts IP is common).

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-13 16:29:59 -08:00
Ivan Khoronzhuk
f19dcd5f11 net: ethernet: ti: cpts: purge staled skbs from txq
The overflow event is running with 1 jiffy in case if txq is not
empty, but it can be emptied completely only if next tx event
consumes skb or deletes staled skb from the txq. In case of staled
skb, that can happen for some unpredictable reason (the ts event was
lost or timed out), the overflow event can be generated quite long
time consuming CPU w/o reason before next tx event happens. To avoid
it, purge txq before increasing overflow event rate.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-13 16:29:59 -08:00
Ivan Khoronzhuk
d0e14c4d9b net: ethernet: ti: cpts: correct debug for expired txq skb
The msgtype and seqid that is smth that belongs to event for
comparison but not for staled txq skb.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-13 16:29:59 -08:00
Md Fahad Iqbal Polash
ef878d6086 ice: Remove ICE_MAX_TXQ_PER_TXQG check when configuring Tx queue
This patch removes the condition checking of VSI TX queue number to
ICE_MAX_TXQ_PER_TXQG. This is an unnecessary check and causes a driver
load error on hosts that have more than 128 cores.

Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:26 -08:00
Henry Tieman
47e3e53cea ice: Destroy scheduler tree in reset path
The scheduler tree is is always rebuilt during reset. The existing code
adds new scheduler nodes for queues but may not clean up earlier nodes.
This patch removed the old scheduler tree during reset before it is
rebuilt.

Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:26 -08:00
Usha Ketineni
c5a2a4a388 ice: Fix to make VLAN priority tagged traffic to appear on all TCs
This patch includes below changes to resolve the issue of ETS bandwidth
shaping to work.

1. Allocation of Tx queues is accounted for based on the enabled TC's
   in ice_vsi_setup_q_map() and enabled the Tx queues on those TC's via
   ice_vsi_cfg_txqs()

2. Get the mapped netdev TC # for the user priority and set the priority
   to TC mapping for the VSI.

Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:26 -08:00
Brett Creeley
99fc1057b4 ice: Call pci_disable_sriov before stopping queues for VF
Previous to this commit the driver was immediately stopping Tx/Rx
queues when doing the following "echo 0 > sriov_numvfs" and then it was
calling pci_disable_sriov if the VFs are not assigned. This was causing
the VIRTCHNL_OP_DISABLE_QUEUES to fail because it was trying to stop
the queues for a second time.

Fix this by calling pci_disable_sriov before stopping the Tx/Rx queues.
This allows the VIRTCHNL_OP_DISABLE_QUEUES to get processed before the
driver tries to stop the Rx/Tx queues in ice_free_vfs.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:26 -08:00
Piotr Raczynski
7b8ff0f9cc ice: Increase Rx queue disable timeout
With much traffic coming into the port, Rx queue disable
procedure can take more time until all pending queue
requests on PCIe finish. Reuse ICE_Q_WAIT_MAX_RETRY macro
and increase the delay itself.

Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:26 -08:00
Lev Faerman
6263e811f4 ice: Fix NVM mask defines
Fixes bad masks that would break compilation when evaluated.

Signed-off-by: Lev Faerman <lev.faerman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:26 -08:00
Dave Ertman
d09e2693b6 ice: Avoid nested RTNL locking in ice_dis_vsi
ice_dis_vsi() performs an rtnl_lock() if it detects a netdev that is
running on the VSI. In cases where the RTNL lock has already been
acquired, a deadlock results. Add a boolean to pass to ice_dis_vsi to
tell it if the RTNL lock is already held.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:26 -08:00
Anirudh Venkataramanan
995c90f2de ice: Calculate guaranteed VSIs per function and use it
Currently we are setting the guar_num_vsi to equal to ICE_MAX_VSI
which is the device limit of 768. This is incorrect and could have
unintended consequences. To fix this use the valid_function's 8-bit
bitmap returned from discovering device capabilities to determine the
guar_num_vsi per function. guar_num_vsi value is then passed on to
pf->num_alloc_vsi.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:26 -08:00
Anirudh Venkataramanan
10e03a22de ice: Remove node before releasing VSI
Before releasing the VSI, remove the VSI scheduler node. If not,
the node is left in the scheduler tree and, on subsequent load, the
scheduler tree contains the node so it does not set it in vsi_ctx.
This, later, causes the node to not be found in ice_sched_get_free_qparent
which leads to a "Failed to set LAN Tx queue context, error: -1".

To remove the scheduler node, this patch introduces ice_rm_vsi_lan_cfg
and related helpers.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:26 -08:00
Tony Nguyen
b354e98f49 ice: Check for q_vector when stopping rings
There is a gap in time between a VF reset, which sets the q_vector to
NULL, and the VF requesting mapping of the q_vectors. If
ice_vsi_stop_tx_rings() is called during this time, a NULL pointer
dereference is encountered. Add a check in ice_vsi_stop_tx_rings()
to ensure the q_vector is set to avoid this situation from occurring.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:25 -08:00
Brett Creeley
807bc98d31 ice: Fix debug print in ice_tx_timeout
Currently the debug print in ice_tx_timeout is printing useless and
duplicate values. First, head is being assigned to tx_ring->next_to_clean
and we are printing both of those values, but naming them HWB and NTC
respectively. Also, reading tail always returns 0 so remove that as well.

Instead of assigning the SW head (NTC) read to head, use the actual head
register and change the debug print to note that this is HW_HEAD. Also
reduce the scope of a couple variables.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:25 -08:00
Colin Ian King
3e536cff34 net: phy: check if advertising is zero using linkmode_empty
A recent change modified variable advertising from a u32 to a link mode
array and left the u32 zero comparison, so essential we now have an array
being compared to null which is not the intention. Fix this by using the
call to linkmode_empty to check if advertising is all zero.

Detected by CoverityScan, CID#1475424 ("Array compared against 0")

Fixes: 3c1bcc8614 ("net: ethernet: Convert phydev advertize and supported from u32 to link mode")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-12 16:26:21 -08:00
David S. Miller
261501d94e Merge branch 'sctp-add-support-for-sk_reuseport'
Xin Long says:

====================
sctp: add support for sk_reuseport

sctp sk_reuseport allows multiple socks to listen on the same port and
addresses, as long as these socks have the same uid. This works pretty
much as TCP/UDP does, the only difference is that sctp is multi-homing
and all the bind_addrs in these socks will have to completely matched,
otherwise listen() will return err.

The below is when 5 sockets are listening on 172.16.254.254:6400 on a
server, 26 sockets on a client connect to 172.16.254.254:6400 and each
may be processed by a different socket on the server which is selected
by hash(lport, pport, paddr) in reuseport_select_sock():

 # ss --sctp -nn
   State      Recv-Q Send-Q        Local Address:Port     Peer Address:Port
   LISTEN     0      10           172.16.254.254:6400                *:*
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.2.1:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.2.4:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.3.3:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.3.4:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.5.2:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.5.3:1234
   LISTEN     0      10           172.16.254.254:6400                *:*
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.1.3:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.1.4:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.3.2:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.4.1:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.4.2:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.4.3:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.4.4:1234
   LISTEN     0      10           172.16.254.254:6400                *:*
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.1.2:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.3.5:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.4.5:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400   172.16.253.253:1234
   LISTEN     0      10           172.16.254.254:6400                *:*
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.2.2:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.2.3:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.5.4:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.5.5:1234
   LISTEN     0      10           172.16.254.254:6400                *:*
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.1.1:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.1.5:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.2.5:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.3.1:1234
   `- ESTAB   0      0       172.16.254.254%eth1:6400       172.16.5.1:1234
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-12 09:09:51 -08:00
Xin Long
6ba8457402 sctp: process sk_reuseport in sctp_get_port_local
When socks' sk_reuseport is set, the same port and address are allowed
to be bound into these socks who have the same uid.

Note that the difference from sk_reuse is that it allows multiple socks
to listen on the same port and address.

Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-12 09:09:51 -08:00
Xin Long
76c6d988ae sctp: add sock_reuseport for the sock in __sctp_hash_endpoint
This is a part of sk_reuseport support for sctp. It defines a helper
sctp_bind_addrs_check() to check if the bind_addrs in two socks are
matched. It will add sock_reuseport if they are completely matched,
and return err if they are partly matched, and alloc sock_reuseport
if all socks are not matched at all.

It will work until sk_reuseport support is added in
sctp_get_port_local() in the next patch.

v1->v2:
  - use 'laddr->valid && laddr2->valid' check instead as Marcelo
    pointed in sctp_bind_addrs_check().

Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-12 09:09:51 -08:00
Xin Long
532ae2f10e sctp: do reuseport_select_sock in __sctp_rcv_lookup_endpoint
This is a part of sk_reuseport support for sctp, and it selects a
sock by the hashkey of lport, paddr and dport by default. It will
work until sk_reuseport support is added in sctp_get_port_local()
in the next patch.

v1->v2:
  - define lport as __be16 instead of __be32 as Marcelo pointed in
    __sctp_rcv_lookup_endpoint().

Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-12 09:09:51 -08:00