Stop Mode is entered when Stop Mode is requested at chip level and
MCR[LPM_ACK] is asserted by the FlexCAN.
Double check with IP owner, the MCR[LPM_ACK] bit should be polled for
stop mode acknowledgment, not the acknowledgment from chip level which
is used to gate flexcan clocks.
This patch depends on:
b7603d080f ("can: flexcan: add low power enter/exit acknowledgment helper")
Fixes: 5f186c257f (can: flexcan: fix stop mode acknowledgment)
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v5.0
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The MCR[LPMACK] read-only bit indicates that FlexCAN is in a lower-power
mode (Disabled mode, Doze mode, Stop mode).
The CPU can poll this bit to know when FlexCAN has actually entered low
power mode. The low power enter/exit acknowledgment helper will reduce
code duplication for disabled mode, doze mode and stop mode.
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
When suspending, and there is still CAN traffic on the interfaces the
flexcan immediately wakes the platform again. As it should :-). But it
throws this error msg:
[ 3169.378661] PM: noirq suspend of devices failed
On the way down to suspend the interface that throws the error message
calls flexcan_suspend() but fails to call flexcan_noirq_suspend(). That
means flexcan_enter_stop_mode() is called, but on the way out of suspend
the driver only calls flexcan_resume() and skips flexcan_noirq_resume(),
thus it doesn't call flexcan_exit_stop_mode(). This leaves the flexcan
in stop mode, and with the current driver it can't recover from this
even with a soft reboot, it requires a hard reboot.
This patch fixes the deadlock when using self wakeup, by calling
flexcan_exit_stop_mode() from flexcan_resume() instead of
flexcan_noirq_resume().
This also fixes another issue: CAN frames are received out-of-order in
first IRQ handler run after wakeup.
The problem is that the wakeup latency from frame reception to the IRQ
handler (where the CAN frames are sorted by timestamp) is much bigger
than the time stamp counter wrap around time. This means it's
impossible to sort the CAN frames by timestamp.
The reason is that the controller exits stop mode during noirq resume,
which means it receives frames immediately, but interrupt handling is
still not possible.
So exit stop mode during resume stage instead of noirq resume fixes this
issue.
Fixes: de3578c198 ("can: flexcan: add self wakeup support")
Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v5.0
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
CANFD2.0 core uses BRAM for storing acceptance filter ID(AFID) and MASK
(AFMASK)registers. So by default AFID and AFMASK registers contain random
data. Due to random data, we are not able to receive all CAN ids.
Initializing AFID and AFMASK registers with Zero before enabling
acceptance filter to receive all packets irrespective of ID and Mask.
Fixes: 0db9071353 ("can: xilinx: add can 2.0 support")
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Srinivas Neeli <srinivas.neeli@xilinx.com>
Reviewed-by: Naga Sureshkumar Relli <naga.sureshkumar.relli@xilinx.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v5.0
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
During development the define J1939_PGN_ADDRESS_REQUEST was renamed to
J1939_PGN_REQUEST. It was forgotten to adjust the documentation
accordingly.
This patch fixes the name of the symbol.
Reported-by: https://github.com/linux-can/can-utils/issues/159#issuecomment-556538798
Fixes: 9d71dd0c70 ("can: add support of SAE J1939 protocol")
Cc: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Adding myself to support the TI TCAN4X5X SPI CAN device.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Since I refactored the code to create a m_can framework and we
have a MMIO MCAN IP as well add myself to help maintain the code.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
In referenced fix we removed the RTL8168e-specific jumbo config for
RTL8168evl in rtl_hw_jumbo_enable(). We have to do the same in
rtl_hw_jumbo_disable().
v2: fix referenced commit id
Fixes: 14012c9f3b ("r8169: fix jumbo configuration for RTL8168evl")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RTL8125 also requires to enable RX for WoL.
v2: add missing Fixes tag
Fixes: f1bce4ad2f ("r8169: add support for RTL8125")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we receive a new packet from the guest, we check if the
src_cid is correct, but we forgot to check the dst_cid.
The host should accept only packets where dst_cid is
equal to the host CID.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit ef87f7da6b ("net: phy: dp83867: move dt parsing to probe")
causes regression on TI dra71x-evm and dra72x-evm, where DP83867 PHY is
used in "rgmii-id" mode - the networking stops working.
Unfortunately, it's not enough to just move DT parsing code to .probe() as
it depends on phydev->interface value, which is set to correct value abter
the .probe() is completed and before calling .config_init(). So, RGMII
configuration can't be loaded from DT.
To fix and issue
- move RGMII validation code to .config_init()
- parse RGMII parameters in dp83867_of_init(), but consider them as
optional.
Fixes: ef87f7da6b ("net: phy: dp83867: move dt parsing to probe")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now RX interrupt is triggered twice every time, because in
cpsw_rx_interrupt() it is asked first and then disabled. So there will be
pending interrupt always, when RX interrupt is enabled again in NAPI
handler.
Fix it by first disabling IRQ and then do ask.
Fixes: 870915feab ("drivers: net: cpsw: remove disable_irq/enable_irq as irq can be masked from cpsw itself")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After pskb_may_pull() we should always refetch the header
pointers from the skb->data in case it got reallocated.
In gre_parse_header(), the erspan header is still fetched
from the 'options' pointer which is fetched before
pskb_may_pull().
Found this during code review of a KMSAN bug report.
Fixes: cb73ee40b1 ("net: ip_gre: use erspan key field for tunnel lookup")
Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Acked-by: William Tu <u9012063@gmail.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Passing NULL to pppoe_pernet causes a crash via BUG_ON.
Dereferencing net in net_generici() also has the same effect. This patch
removes the redundant BUG_ON check on the same parameter.
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault says:
====================
tcp: fix handling of stale syncookies timestamps
The synflood timestamps (->ts_recent_stamp and ->synq_overflow_ts) are
only refreshed when the syncookie protection triggers. Therefore, their
value can become very far apart from jiffies if no synflood happens for
a long time.
If jiffies grows too much and wraps while the synflood timestamp isn't
refreshed, then time_after32() might consider the later to be in the
future. This can trick tcp_synq_no_recent_overflow() into returning
erroneous values and rejecting valid ACKs.
Patch 1 handles the case of ACKs using legitimate syncookies.
Patch 2 handles the case of stray ACKs.
Patch 3 annotates lockless timestamp operations with READ_ONCE() and
WRITE_ONCE().
Changes from v3:
- Fix description of time_between32() (found by Eric Dumazet).
- Use more accurate Fixes tag in patch 3 (suggested by Eric Dumazet).
Changes from v2:
- Define and use time_between32() instead of a pair of
time_before32/time_after32 (suggested by Eric Dumazet).
- Use 'last_overflow - HZ' as lower bound in
tcp_synq_no_recent_overflow(), to accommodate for concurrent
timestamp updates (found by Eric Dumazet).
- Add a third patch to annotate lockless accesses to .ts_recent_stamp.
Changes from v1:
- Initialising timestamps at socket creation time is not enough
because jiffies wraps in 24 days with HZ=1000 (Eric Dumazet).
Handle stale timestamps in tcp_synq_overflow() and
tcp_synq_no_recent_overflow() instead.
- Rework commit description.
- Add a second patch to handle the case of stray ACKs.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Syncookies borrow the ->rx_opt.ts_recent_stamp field to store the
timestamp of the last synflood. Protect them with READ_ONCE() and
WRITE_ONCE() since reads and writes aren't serialised.
Use of .rx_opt.ts_recent_stamp for storing the synflood timestamp was
introduced by a0f82f64e2 ("syncookies: remove last_synq_overflow from
struct tcp_sock"). But unprotected accesses were already there when
timestamp was stored in .last_synq_overflow.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When no synflood occurs, the synflood timestamp isn't updated.
Therefore it can be so old that time_after32() can consider it to be
in the future.
That's a problem for tcp_synq_no_recent_overflow() as it may report
that a recent overflow occurred while, in fact, it's just that jiffies
has grown past 'last_overflow' + TCP_SYNCOOKIE_VALID + 2^31.
Spurious detection of recent overflows lead to extra syncookie
verification in cookie_v[46]_check(). At that point, the verification
should fail and the packet dropped. But we should have dropped the
packet earlier as we didn't even send a syncookie.
Let's refine tcp_synq_no_recent_overflow() to report a recent overflow
only if jiffies is within the
[last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval. This
way, no spurious recent overflow is reported when jiffies wraps and
'last_overflow' becomes in the future from the point of view of
time_after32().
However, if jiffies wraps and enters the
[last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval (with
'last_overflow' being a stale synflood timestamp), then
tcp_synq_no_recent_overflow() still erroneously reports an
overflow. In such cases, we have to rely on syncookie verification
to drop the packet. We unfortunately have no way to differentiate
between a fresh and a stale syncookie timestamp.
In practice, using last_overflow as lower bound is problematic.
If the synflood timestamp is concurrently updated between the time
we read jiffies and the moment we store the timestamp in
'last_overflow', then 'now' becomes smaller than 'last_overflow' and
tcp_synq_no_recent_overflow() returns true, potentially dropping a
valid syncookie.
Reading jiffies after loading the timestamp could fix the problem,
but that'd require a memory barrier. Let's just accommodate for
potential timestamp growth instead and extend the interval using
'last_overflow - HZ' as lower bound.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If no synflood happens for a long enough period of time, then the
synflood timestamp isn't refreshed and jiffies can advance so much
that time_after32() can't accurately compare them any more.
Therefore, we can end up in a situation where time_after32(now,
last_overflow + HZ) returns false, just because these two values are
too far apart. In that case, the synflood timestamp isn't updated as
it should be, which can trick tcp_synq_no_recent_overflow() into
rejecting valid syncookies.
For example, let's consider the following scenario on a system
with HZ=1000:
* The synflood timestamp is 0, either because that's the timestamp
of the last synflood or, more commonly, because we're working with
a freshly created socket.
* We receive a new SYN, which triggers synflood protection. Let's say
that this happens when jiffies == 2147484649 (that is,
'synflood timestamp' + HZ + 2^31 + 1).
* Then tcp_synq_overflow() doesn't update the synflood timestamp,
because time_after32(2147484649, 1000) returns false.
With:
- 2147484649: the value of jiffies, aka. 'now'.
- 1000: the value of 'last_overflow' + HZ.
* A bit later, we receive the ACK completing the 3WHS. But
cookie_v[46]_check() rejects it because tcp_synq_no_recent_overflow()
says that we're not under synflood. That's because
time_after32(2147484649, 120000) returns false.
With:
- 2147484649: the value of jiffies, aka. 'now'.
- 120000: the value of 'last_overflow' + TCP_SYNCOOKIE_VALID.
Of course, in reality jiffies would have increased a bit, but this
condition will last for the next 119 seconds, which is far enough
to accommodate for jiffie's growth.
Fix this by updating the overflow timestamp whenever jiffies isn't
within the [last_overflow, last_overflow + HZ] range. That shouldn't
have any performance impact since the update still happens at most once
per second.
Now we're guaranteed to have fresh timestamps while under synflood, so
tcp_synq_no_recent_overflow() can safely use it with time_after32() in
such situations.
Stale timestamps can still make tcp_synq_no_recent_overflow() return
the wrong verdict when not under synflood. This will be handled in the
next patch.
For 64 bits architectures, the problem was introduced with the
conversion of ->tw_ts_recent_stamp to 32 bits integer by commit
cca9bab1b7 ("tcp: use monotonic timestamps for PAWS").
The problem has always been there on 32 bits architectures.
Fixes: cca9bab1b7 ("tcp: use monotonic timestamps for PAWS")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl3pcFUACgkQSD+KveBX
+j4tNgf/VJ1FgeYn4INueeliFebX+8sgk7pA8zI5Wf3s0gzmCLulNeF8sK3sV7o4
U94dHauU1CZxTpI9P2AVJVQ5CnTi2GBjuEKHh1Rz9bVwU3N5D4miShtHUEI1PLna
VhLJiMSU1ffUerL66zgW+QYbPAKSlp+INysr+KeBBOwxoco4C17pGwJ9idpaHYnL
sdVsHVcWmD1fen+eSN/8i4RzqVFTjM3cCsMg1OJn8KuDH+z6bQZ9cOWWLn+hz9lc
uBZHZHvRFRTg1NeN1BFcMEL0RBoG0sTtfJLkVeyQxmLbmTeBPHDBIAJysqjAe1kY
OFvEnq5t3HScuZIxTd5yXQ3rHy1EIg==
=InW3
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2019-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2019-12-05
This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
For -stable v4.19:
('net/mlx5e: Query global pause state before setting prio2buffer')
For -stable v5.3
('net/mlx5e: Fix SFF 8472 eeprom length')
('net/mlx5e: Fix translation of link mode into speed')
('net/mlx5e: Fix freeing flow with kfree() and not kvfree()')
('net/mlx5e: ethtool, Fix analysis of speed setting')
('net/mlx5e: Fix TXQ indices to be sequential')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We may have found a bug in the nxp/lpc_eth.c driver. The function
platform_set_drvdata() is called twice, the second time it is called,
in lpc_mii_init(), it overwrites the struct net_device which should be
at pdev->dev->driver_data with pldat->mii_bus. When trying to remove
the driver, in lpc_eth_drv_remove(), platform_get_drvdata() will
return the pldat->mii_bus pointer and try to use it as a struct
net_device pointer. This causes unregister_netdev to segfault and
generate a kernel BUG. Is this reproducible?
Signed-off-by: Daniel Martinez <linux@danielsmartinez.com>
Signed-off-by: Bruno Carneiro da Cunha <brunocarneirodacunha@usp.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
Back in 2008, Adam Langley fixed the corner case of packets for flows
having all of the following options : MD5 TS SACK
Since MD5 needs 20 bytes, and TS needs 12 bytes, no sack block
can be cooked from the remaining 8 bytes.
tcp_established_options() correctly sets opts->num_sack_blocks
to zero, but returns 36 instead of 32.
This means TCP cooks packets with 4 extra bytes at the end
of options, containing unitialized bytes.
Fixes: 33ad798c92 ("tcp: options clean up")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley says:
====================
Ensure egress un/bind are relayed with indirect blocks
On register and unregister for indirect blocks, a command is called that
sends a bind/unbind event to the registering driver. This command assumes
that the bind to indirect block will be on ingress. However, drivers such
as NFP have allowed binding to clsact qdiscs as well as ingress qdiscs
from mainline Linux 5.2. A clsact qdisc binds to an ingress and an egress
block.
Rather than assuming that an indirect bind is always ingress, modify the
function names to remove the ingress tag (patch 1). In cls_api, which is
used by NFP to offload TC flower, generate bind/unbind message for both
ingress and egress blocks on the event of indirectly
registering/unregistering from that block. Doing so mimics the behaviour
of both ingress and clsact qdiscs on initialise and destroy.
This now ensures that drivers such as NFP receive the correct binder type
for the indirect block registration.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When a device is bound to a clsact qdisc, bind events are triggered to
registered drivers for both ingress and egress. However, if a driver
registers to such a device using the indirect block routines then it is
assumed that it is only interested in ingress offload and so only replays
ingress bind/unbind messages.
The NFP driver supports the offload of some egress filters when
registering to a block with qdisc of type clsact. However, on unregister,
if the block is still active, it will not receive an unbind egress
notification which can prevent proper cleanup of other registered
callbacks.
Modify the indirect block callback command in TC to send messages of
ingress and/or egress bind depending on the qdisc in use. NFP currently
supports egress offload for TC flower offload so the changes are only
added to TC.
Fixes: 4d12ba4278 ("nfp: flower: allow offloading of matches on 'internal' ports")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With indirect blocks, a driver can register for callbacks from a device
that is does not 'own', for example, a tunnel device. When registering to
or unregistering from a new device, a callback is triggered to generate
a bind/unbind event. This, in turn, allows the driver to receive any
existing rules or to properly clean up installed rules.
When first added, it was assumed that all indirect block registrations
would be for ingress offloads. However, the NFP driver can, in some
instances, support clsact qdisc binds for egress offload.
Change the name of the indirect block callback command in flow_offload to
remove the 'ingress' identifier from it. While this does not change
functionality, a follow up patch will implement a more more generic
callback than just those currently just supporting ingress offload.
Fixes: 4d12ba4278 ("nfp: flower: allow offloading of matches on 'internal' ports")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dev_hold has to be called always in netdev_queue_add_kobject.
Otherwise usage count drops below 0 in case of failure in
kobject_init_and_add.
Fixes: b8eb718348 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject")
Reported-by: Hulk Robot <hulkci@huawei.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: David Miller <davem@davemloft.net>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 43e665287f ("net-next: dsa: fix flow dissection") added an
ability to override protocol and network offset during flow dissection
for DSA-enabled devices (i.e. controllers shipped as switch CPU ports)
in order to fix skb hashing for RPS on Rx path.
However, skb_hash() and added part of code can be invoked not only on
Rx, but also on Tx path if we have a multi-queued device and:
- kernel is running on UP system or
- XPS is not configured.
The call stack in this two cases will be like: dev_queue_xmit() ->
__dev_queue_xmit() -> netdev_core_pick_tx() -> netdev_pick_tx() ->
skb_tx_hash() -> skb_get_hash().
The problem is that skbs queued for Tx have both network offset and
correct protocol already set up even after inserting a CPU tag by DSA
tagger, so calling tag_ops->flow_dissect() on this path actually only
breaks flow dissection and hashing.
This can be observed by adding debug prints just before and right after
tag_ops->flow_dissect() call to the related block of code:
Before the patch:
Rx path (RPS):
[ 19.240001] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */
[ 19.244271] tag_ops->flow_dissect()
[ 19.247811] Rx: proto: 0x0800, nhoff: 8 /* ETH_P_IP */
[ 19.215435] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */
[ 19.219746] tag_ops->flow_dissect()
[ 19.223241] Rx: proto: 0x0806, nhoff: 8 /* ETH_P_ARP */
[ 18.654057] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */
[ 18.658332] tag_ops->flow_dissect()
[ 18.661826] Rx: proto: 0x8100, nhoff: 8 /* ETH_P_8021Q */
Tx path (UP system):
[ 18.759560] Tx: proto: 0x0800, nhoff: 26 /* ETH_P_IP */
[ 18.763933] tag_ops->flow_dissect()
[ 18.767485] Tx: proto: 0x920b, nhoff: 34 /* junk */
[ 22.800020] Tx: proto: 0x0806, nhoff: 26 /* ETH_P_ARP */
[ 22.804392] tag_ops->flow_dissect()
[ 22.807921] Tx: proto: 0x920b, nhoff: 34 /* junk */
[ 16.898342] Tx: proto: 0x86dd, nhoff: 26 /* ETH_P_IPV6 */
[ 16.902705] tag_ops->flow_dissect()
[ 16.906227] Tx: proto: 0x920b, nhoff: 34 /* junk */
After:
Rx path (RPS):
[ 16.520993] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */
[ 16.525260] tag_ops->flow_dissect()
[ 16.528808] Rx: proto: 0x0800, nhoff: 8 /* ETH_P_IP */
[ 15.484807] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */
[ 15.490417] tag_ops->flow_dissect()
[ 15.495223] Rx: proto: 0x0806, nhoff: 8 /* ETH_P_ARP */
[ 17.134621] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */
[ 17.138895] tag_ops->flow_dissect()
[ 17.142388] Rx: proto: 0x8100, nhoff: 8 /* ETH_P_8021Q */
Tx path (UP system):
[ 15.499558] Tx: proto: 0x0800, nhoff: 26 /* ETH_P_IP */
[ 20.664689] Tx: proto: 0x0806, nhoff: 26 /* ETH_P_ARP */
[ 18.565782] Tx: proto: 0x86dd, nhoff: 26 /* ETH_P_IPV6 */
In order to fix that we can add the check 'proto == htons(ETH_P_XDSA)'
to prevent code from calling tag_ops->flow_dissect() on Tx.
I also decided to initialize 'offset' variable so tagger callbacks can
now safely leave it untouched without provoking a chaos.
Fixes: 43e665287f ("net-next: dsa: fix flow dissection")
Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ENOTSUPP is not available in userspace, for example:
setsockopt failed, 524, Unknown error 524
Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CONFIG_RETPOLINE=y made indirect calls expensive.
gcc seems to add an indirect call in ____sys_recvmsg().
Rewriting the code slightly makes sure to avoid this indirection.
Alternative would be to not call sock_recvmsg() and instead
use security_socket_recvmsg() and sock_recvmsg_nosec(),
but this is less readable IMO.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: David Laight <David.Laight@aculab.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver forgets to call pci_release_regions() in remove like that
in probe failure.
Add the missed call to fix it.
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When user runs a command like
tc qdisc add dev eth1 root mqprio
KASAN stack-out-of-bounds warning is emitted.
Currently, NLA_ALIGN macro used in mqprio_dump provides too large
buffer size as argument for nla_put and memcpy down the call stack.
The flow looks like this:
1. nla_put expects exact object size as an argument;
2. Later it provides this size to memcpy;
3. To calculate correct padding for SKB, nla_put applies NLA_ALIGN
macro itself.
Therefore, NLA_ALIGN should not be applied to the nla_put parameter.
Otherwise it will lead to out-of-bounds memory access in memcpy.
Fixes: 4e8b86c062 ("mqprio: Introduce new hardware offload mode and shaper in mqprio")
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refer to the databook of DesignWare Cores Ethernet MAC Universal:
6.2.1.5 Register 4 (Transmit Descriptor List Address Register
If this register is not changed when the ST bit is set to 0, then
the DMA takes the descriptor address where it was stopped earlier.
The stmmac_tx_err() does zero indices to Tx descriptors, but does
not reset HW current Tx descriptor address. To fix inconsistency,
the base address of the Tx descriptors should be rewritten before
restarting Tx.
Signed-off-by: Jongsung Kim <neidhard.kim@lge.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The EEE support has not been enabled on ENETC, but it may connect
to a PHY which supports EEE and advertises EEE by default, while
its link partner also advertises EEE. If this happens, the PHY enters
low power mode when the traffic rate is low and causes packet loss.
This patch disables EEE advertisement by default for any PHY that
ENETC connects to, to prevent the above unwanted outcome.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf 2019-12-05
The following pull-request contains BPF updates for your *net* tree.
We've added 6 non-merge commits during the last 1 day(s) which contain
a total of 14 files changed, 116 insertions(+), 37 deletions(-).
The main changes are:
1) three selftests fixes, from Stanislav.
2) one samples fix, from Jesper.
3) one verifier fix, from Yonghong.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
sock_fprog_kern::len is in units of struct sock_filter, not bytes.
Fixes: 3e859adf36 ("compat_ioctl: unify copy-in of ppp filters")
Reported-by: syzbot+eb853b51b10f1befa0b7@syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan says:
====================
net: hns3: fixes for -net
This patchset includes misc fixes for the HNS3 ethernet driver.
[patch 1/3] fixes a TX queue not restarted problem.
[patch 2/3] fixes a use-after-free issue.
[patch 3/3] fixes a VF ID issue for setting VF VLAN.
change log:
V1->V2: keeps 'ring' as parameter in hns3_nic_maybe_stop_tx()
in [patch 1/3], suggestted by David.
rewrites [patch 2/3]'s commit log to make it be easier
to understand, suggestted by David.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, when set VF VLAN with command "ip link set <pf name>
vf <vf id> vlan <vlan id>", the VF ID 0 is handled as PF incorrectly,
which should be the first VF. This patch fixes it.
Fixes: 21e043cd81 ("net: hns3: fix set port based VLAN for PF")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, hns3_nic_maybe_stop_tx() uses skb_copy() to linearize a
SKB if the BD num required by the SKB does not meet the hardware
limitation, and it linearizes the SKB by allocating a new linearized SKB
and freeing the old SKB, if hns3_nic_maybe_stop_tx() returns -EBUSY
because there are no enough space in the ring to send the linearized
skb to hardware, the sch_direct_xmit() still hold reference to old SKB
and try to retransmit the old SKB when dev_hard_start_xmit() return
TX_BUSY, which may cause use after freed problem.
This patch fixes it by using __skb_linearize() to linearize the
SKB in hns3_nic_maybe_stop_tx().
Fixes: 51e8439f34 ("net: hns3: add 8 BD limit for tx flow")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is timing window between ring_space checking and
netif_stop_subqueue when transmiting a SKB, and the TX BD
cleaning may be executed during the time window, which may
caused TX queue not restarted problem.
This patch fixes it by rechecking the ring_space after
netif_stop_subqueue to make sure TX queue is restarted.
Also, the ring->next_to_clean is updated even when pkts is
zero, because all the TX BD cleaned may be non-SKB, so it
needs to check if TX queue need to be restarted.
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace "select NET_SWITCHDEV" vs "depends on NET_SWITCHDEV" to fix Kconfig
warning with CONFIG_COMPILE_TEST=y
WARNING: unmet direct dependencies detected for NET_SWITCHDEV
Depends on [n]: NET [=y] && INET [=n]
Selected by [y]:
- TI_CPSW_SWITCHDEV [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_TI [=y] && (ARCH_DAVINCI || ARCH_OMAP2PLUS || COMPILE_TEST [=y])
because TI_CPSW_SWITCHDEV blindly selects NET_SWITCHDEV even though
INET is not set/enabled, while NET_SWITCHDEV depends on INET.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: ed3525eda4 ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: David S. Miller <davem@davemloft.net>
In cited commit, when prio tag mode is enabled, FTE creation fails
due to missing group with valid match criteria.
Hence,
(a) create prio tag group metadata_prio_tag_grp when prio tag is
enabled with match criteria for vlan push FTE.
(b) Rename metadata_grp to metadata_allmatch_grp to reflect its purpose.
Also when priority tag is enabled, delete metadata settings after
deleting ingress rules, which are using it.
Tide up rest of the ingress config code for unnecessary labels.
Fixes: 10652f3994 ("net/mlx5: Refactor ingress acl configuration")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When setting speed to 100G via ethtool (AN is set to off), only 25G*4 is
configured while the user, who has an advanced HW which supports
extended PTYS, expects also 50G*2 to be configured.
With this patch, when extended PTYS mode is available, configure
PTYS via extended fields.
Fixes: 4b95840a6c ("net/mlx5e: Fix matching of speed to PRM link modes")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add a missing value in translation of PTYS ext_eth_proto_oper to its
corresponding speed. When ext_eth_proto_oper bit 10 is set, ethtool
shows unknown speed. With this fix, ethtool shows speed is 100G as
expected.
Fixes: a08b4ed137 ("net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
It could be neigh update flow took a refcount on peer flow so
sometimes we cannot release peer flow even if parent flow is
being freed now.
Fixes: 5a7e5bcb66 ("net/mlx5e: Extend tc flow struct with reference counter")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Flows are allocated with kzalloc() so free with kfree().
Fixes: 04de7dda73 ("net/mlx5e: Infrastructure for duplicated offloading of TC flows")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
SFF 8472 eeprom length is 512 bytes. Fix module info return value to
support 512 bytes read.
Fixes: ace329f4ab ("net/mlx5e: ethtool, Remove unsupported SFP EEPROM high pages query")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When the user changes prio2buffer mapping while global pause is
enabled, mlx5 driver incorrectly sets all active buffers
(buffer that has at least one priority mapped) to lossy.
Solution:
If global pause is enabled, set all the active buffers to lossless
in prio2buffer command.
Also, add error message when buffer size is not enough to meet
xoff threshold.
Fixes: 0696d60853 ("net/mlx5e: Receive buffer configuration")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>