Pull networking fixes from David Miller:
1) Fix deadlock in bpf_send_signal() from Yonghong Song.
2) Fix off by one in kTLS offload of mlx5, from Tariq Toukan.
3) Add missing locking in iwlwifi mvm code, from Avraham Stern.
4) Fix MSG_WAITALL handling in rxrpc, from David Howells.
5) Need to hold RTNL mutex in tcindex_partial_destroy_work(), from Cong
Wang.
6) Fix producer race condition in AF_PACKET, from Willem de Bruijn.
7) cls_route removes the wrong filter during change operations, from
Cong Wang.
8) Reject unrecognized request flags in ethtool netlink code, from
Michal Kubecek.
9) Need to keep MAC in reset until PHY is up in bcmgenet driver, from
Doug Berger.
10) Don't leak ct zone template in act_ct during replace, from Paul
Blakey.
11) Fix flushing of offloaded netfilter flowtable flows, also from Paul
Blakey.
12) Fix throughput drop during tx backpressure in cxgb4, from Rahul
Lakkireddy.
13) Don't let a non-NULL skb->dev leave the TCP stack, from Eric
Dumazet.
14) TCP_QUEUE_SEQ socket option has to update tp->copied_seq as well,
also from Eric Dumazet.
15) Restrict macsec to ethernet devices, from Willem de Bruijn.
16) Fix reference leak in some ethtool *_SET handlers, from Michal
Kubecek.
17) Fix accidental disabling of MSI for some r8169 chips, from Heiner
Kallweit.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits)
net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build
net: ena: Add PCI shutdown handler to allow safe kexec
selftests/net/forwarding: define libs as TEST_PROGS_EXTENDED
selftests/net: add missing tests to Makefile
r8169: re-enable MSI on RTL8168c
net: phy: mdio-bcm-unimac: Fix clock handling
cxgb4/ptp: pass the sign of offset delta in FW CMD
net: dsa: tag_8021q: replace dsa_8021q_remove_header with __skb_vlan_pop
net: cbs: Fix software cbs to consider packet sending time
net/mlx5e: Do not recover from a non-fatal syndrome
net/mlx5e: Fix ICOSQ recovery flow with Striding RQ
net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset
net/mlx5e: Enhance ICOSQ WQE info fields
net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failure
selftests: netfilter: add nfqueue test case
netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress
netfilter: nft_fwd_netdev: validate family and chain type
netfilter: nft_set_rbtree: Detect partial overlaps on insertion
netfilter: nft_set_rbtree: Introduce and use nft_rbtree_interval_start()
netfilter: nft_set_pipapo: Separate partial and complete overlap cases on insertion
...
- One core quirk by myself to fix the .irq_disable()
semantics when the gpiolib core takes over this callback.
- The rest is an elaborate series of 4 patches fixing Intel
laptop ACPI wakeup quirks.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAl57qSIACgkQQRCzN7AZ
XXMmqxAAiggbsjiMkufxHxDqKl0tLRkbQwlsnz5VQlH0XCPBYSRidbIfQdiDFTbD
yoBENIYeW/9aVlb0Q7a2T5H4zy1Ye3MiqObZt97ECr7DyxwNrAbPS/IgOFmsjSmu
C4HZgX6/fQpBlJnKVgMBwzdgKShX4Fb4l1gFnY7uRRjBGVkp5ZArqJALJ747zFR0
yj/kpA+hGgO24AJspF+QaLFfKkjMXKv2zgaa3HEtsYqUBH/3yBZigLKJs3koXHYB
3c7EbWxnpYLiPi/T51aN7ZbbNX7Ag6wW6rLm/6460MCYDRSOB+7fCJCZc2KZJomr
4ntq1HHxSrjOiqce+MSvc+WLI6pqjN73tyWJwhFgBfwr32Ara5FR6CO0KJT9M5d+
0i8Na0Dyp2LlZM530ygOOn48p+tiU6NVoR/BXUnXBBuYtFsOmfTtrQGh++Uw+rcJ
wraFAsSzzr7JmXzh+tUEeGGdI4EYy1T6+DppEfSbyty9W6YnWvqHP9xcyuicUyi2
g9Btl0hRhSqQZozDYjROPQox1qjUKHLnI66n/W9pVUN61GlFlSH/38yRAEpD0DZW
GlzSEkfL5SN5DQjaAOC2z0z+kfFSGdtCYMOKkyl+cKGbLCRc43PrzHB3oH+wF1Mu
0J+Hilrk9y4sGTX67jIQPV/t1qhoorRFzihuUkHBdGt//XCqtI4=
=+dEG
-----END PGP SIGNATURE-----
Merge tag 'gpio-v5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
- One core quirk by myself to fix the .irq_disable() semantics when the
gpiolib core takes over this callback.
- The rest is an elaborate series of four patches fixing Intel laptop
ACPI wakeup quirks.
* tag 'gpio-v5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 CHT + AXP288 model
gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 BYT + AXP288 model
gpiolib: acpi: Rework honor_wakeup option into an ignore_wake option
gpiolib: acpi: Correct comment for HP x2 10 honor_wakeup quirk
gpiolib: Fix irq_disable() semantics
Fourth, and last, set of fixes for v5.6. Just two important fixes to
iwlwifi regressions.
iwlwifi
* fix GEO_TX_POWER_LIMIT command on certain devices which caused
firmware to crash during initialisation
* add back device ids for three devices which were accidentally
removed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJee7S6AAoJEG4XJFUm622bcH4IAICV95Y79p9loKZVwXBrguiK
5D9rBNZcSc0Yo9AwZkhatRAJYiLu19hw2qra3YsQnaWpjESmnZtV6/ZASDcpOGSP
NaI4TMLt57dhnpqhpglvGeM9PlUbaoqX7Sl5LbhnSQoKkk7rppnk90KqbXl7SDba
4eG7+i4iMc69uw4BttVh/lfgAH0YyYpgStWi9ccoQ2ip9SrljFZwmCF7q+3/25OR
Hyty44U90BBrsi+LrRuhutR2LuR+TfXEWmtXWRzOpnwmicY8sCpeEVCMeeVy59TR
y66sjlhM5w+N1mZbzBrqUZJLdMGInljx4G4DIvvi8jbD6rVLZc8zAeZMLjO24ZM=
=CFbE
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-2020-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for v5.6
Fourth, and last, set of fixes for v5.6. Just two important fixes to
iwlwifi regressions.
iwlwifi
* fix GEO_TX_POWER_LIMIT command on certain devices which caused
firmware to crash during initialisation
* add back device ids for three devices which were accidentally
removed
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
net/netfilter/nft_fwd_netdev.c: In function ‘nft_fwd_netdev_eval’:
net/netfilter/nft_fwd_netdev.c:32:10: error: ‘struct sk_buff’ has no member named ‘tc_redirected’
pkt->skb->tc_redirected = 1;
^~
net/netfilter/nft_fwd_netdev.c:33:10: error: ‘struct sk_buff’ has no member named ‘tc_from_ingress’
pkt->skb->tc_from_ingress = 1;
^~
To avoid a direct dependency with tc actions from netfilter, wrap the
redirect bits around CONFIG_NET_REDIRECT and move helpers to
include/linux/skbuff.h. Turn on this toggle from the ifb driver, the
only existing client of these bits in the tree.
This patch adds skb_set_redirected() that sets on the redirected bit
on the skbuff, it specifies if the packet was redirect from ingress
and resets the timestamp (timestamp reset was originally missing in the
netfilter bugfix).
Fixes: bcfabee1af ("netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress")
Reported-by: noreply@ellerman.id.au
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently ENA only provides the PCI remove() handler, used during rmmod
for example. This is not called on shutdown/kexec path; we are potentially
creating a failure scenario on kexec:
(a) Kexec is triggered, no shutdown() / remove() handler is called for ENA;
instead pci_device_shutdown() clears the master bit of the PCI device,
stopping all DMA transactions;
(b) Kexec reboot happens and the device gets enabled again, likely having
its FW with that DMA transaction buffered; then it may trigger the (now
invalid) memory operation in the new kernel, corrupting kernel memory area.
This patch aims to prevent this, by implementing a shutdown() handler
quite similar to the remove() one - the difference being the handling
of the netdev, which is unregistered on remove(), but following the
convention observed in other drivers, it's only detached on shutdown().
This prevents an odd issue in AWS Nitro instances, in which after the 2nd
kexec the next one will fail with an initrd corruption, caused by a wild
DMA write to invalid kernel memory. The lspci output for the adapter
present in my instance is:
00:05.0 Ethernet controller [0200]: Amazon.com, Inc. Elastic Network
Adapter (ENA) [1d0f:ec20]
Suggested-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Acked-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The lib files should not be defined as TEST_PROGS, or we will run them
in run_kselftest.sh.
Also remove ethtool_lib.sh exec permission.
Fixes: 81573b18f2 ("selftests/net/forwarding: add Makefile to install tests")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Find some tests are missed in Makefile by running:
for file in $(ls *.sh); do grep -q $file Makefile || echo $file; done
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A single fix in this pull request to correctly handle the size of
read-only zone files (from me).
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCXnrJJAAKCRDdoc3SxdoY
dgZXAQDK88T4sdtFq1Fl1PuP+oyzHml+xgNo0djZQOdicnD6qQD8CgMGDFQQG4dv
Ral+67qEyvUABGt0Vkmy29wuN8El6wQ=
=+1D9
-----END PGP SIGNATURE-----
Merge tag 'zonefs-5.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs fix from Damien Le Moal:
"A single fix from me to correctly handle the size of read-only zone
files"
* tag 'zonefs-5.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonfs: Fix handling of read-only zones
The write pointer of zones in the read-only consition is defined as
invalid by the SCSI ZBC and ATA ZAC specifications. It is thus not
possible to determine the correct size of a read-only zone file on
mount. Fix this by handling read-only zones in the same manner as
offline zones by disabling all accesses to the zone (read and write)
and initializing the inode size of the read-only zone to 0).
For zones found to be in the read-only condition at runtime, only
disable write access to the zone and keep the size of the zone file to
its last updated value to allow the user to recover previously written
data.
Also fix zonefs documentation file to reflect this change.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) A new selftest for nf_queue, from Florian Westphal. This test
covers two recent fixes: 07f8e4d0fd ("tcp: also NULL skb->dev
when copy was needed") and b738a185be ("tcp: ensure skb->dev is
NULL before leaving TCP stack").
2) The fwd action breaks with ifb. For safety in next extensions,
make sure the fwd action only runs from ingress until it is extended
to be used from a different hook.
3) The pipapo set type now reports EEXIST in case of subrange overlaps.
Update the rbtree set to validate range overlaps, so far this
validation is only done only from userspace. From Stefano Brivio.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl56fu0ACgkQSD+KveBX
+j4g1Qf/edWPTCMGK4eb0jBPUvnxkoGPYj4cq0tfeUaY7Q4r95RzNjuW3gTotzGV
JZymoC2OoWQxUR2Ye0FkM1C/RQFIAHinEX/KFOMJ6PL+k4+micXeIGNfVo3aflO0
kaTcBdgZKqFS5hpRtWZc/DVRWckqJYtaAJEFliQbYGwmfiZNoNr0/ZeU+/DX2dHn
bQkRHQZ3Zq43P4FhVBSyrfmsxUI71k7GtCdJ5G4i80e8qCCARKZDx7q1FRC0k6fh
a84+7NxpFRkl+kT+se/bxcQaFht49YSJVauGMKK8Ae+pz0XEaNrYsFz9zQY/s7W6
4a62hzlHuVmUwteZfH+secZzCnOkUw==
=rO1d
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2020-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2020-03-24
This series introduces some fixes to mlx5 driver.
From Aya, Fixes to the RX error recovery flows
From Leon, Fix IB capability mask
Please pull and let me know if there is any problem.
For -stable v5.5
('net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failure')
For -stable v5.4
('net/mlx5e: Fix ICOSQ recovery flow with Striding RQ')
('net/mlx5e: Do not recover from a non-fatal syndrome')
('net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset')
('net/mlx5e: Enhance ICOSQ WQE info fields')
The above patch ('net/mlx5e: Enhance ICOSQ WQE info fields')
will fail to apply cleanly on v5.4 due to a trivial contextual conflict,
but it is an important fix, do I need to do something about it or just
assume Greg will know how to handle this ?
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The original change fixed an issue on RTL8168b by mimicking the vendor
driver behavior to disable MSI on chip versions before RTL8168d.
This however now caused an issue on a system with RTL8168c, see [0].
Therefore leave MSI disabled on RTL8168b, but re-enable it on RTL8168c.
[0] https://bugzilla.redhat.com/show_bug.cgi?id=1792839
Fixes: 003bd5b4a7 ("r8169: don't use MSI before RTL8168d")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The DT binding for this PHY describes an *optional* clock property.
Due to a bug in the error handling logic, we are actually ignoring this
clock *all* of the time so far.
Fix this by using devm_clk_get_optional() to handle this clock properly.
Fixes: b78ac6ecd1 ("net: phy: mdio-bcm-unimac: Allow configuring MDIO clock divider")
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb4_ptp_fineadjtime() doesn't pass the signedness of offset delta
in FW_PTP_CMD. Fix it by passing correct sign.
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Not only did this wheel did not need reinventing, but there is also
an issue with it: It doesn't remove the VLAN header in a way that
preserves the L2 payload checksum when that is being provided by the DSA
master hw. It should recalculate checksum both for the push, before
removing the header, and for the pull afterwards. But the current
implementation is quite dizzying, with pulls followed immediately
afterwards by pushes, the memmove is done before the push, etc. This
makes a DSA master with RX checksumming offload to print stack traces
with the infamous 'hw csum failure' message.
So remove the dsa_8021q_remove_header function and replace it with
something that actually works with inet checksumming.
Fixes: d461933638 ("net: dsa: tag_8021q: Create helper function for removing VLAN header")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the software CBS does not consider the packet sending time
when depleting the credits. It caused the throughput to be
Idleslope[kbps] * (Port transmit rate[kbps] / |Sendslope[kbps]|) where
Idleslope * (Port transmit rate / (Idleslope + |Sendslope|)) = Idleslope
is expected. In order to fix the issue above, this patch takes the time
when the packet sending completes into account by moving the anchor time
variable "last" ahead to the send completion time upon transmission and
adding wait when the next dequeue request comes before the send
completion time of the previous packet.
changelog:
V2->V3:
- remove unnecessary whitespace cleanup
- add the checks if port_rate is 0 before division
V1->V2:
- combine variable "send_completed" into "last"
- add the comment for estimate of the packet sending
Fixes: 585d763af0 ("net/sched: Introduce Credit Based Shaper (CBS) qdisc")
Signed-off-by: Zh-yuan Ye <ye.zh-yuan@socionext.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For non-fatal syndromes like LOCAL_LENGTH_ERR, recovery shouldn't be
triggered. In these scenarios, the RQ is not actually in ERR state.
This misleads the recovery flow which assumes that the RQ is really in
error state and no more completions arrive, causing crashes on bad page
state.
Fixes: 8276ea1353 ("net/mlx5e: Report and recover from CQE with error on RQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In striding RQ mode, the buffers of an RX WQE are first
prepared and posted to the HW using a UMR WQEs via the ICOSQ.
We maintain the state of these in-progress WQEs in the RQ
SW struct.
In the flow of ICOSQ recovery, the corresponding RQ is not
in error state, hence:
- The buffers of the in-progress WQEs must be released
and the RQ metadata should reflect it.
- Existing RX WQEs in the RQ should not be affected.
For this, wrap the dealloc of the in-progress WQEs in
a function, and use it in the ICOSQ recovery flow
instead of mlx5e_free_rx_descs().
Fixes: be5323c837 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When resetting the RQ (moving RQ state from RST to RDY), the driver
resets the WQ's SW metadata.
In striding RQ mode, we maintain a field that reflects the actual
expected WQ head (including in progress WQEs posted to the ICOSQ).
It was mistakenly not reset together with the WQ. Fix this here.
Fixes: 8276ea1353 ("net/mlx5e: Report and recover from CQE with error on RQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add number of WQEBBs (WQE's Basic Block) to WQE info struct. Set the
number of WQEBBs on WQE post, and increment the consumer counter (cc)
on completion.
In case of error completions, the cc was mistakenly not incremented,
keeping a gap between cc and pc (producer counter). This failed the
recovery flow on the ICOSQ from a CQE error which timed-out waiting for
the cc and pc to meet.
Fixes: be5323c837 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The cap_mask1 isn't protected by field_select and not listed among RW
fields, but it is required to be written to properly initialize ports
in IB virtualization mode.
Link: https://lore.kernel.org/linux-rdma/88bab94d2fd72f3145835b4518bc63dda587add6.camel@redhat.com
Fixes: ab118da4c1 ("net/mlx5: Don't write read-only fields in MODIFY_HCA_VPORT_CONTEXT command")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add a test case to check nf queue infrastructure.
Could be extended in the future to also cover serialization of
conntrack, uid and secctx attributes in nfqueue.
For now, this checks that 'queue bypass' works, that a queue rule with
no bypass option blocks traffic and that userspace receives the expected
number of packets.
For this we add two queues and hook all of
prerouting/input/forward/output/postrouting.
Packets get queued twice with a dummy base chain in between:
This passes with current nf tree, but reverting
commit 946c0d8e6e ("netfilter: nf_queue: fix reinject verdict handling")
makes this trip (it processes 30 instead of expected 20 packets).
v2: update config file with queue and other options missing/needed for
other tests.
v3: also test with tcp, this reveals problem with commit
28f8bfd1ac ("netfilter: Support iif matches in POSTROUTING"), due to
skb->dev pointing at another skb in the retransmit rbtree (skb->dev
aliases to rbnode child).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Set skb->tc_redirected to 1, otherwise the ifb driver drops the packet.
Set skb->tc_from_ingress to 1 to reinject the packet back to the ingress
path after leaving the ifb egress path.
This patch inconditionally sets on these two skb fields that are
meaningful to the ifb driver. The existing forward action is guaranteed
to run from ingress path.
Fixes: 39e6dea28a ("netfilter: nf_tables: add forward expression to the netdev family")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Make sure the forward action is only used from ingress.
Fixes: 39e6dea28a ("netfilter: nf_tables: add forward expression to the netdev family")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
...and return -ENOTEMPTY to the front-end in this case, instead of
proceeding. Currently, nft takes care of checking for these cases
and not sending them to the kernel, but if we drop the set_overlap()
call in nft we can end up in situations like:
# nft add table t
# nft add set t s '{ type inet_service ; flags interval ; }'
# nft add element t s '{ 1 - 5 }'
# nft add element t s '{ 6 - 10 }'
# nft add element t s '{ 4 - 7 }'
# nft list set t s
table ip t {
set s {
type inet_service
flags interval
elements = { 1-3, 4-5, 6-7 }
}
}
This change has the primary purpose of making the behaviour
consistent with nft_set_pipapo, but is also functional to avoid
inconsistent behaviour if userspace sends overlapping elements for
any reason.
v2: When we meet the same key data in the tree, as start element while
inserting an end element, or as end element while inserting a start
element, actually check that the existing element is active, before
resetting the overlap flag (Pablo Neira Ayuso)
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Replace negations of nft_rbtree_interval_end() with a new helper,
nft_rbtree_interval_start(), wherever this helps to visualise the
problem at hand, that is, for all the occurrences except for the
comparison against given flags in __nft_rbtree_get().
This gets especially useful in the next patch.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
...and return -ENOTEMPTY to the front-end on collision, -EEXIST if
an identical element already exists. Together with the previous patch,
element collision will now be returned to the user as -EEXIST.
Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently, the -EEXIST return code of ->insert() callbacks is ambiguous: it
might indicate that a given element (including intervals) already exists as
such, or that the new element would clash with existing ones.
If identical elements already exist, the front-end is ignoring this without
returning error, in case NLM_F_EXCL is not set. However, if the new element
can't be inserted due an overlap, we should report this to the user.
To this purpose, allow set back-ends to return -ENOTEMPTY on collision with
existing elements, translate that to -EEXIST, and return that to userspace,
no matter if NLM_F_EXCL was set.
Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull perf tooling fixes from Ingo Molnar:
"A handful of tooling fixes all across the map, no kernel changes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools headers uapi: Update linux/in.h copy
perf probe: Do not depend on dwfl_module_addrsym()
perf probe: Fix to delete multiple probe event
perf parse-events: Fix reading of invalid memory in event parsing
perf python: Fix clang detection when using CC=clang-version
perf map: Fix off by one in strncpy() size argument
tools: Let O= makes handle a relative path with -C option
Pull x86 fix from Ingo Molnar:
"A build fix with certain Kconfig combinations"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ioremap: Fix CONFIG_EFI=n build
Late fixes in dmaengine for v5.6:
- move .device_release missing log warning to debug
- couple of maintainer entries for HiSilicon and IADX drivers
- one off fix for idxd driver
- documentation warn fixes
- TI k3 dma error handling fix
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAl55mjUACgkQfBQHDyUj
g0dnIA//eHgGyhvuLWkKunLH2+x5YLYL0ZzWdldW6dbsHaxB59/j1aO325KWXiXB
c2GKGxqRzoNc9i4V5k6cDWjml34z9x44HGUhySyOqE3MkNzM4STjtclePE/DDWaV
QU2zxd7cg3QkP0q3WFaJtw7ffObwJyJqL2GbXcLbfEw731XyjsV3qZvrcHcHkiro
X8taSqlVhZEBc6aGQRNQijWYVH/a/SK2kqo79zv1r24EEkvId3f2k0/ZsHT9r/tD
M2+guUvPEWuL7hUUuhul++7tauvi0Klvil7Ye6HRaUWDjJ1UBW5bnQXzVJMzKoRv
n8BJXet6owIWucHJijqifRDkKJDAg6XIT97ado/jpoH11xtmRqFF+85uPOF2MQSR
Ko2CtsHZjk02XmrVBpqesW8vN2iWlpeaG5lKtDyvwMpOR1b/iTs29LEIZkHtpm1z
kA5/w4MEZF9jP1up7dTIs7rQJsArFhfh6hKUahWu9FdaHg8VufmtiRUW71NTfY9z
pM5PNL7+2dq6BBVnxobrSN1GybR4NEie37xKZnF5JPKG6/qUl9WCciRxcfJjNxFz
qip4Bm39elXTzFZ7S/U7qJyB+/vTbKMldDj8YmUa57jev+8pfjp9Pfqja/V2SIHx
IiLG9ugpQFxXRnGOdo4LDjh6ute9sZvTdSrpKUT4J+7E9qFNuAk=
=QaCh
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-fix-5.6' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
"Late fixes in dmaengine for v5.6:
- move .device_release missing log warning to debug
- couple of maintainer entries for HiSilicon and IADX drivers
- off-by-one fix for idxd driver
- documentation warning fixes
- TI k3 dma error handling fix"
* tag 'dmaengine-fix-5.6' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: ti: k3-udma-glue: Fix an error handling path in 'k3_udma_glue_cfg_rx_flow()'
MAINTAINERS: Add maintainer for HiSilicon DMA engine driver
dmaengine: idxd: fix off by one on cdev dwq refcount
MAINTAINERS: rectify the INTEL IADX DRIVER entry
dmaengine: move .device_release missing log warning to debug level
docs: dmaengine: provider.rst: get rid of some warnings
There are at least 3 models of the HP x2 10 models:
Bay Trail SoC + AXP288 PMIC
Cherry Trail SoC + AXP288 PMIC
Cherry Trail SoC + TI PMIC
Like on the other HP x2 10 models we need to ignore wakeup for ACPI GPIO
events on the external embedded-controller pin to avoid spurious wakeups
on the HP x2 10 CHT + AXP288 model too.
This commit adds an extra DMI based quirk for the HP x2 10 CHT + AXP288
model, ignoring wakeups for ACPI GPIO events on the EC interrupt pin
on this model. This fixes spurious wakeups from suspend on this model.
Fixes: aa23ca3d98 ("gpiolib: acpi: Add honor_wakeup module-option + quirk mechanism")
Reported-and-tested-by: Marc Lehmann <schmorp@schmorp.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200302111225.6641-4-hdegoede@redhat.com
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Add missing Makefile for net/forwarding tests and include it to
the targets list, otherwise forwarding tests are not installed
in case of cross-compilation.
Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew noticed that some handlers for *_SET commands leak a netdev
reference if required ethtool_ops callbacks do not exist. A simple
reproducer would be e.g.
ip link add veth1 type veth peer name veth2
ethtool -s veth1 wol g
ip link del veth1
Make sure dev_put() is called when ethtool_ops check fails.
v2: add Fixes tags
Fixes: a53f3d41e4 ("ethtool: set link settings with LINKINFO_SET request")
Fixes: bfbcfe2032 ("ethtool: set link modes related data with LINKMODES_SET request")
Fixes: e54d04e3af ("ethtool: set message mask with DEBUG_SET request")
Fixes: 8d425b19b3 ("ethtool: set wake-on-lan settings with WOL_SET request")
Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When both the switch and the bridge are learning about new addresses,
switch ports attached to the bridge would see duplicate ARP frames
because both entities would attempt to send them.
Fixes: 5037d532b8 ("net: dsa: add Broadcom tag RX/TX handler")
Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt_en: Bug fixes.
5 bug fix patches covering an indexing bug for priority counters, memory
leak when retrieving DCB ETS settings, error path return code, proper
disabling of PCI before freeing context memory, and proper ring accounting
in error path.
Please also apply these to -stable. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If ring counts are not reset when ring reservation fails,
bnxt_init_dflt_ring_mode() will not be called again to reinitialise
IRQs when open() is called and results in system crash as napi will
also be not initialised. This patch fixes it by resetting the ring
counts.
Fixes: 47558acd56 ("bnxt_en: Reserve rings at driver open if none was reserved at probe time.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Other shutdown code paths will always disable PCI first to shutdown DMA
before freeing context memory. Do the same sequence in the error path
of probe to be safe and consistent.
Fixes: c20dc142dd ("bnxt_en: Disable bus master during PCI shutdown and driver unload.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code ignores the return value from
bnxt_hwrm_func_backing_store_cfg(), causing the driver to proceed in
the init path even when this vital firmware call has failed. Fix it
by propagating the error code to the caller.
Fixes: 1b9394e5a2 ("bnxt_en: Configure context memory on new devices.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The allocated ieee_ets structure goes out of scope without being freed,
leaking memory. Appropriate result codes should be returned so that
callers do not rely on invalid data passed by reference.
Also cache the ETS config retrieved from the device so that it doesn't
need to be freed. The balance of the code was clearly written with the
intent of having the results of querying the hardware cached in the
device structure. The commensurate store was evidently missed though.
Fixes: 7df4ae9fe8 ("bnxt_en: Implement DCBNL to support host-based DCBX.")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is an indexing bug in determining these ethtool priority
counters. Instead of using the queue ID to index, we need to
normalize by modulo 10 to get the index. This index is then used
to obtain the proper CoS queue counter. Rename bp->pri2cos to
bp->pri2cos_idx to make this more clear.
Fixes: e37fed7903 ("bnxt_en: Add ethtool -S priority counters.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only attach macsec to ethernet devices.
Syzbot was able to trigger a KMSAN warning in macsec_handle_frame
by attaching to a phonet device.
Macvlan has a similar check in macvlan_port_create.
v1->v2
- fix commit message typo
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unlike NL_SET_ERR_* macros, nl_set_extack_cookie_u64() and
nl_set_extack_cookie_u32() helpers do not check extack argument for null
and neither do their callers, as syzbot recently discovered for
ethnl_parse_header().
Instead of fixing the callers and leaving the trap in place, add check of
null extack to both helpers to make them consistent with NL_SET_ERR_*
macros.
v2: drop incorrect second Fixes tag
Fixes: 2363d73a2f ("ethtool: reject unrecognized request flags")
Reported-by: syzbot+258a9089477493cea67b@syzkaller.appspotmail.com
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
The nci_conn_max_data_pkt_payload_size() function sometimes returns
-EPROTO so "max_size" needs to be signed for the error handling to
work. We can make "payload_size" an int as well.
Fixes: a06347c04c ("NFC: Add Intel Fields Peak NFC solution driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a place,
inet_dump_fib()
fib_table_dump
fn_trie_dump_leaf()
hlist_for_each_entry_rcu()
without rcu_read_lock() will trigger a warning,
WARNING: suspicious RCU usage
-----------------------------
net/ipv4/fib_trie.c:2216 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by ip/1923:
#0: ffffffff8ce76e40 (rtnl_mutex){+.+.}, at: netlink_dump+0xd6/0x840
Call Trace:
dump_stack+0xa1/0xea
lockdep_rcu_suspicious+0x103/0x10d
fn_trie_dump_leaf+0x581/0x590
fib_table_dump+0x15f/0x220
inet_dump_fib+0x4ad/0x5d0
netlink_dump+0x350/0x840
__netlink_dump_start+0x315/0x3e0
rtnetlink_rcv_msg+0x4d1/0x720
netlink_rcv_skb+0xf0/0x220
rtnetlink_rcv+0x15/0x20
netlink_unicast+0x306/0x460
netlink_sendmsg+0x44b/0x770
__sys_sendto+0x259/0x270
__x64_sys_sendto+0x80/0xa0
do_syscall_64+0x69/0xf4
entry_SYSCALL_64_after_hwframe+0x49/0xb3
Fixes: 18a8021a7b ("net/ipv4: Plumb support for filtering route dumps")
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl5hh6wACgkQSD+KveBX
+j6qvQf9HQsiQ+cE1UIbM/IzyTWXeBMzjljCWFgQfvyKQjSFnoATeVl6GMQJNk7M
ovQ7XHOlN36E/tQW/ypnwbX+btjl/mDEJsxEcvVf4gnw/QH1AwUjo291vPfrE5md
DcWWe9Jrq2MkHeZAgIt/Tnw4GYIwQOBVmYgky3A0+azzuvxK+nXX6JJG5JqRR8Wd
tBsN9UKok1nOt73d+TyHjGnzQHWzGBdS0vlxl0MYcD1QD66UzA0Atgz5aUQLGvTK
Nk/IcHr3SF+0kvbtpjRxlrpi4ywD2gBNHXMJX1DDnCkqg9nWjJ5DByYDASo+I6uL
0fMNsimp/gZG8HD0oFEVTFgaqPACUA==
=DEqe
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2020-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2020-03-05
This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
For -stable v5.4
('net/mlx5: DR, Fix postsend actions write length')
For -stable v5.5
('net/mlx5e: kTLS, Fix TCP seq off-by-1 issue in TX resync flow')
('net/mlx5e: Fix endianness handling in pedit mask')
====================
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull crypto fix from Herbert Xu:
"This fixes a correctness bug in the ARM64 version of ChaCha for
lib/crypto used by WireGuard"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: arm64/chacha - correctly walk through blocks
When application uses TCP_QUEUE_SEQ socket option to
change tp->rcv_next, we must also update tp->copied_seq.
Otherwise, stuff relying on tcp_inq() being precise can
eventually be confused.
For example, tcp_zerocopy_receive() might crash because
it does not expect tcp_recv_skb() to return NULL.
We could add tests in various places to fix the issue,
or simply make sure tcp_inq() wont return a random value,
and leave fast path as it is.
Note that this fixes ioctl(fd, SIOCINQ, &val) at the same
time.
Fixes: ee9952831c ("tcp: Initial repair mode")
Fixes: 05255b823a ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>