The kernel may sleep while holding a spinlock.
The function call path (from bottom to top) in Linux 4.19 is:
net/nfc/nci/uart.c, 349:
nci_skb_alloc in nci_uart_default_recv_buf
net/nfc/nci/uart.c, 255:
(FUNC_PTR)nci_uart_default_recv_buf in nci_uart_tty_receive
net/nfc/nci/uart.c, 254:
spin_lock in nci_uart_tty_receive
nci_skb_alloc(GFP_KERNEL) can sleep at runtime.
(FUNC_PTR) means a function pointer is called.
To fix this bug, GFP_KERNEL is replaced with GFP_ATOMIC for
nci_skb_alloc().
This bug was found by STCheck, a static analysis tool written by myself.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dev_hold() must always be called in rx_queue_add_kobject().
Otherwise the usage count drops below 0 if kobject_init_and_add()
fails.
Fixes: b8eb718348 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject")
Reported-by: syzbot <syzbot+30209ea299c09d8785c9@syzkaller.appspotmail.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: David Miller <davem@davemloft.net>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As flower rules are added, they are given a stats ID based on the number
of rules that can be supported in firmware. Only after the initial
allocation of all available IDs does the driver begin to reuse those that
have been released.
The initial allocation of IDs was modified to account for multiple memory
units on the offloaded device. However, this introduced a bug whereby the
counter that controls the IDs could be decremented before the ID was
assigned (where it is further decremented). This means that the stats ID
could be assigned as -1/0xffffffff, which is out of range.
Fix this by only decrementing the main counter after the current ID has
been assigned.
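
The ordering described above can be sketched in user space as follows. This
is only an illustration under assumptions: the struct, its field names and
get_initial_stats_id() are hypothetical and do not reproduce the driver code.

```c
#include <assert.h>

/* Hypothetical model of the initial ID pool; names are illustrative only. */
struct stats_ids {
	unsigned int init_unalloc;    /* stat indices not yet fully handed out */
	unsigned int active_mem_unit; /* memory unit the next ID will use */
	unsigned int total_mem_units; /* number of memory units on the device */
};

/* Returns 0 and fills *stat and *mem_unit, or -1 once the pool is used up. */
static int get_initial_stats_id(struct stats_ids *ids,
				unsigned int *stat, unsigned int *mem_unit)
{
	assert(ids->total_mem_units > 0);

	if (ids->init_unalloc == 0)
		return -1;

	/* Assign the ID first, while init_unalloc is still known to be > 0. */
	*stat = ids->init_unalloc - 1;
	*mem_unit = ids->active_mem_unit;

	/* Only then advance; decrement once every memory unit used this index. */
	if (++ids->active_mem_unit == ids->total_mem_units) {
		ids->init_unalloc--;
		ids->active_mem_unit = 0;
	}
	return 0;
}
```

With two stat indices and two memory units, this hands out four IDs whose
stat index is always 0 or 1, and then reports that the pool is exhausted.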
Fixes: 467322e262 ("nfp: flower: support multiple memory units for filter offloads")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dsa_link_touch() is not exported or used outside of the
file it is defined in, so make it static to avoid the following warning:
net/dsa/dsa2.c:127:17: warning: symbol 'dsa_link_touch' was not declared. Should it be static?
Signed-off-by: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/atheros/ag71xx.c: In function 'ag71xx_probe':
drivers/net/ethernet/atheros/ag71xx.c:1776:30: warning: passing argument 2 of
'of_get_phy_mode' makes pointer from integer without a cast [-Wint-conversion]
In file included from drivers/net/ethernet/atheros/ag71xx.c:33:
./include/linux/of_net.h:15:69: note: expected 'phy_interface_t *'
{aka 'enum <anonymous> *'} but argument is of type 'int'
Fixes: 0c65b2b90d ("net: of_get_phy_mode: Change API to solve int/unit warnings")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix missing '*' kernel-doc notation that causes this warning:
../include/linux/netdevice.h:1779: warning: bad line: spinlock
Fixes: ab92d68fc2 ("net: core: add generic lockdep keys")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sk->sk_pacing_shift can be read and written without lock
synchronization. This patch adds annotations to
document this fact and avoid future syzbot complaints.
This might also avoid unexpected false sharing
in sk_pacing_shift_update(), as the compiler
could remove the conditional check and always
write over sk->sk_pacing_shift :
if (sk->sk_pacing_shift != val)
sk->sk_pacing_shift = val;
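
A minimal user-space sketch of the annotated form is shown here. The
READ_ONCE()/WRITE_ONCE() macros below are simplified stand-ins for the
kernel ones (they rely on the GCC/Clang __typeof__ extension), and
struct sock_sketch and pacing_shift_update() are hypothetical names.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel macros, for illustration only. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-in for the one field of struct sock that matters here. */
struct sock_sketch {
	unsigned char sk_pacing_shift;
};

static void pacing_shift_update(struct sock_sketch *sk, int val)
{
	/* The volatile read keeps the compiler from dropping the check. */
	if (READ_ONCE(sk->sk_pacing_shift) == val)
		return;
	/* Write only when the value really changes. */
	WRITE_ONCE(sk->sk_pacing_shift, (unsigned char)val);
}
```

Because the load is marked, the compiler has to keep the comparison, so an
unchanged value does not dirty the cache line.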
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ql_alloc_large_buffers() has the usual RX buffer allocation
loop where it allocates skbs and maps them for DMA. It also
treats failure as a fatal error.
There are (at least) three bugs in the error paths:
1. ql_free_large_buffers() assumes that the lrg_buf[] entry for the
first buffer that couldn't be allocated will have .skb == NULL.
But the qla_buf[] array is not zero-initialised.
2. ql_free_large_buffers() DMA-unmaps all skbs in lrg_buf[]. This is
incorrect for the last allocated skb, if DMA mapping failed.
3. Commit 1acb8f2a7a ("net: qlogic: Fix memory leak in
ql_alloc_large_buffers") added a direct call to dev_kfree_skb_any()
after the skb is recorded in lrg_buf[], so ql_free_large_buffers()
will double-free it.
The bugs are somewhat intertwined, so fix them all at once:
* Clear each entry in qla_buf[] before attempting to allocate
an skb for it. This goes half-way to fixing bug 1.
* Set the .skb field only after the skb is DMA-mapped. This
fixes the rest.
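
The two rules above can be modelled in user space roughly as follows. This
is a sketch under assumptions, not the driver code: the stub helpers, the
counters and the fail_at parameter are all hypothetical, and the free loop
is assumed to stop at the first entry whose .skb is NULL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_BUFS 4

struct buf_cb {
	void *skb;	/* non-NULL only once the buffer is allocated AND mapped */
};

static int live_skbs;	/* allocated but not yet freed */
static int live_maps;	/* mapped but not yet unmapped */

/* Hypothetical stand-ins for skb allocation and DMA mapping. */
static void *skb_alloc_stub(void)
{
	void *p = malloc(64);

	if (p)
		live_skbs++;
	return p;
}

static void skb_free_stub(void *skb)
{
	free(skb);
	live_skbs--;
}

static int dma_map_stub(int idx, int fail_at)
{
	if (idx == fail_at)
		return -1;	/* simulated mapping failure */
	live_maps++;
	return 0;
}

static void dma_unmap_stub(void)
{
	live_maps--;
}

static void free_large_buffers(struct buf_cb *bufs)
{
	for (int i = 0; i < NUM_BUFS; i++) {
		if (!bufs[i].skb)
			break;	/* first entry that was never fully set up */
		dma_unmap_stub();
		skb_free_stub(bufs[i].skb);
		bufs[i].skb = NULL;
	}
}

static int alloc_large_buffers(struct buf_cb *bufs, int fail_at)
{
	for (int i = 0; i < NUM_BUFS; i++) {
		void *skb;

		/* Clear the entry first, so stale contents never look valid. */
		memset(&bufs[i], 0, sizeof(bufs[i]));

		skb = skb_alloc_stub();
		if (!skb)
			goto fail;
		if (dma_map_stub(i, fail_at)) {
			/* Not recorded yet, so free it here exactly once. */
			skb_free_stub(skb);
			goto fail;
		}
		/* Record the skb only after mapping succeeded. */
		bufs[i].skb = skb;
	}
	return 0;
fail:
	free_large_buffers(bufs);
	return -1;
}
```

Even when the array starts out filled with garbage and mapping fails
part-way, every buffer is freed and unmapped exactly once in this model.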
Fixes: 1357bfcf71 ("qla3xxx: Dynamically size the rx buffer queue ...")
Fixes: 0f8ab89e82 ("qla3xxx: Check return code from pci_map_single() ...")
Fixes: 1acb8f2a7a ("net: qlogic: Fix memory leak in ql_alloc_large_buffers")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzbot reported a memory leak when an allocation fails within
genradix_prealloc() for output streams. That's because
genradix_prealloc() leaves initialized members initialized when the
issue happens and SCTP stack will abort the current initialization but
without cleaning up such members.
The fix here is to always call genradix_free() when genradix_prealloc()
fails, for output and also input streams, as it suffers from the same
issue.
Reported-by: syzbot+772d9e36c490b18d51d1@syzkaller.appspotmail.com
Fixes: 2075e50caf ("sctp: convert to genradix")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'wireless-drivers-2019-12-17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for v5.5
First set of fixes for v5.5. Fixing security issues, some regressions
and a few major bugs.
mwifiex
* security fix for handling country Information Elements (CVE-2019-14895)
* security fix for handling TDLS Information Elements
ath9k
* fix endian issue with ath9k_pci_owl_loader
mt76
* fix default mac address handling
iwlwifi
* fix merge damage which led to firmware crashing during boot on some devices
* fix device initialisation regression on some devices
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Upon reusing the ptp_qoriq driver, the ptp_qoriq_free() function was
used on the remove path to free any allocated resources.
The ptp_qoriq IRQ is among these resources that are freed in
ptp_qoriq_free() even though it is also a managed one (allocated using
devm_request_threaded_irq).
Drop the resource managed version of requesting the IRQ in order to not
trigger a double free of the interrupt as below:
[ 226.731005] Trying to free already-free IRQ 126
[ 226.735533] WARNING: CPU: 6 PID: 749 at kernel/irq/manage.c:1707
__free_irq+0x9c/0x2b8
[ 226.743435] Modules linked in:
[ 226.746480] CPU: 6 PID: 749 Comm: bash Tainted: G W
5.4.0-03629-gfd7102c32b2c-dirty #912
[ 226.755857] Hardware name: NXP Layerscape LX2160ARDB (DT)
[ 226.761244] pstate: 40000085 (nZcv daIf -PAN -UAO)
[ 226.766022] pc : __free_irq+0x9c/0x2b8
[ 226.769758] lr : __free_irq+0x9c/0x2b8
[ 226.773493] sp : ffff8000125039f0
(...)
[ 226.856275] Call trace:
[ 226.858710] __free_irq+0x9c/0x2b8
[ 226.862098] free_irq+0x30/0x70
[ 226.865229] devm_irq_release+0x14/0x20
[ 226.869054] release_nodes+0x1b0/0x220
[ 226.872790] devres_release_all+0x34/0x50
[ 226.876790] device_release_driver_internal+0x100/0x1c0
Fixes: d346c9e86d ("dpaa2-ptp: reuse ptp_qoriq driver")
Cc: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'mac80211-for-net-2019-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A handful of fixes:
* disable AQL on most drivers, addressing the iwlwifi issues
* fix double-free on network namespace changes
* fix TID field in frames injected through monitor interfaces
* fix ieee80211_calc_rx_airtime()
* fix NULL pointer dereference in rfkill (and remove BUG_ON)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Selecting MSCC_OCELOT_SWITCH is not possible when NET_VENDOR_MICROSEMI
is disabled:
WARNING: unmet direct dependencies detected for MSCC_OCELOT_SWITCH
Depends on [n]: NETDEVICES [=y] && ETHERNET [=n] && NET_VENDOR_MICROSEMI [=n] && NET_SWITCHDEV [=y] && HAS_IOMEM [=y]
Selected by [m]:
- NET_DSA_MSCC_FELIX [=m] && NETDEVICES [=y] && HAVE_NET_DSA [=y] && NET_DSA [=y] && PCI [=y]
Add a Kconfig dependency on NET_VENDOR_MICROSEMI, which also implies
CONFIG_NETDEVICES.
Depending on a vendor config violates menuconfig locality for the DSA
driver, but is the smallest compromise since all other solutions are
much more complicated (see [0]).
[0] https://www.spinics.net/lists/netdev/msg618808.html
Fixes: 5605194877 ("net: dsa: ocelot: add driver for Felix switch family")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the implementation of gmac_setup_txqs() the allocated desc_ring is
leaked if TX queue base is not aligned. Release it via
dma_free_coherent.
Fixes: 4d5ae32f5e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There were several issues with 53568438e3 ("net: dsa: b53: Add support for port_egress_floods callback") that resulted in breaking connectivity for standalone ports:
- both user and CPU ports must allow unicast and multicast forwarding by
default otherwise this just flat out breaks connectivity for
standalone DSA ports
- IP multicast is treated similarly as multicast, but has separate
control registers
- the UC, MC and IPMC lookup failure register offsets were wrong, and
instead used bit values that are meaningful for the
B53_IP_MULTICAST_CTRL register
Fixes: 53568438e3 ("net: dsa: b53: Add support for port_egress_floods callback")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella says:
====================
vsock/virtio: fix null-pointer dereference and related precautions
This series mainly solves a possible null-pointer dereference in
virtio_transport_recv_listen() introduced with the multi-transport
support [PATCH 1].
PATCH 2 adds a WARN_ON check for the same potential issue
and a returned error in the virtio_transport_send_pkt_info() function
to avoid crashing the kernel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
virtio_transport_get_ops() and virtio_transport_send_pkt_info()
can only be used on connecting/connected sockets, since a socket
assigned to a transport is required.
This patch adds a WARN_ON() in virtio_transport_get_ops() to check
this requirement, plus a comment and a returned error in
virtio_transport_send_pkt_info().
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In rfkill_register, the struct rfkill pointer is first dereferenced
and then checked for NULL. This patch removes the BUG_ON and returns
an error to the caller in case rfkill is NULL.
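
The check-before-dereference pattern can be sketched as below. The struct
and function are hypothetical user-space stand-ins, and the specific error
codes are assumptions chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct rfkill. */
struct rfkill_sketch {
	int registered;
};

static int rfkill_register_sketch(struct rfkill_sketch *rfkill)
{
	/* Validate the pointer before the first dereference. */
	if (!rfkill)
		return -EINVAL;

	if (rfkill->registered)
		return -EALREADY;

	rfkill->registered = 1;
	return 0;
}
```

A NULL argument now yields an error for the caller to handle instead of
bringing the system down.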
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Link: https://lore.kernel.org/r/20191215153409.21696-1-pakki001@umn.edu
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In function xenvif_disconnect_queue(), the value of queue->rx_irq is
zeroed *before* queue->task is stopped. Unfortunately that task may call
notify_remote_via_irq(queue->rx_irq) and calling that function with a
zero value results in a NULL pointer dereference in evtchn_from_irq().
This patch simply re-orders things, stopping all tasks before zero-ing the
irq values, thereby avoiding the possibility of the race.
Fixes: 2ac061ce97 ("xen/netback: cleanup init and deinit code")
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Display the return code as decimal integer.
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Cristian Birsan <cristian.birsan@microchip.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
FASTOPEN setsockopt() or sendmsg() may switch the SMC socket to fallback
mode. Once fallback mode is active, the native TCP socket functions are
called. Nevertheless there is a small race window, when FASTOPEN
setsockopt/sendmsg runs in parallel to a connect(), and switches the
socket into fallback mode before connect() takes the sock lock.
Make sure the SMC-specific connect setup is omitted in this case.
This way a syzbot-reported refcount problem is fixed, triggered by
different threads running non-blocking connect() and FASTOPEN_KEY
setsockopt.
Reported-by: syzbot+96d3f9ff6a86d37e44c8@syzkaller.appspotmail.com
Fixes: 6d6dd528d5 ("net/smc: fix refcount non-blocking connect() -part 2")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
A mismerge between the following two commits:
c678726305 ("net: phylink: ensure consistent phy interface mode")
27755ff88c ("net: phylink: Add phylink_mac_link_{up, down} wrapper functions")
resulted in the wrong interface being passed to the mac_link_up()
function. Fix this up.
Fixes: b4b12b0d2f ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
This test only works when [1] is applied, which was rejected.
Basically, the errors are reported and cleared. In this particular case of
tls sockets, following reads will block.
The test case was originally submitted with the rejected patch, but, then,
was included as part of a different patchset, possibly by mistake.
[1] https://lore.kernel.org/netdev/20191007035323.4360-2-jakub.kicinski@netronome.com/#t
Thanks Paolo Pisati for pointing out the original patchset where this
appeared.
Fixes: 65190f7742 ("selftests/tls: add a test for fragmented messages")
Reported-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Taehee Yoo says:
====================
gtp: fix several bugs in gtp module
This patchset fixes several bugs in the GTP module.
1. Do not allow adding duplicate TID and ms_addr pdp context.
In the current code, a pdp context with a duplicate TID or ms_addr could be added.
So, the RX and TX paths could fail to find the correct pdp context.
2. Fix wrong condition in ->dumpit() callback.
->dumpit() callback is re-called if the dump packet size is too big.
So, before returning, it saves the last position and then restarts from
the last dump position.
The TID value is used to find the last dump position.
GTP module allows adding zero TID value. But ->dumpit() callback ignores
zero TID value.
So, dump would not work correctly if the dump packet size is too big.
3. Fix use-after-free in ipv4_pdp_find().
RX and TX paths always use gtp->tid_hash and gtp->addr_hash,
but these hash pointers could be freed while packets are being processed.
So, use-after-free would occur.
4. Fix panic because of zero size hashtable
GTP hashtable size could be set by user-space.
If hashsize is set to 0, hashtable will not work and panic will occur.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
GTP default hashtable size is 1024 and userspace could set specific
hashtable size with IFLA_GTP_PDP_HASHSIZE. If hashtable size is set to 0
from userspace, hashtable will not work and panic will occur.
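
One way to guard against a zero size is sketched below. Falling back to the
default is an assumption made for illustration (rejecting the request would
be another option), and gtp_pick_hashsize() is a hypothetical helper.

```c
#include <assert.h>

#define GTP_DEFAULT_HASHSIZE 1024u	/* default named in the text above */

/*
 * attr_present: whether userspace supplied IFLA_GTP_PDP_HASHSIZE
 * attr_value:   the value it supplied (ignored when not present)
 */
static unsigned int gtp_pick_hashsize(int attr_present, unsigned int attr_value)
{
	/* A missing or zero size falls back to the default. */
	if (!attr_present || attr_value == 0)
		return GTP_DEFAULT_HASHSIZE;
	return attr_value;
}
```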
Fixes: 459aa660eb ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
ipv4_pdp_find() is called in TX packet path of GTP.
ipv4_pdp_find() internally uses gtp->tid_hash to lookup pdp context.
In the current code, gtp->tid_hash and gtp->addr_hash are freed by
->dellink(), which is gtp_dellink().
But gtp_dellink() could be called while packets are being processed.
So, gtp_dellink() should not free gtp->tid_hash and gtp->addr_hash.
Instead, dev->priv_destructor() is used, because this callback
is called only after all packet processing has finished.
Test commands:
ip link add veth1 type veth peer name veth2
ip a a 172.0.0.1/24 dev veth1
ip link set veth1 up
ip a a 172.99.0.1/32 dev lo
gtp-link add gtp1 &
gtp-tunnel add gtp1 v1 200 100 172.99.0.2 172.0.0.2
ip r a 172.99.0.2/32 dev gtp1
ip link set gtp1 mtu 1500
ip netns add ns2
ip link set veth2 netns ns2
ip netns exec ns2 ip a a 172.0.0.2/24 dev veth2
ip netns exec ns2 ip link set veth2 up
ip netns exec ns2 ip a a 172.99.0.2/32 dev lo
ip netns exec ns2 ip link set lo up
ip netns exec ns2 gtp-link add gtp2 &
ip netns exec ns2 gtp-tunnel add gtp2 v1 100 200 172.99.0.1 172.0.0.1
ip netns exec ns2 ip r a 172.99.0.1/32 dev gtp2
ip netns exec ns2 ip link set gtp2 mtu 1500
hping3 172.99.0.2 -2 --flood &
ip link del gtp1
Splat looks like:
[ 72.568081][ T1195] BUG: KASAN: use-after-free in ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.568916][ T1195] Read of size 8 at addr ffff8880b9a35d28 by task hping3/1195
[ 72.569631][ T1195]
[ 72.569861][ T1195] CPU: 2 PID: 1195 Comm: hping3 Not tainted 5.5.0-rc1 #199
[ 72.570547][ T1195] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 72.571438][ T1195] Call Trace:
[ 72.571764][ T1195] dump_stack+0x96/0xdb
[ 72.572171][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.572761][ T1195] print_address_description.constprop.5+0x1be/0x360
[ 72.573400][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.573971][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.574544][ T1195] __kasan_report+0x12a/0x16f
[ 72.575014][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.575593][ T1195] kasan_report+0xe/0x20
[ 72.576004][ T1195] ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.576577][ T1195] gtp_build_skb_ip4+0x199/0x1420 [gtp]
[ ... ]
[ 72.647671][ T1195] BUG: unable to handle page fault for address: ffff8880b9a35d28
[ 72.648512][ T1195] #PF: supervisor read access in kernel mode
[ 72.649158][ T1195] #PF: error_code(0x0000) - not-present page
[ 72.649849][ T1195] PGD a6c01067 P4D a6c01067 PUD 11fb07067 PMD 11f939067 PTE 800fffff465ca060
[ 72.652958][ T1195] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 72.653834][ T1195] CPU: 2 PID: 1195 Comm: hping3 Tainted: G B 5.5.0-rc1 #199
[ 72.668062][ T1195] RIP: 0010:ipv4_pdp_find.isra.12+0x86/0x170 [gtp]
[ ... ]
[ 72.679168][ T1195] Call Trace:
[ 72.679603][ T1195] gtp_build_skb_ip4+0x199/0x1420 [gtp]
[ 72.681915][ T1195] ? ipv4_pdp_find.isra.12+0x170/0x170 [gtp]
[ 72.682513][ T1195] ? lock_acquire+0x164/0x3b0
[ 72.682966][ T1195] ? gtp_dev_xmit+0x35e/0x890 [gtp]
[ 72.683481][ T1195] gtp_dev_xmit+0x3c2/0x890 [gtp]
[ ... ]
Fixes: 459aa660eb ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
gtp_genl_dump_pdp() is the ->dumpit() callback of the GTP module and is used
to dump pdp contexts. It is re-executed when the dump does not fit in one packet.
If the dump packet size is too big, it saves the current dump pointer
(gtp interface pointer, bucket, TID value) and then restarts the dump from
the last pointer.
Current GTP code allows adding a zero TID pdp context but the dump code
ignores a zero TID value. So, the last dump pointer will not be found.
In addition, this patch adds missing rcu_read_lock() in
gtp_genl_dump_pdp().
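
A resumable dump that does not treat TID 0 as "no saved position" could be
sketched like this. It assumes the resume state is a position in the list
rather than a TID value. The names are hypothetical, and RCU locking is not
modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical resume state: a position, not a TID. */
struct dump_cursor {
	size_t next;	/* index of the first entry not yet dumped */
};

/*
 * Copy up to `budget` TIDs into `out`, resuming from the cursor.
 * Returns the number of entries written; 0 means the dump is complete.
 */
static size_t dump_tids(const unsigned long long *tids, size_t count,
			struct dump_cursor *cur,
			unsigned long long *out, size_t budget)
{
	size_t written = 0;

	while (cur->next < count && written < budget)
		out[written++] = tids[cur->next++];
	return written;
}
```

Since the cursor never depends on the value of a TID, contexts with TID 0
are dumped and resumed like any other.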
Fixes: 459aa660eb ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
The GTP RX packet path looks up the pdp context by TID. If pdp contexts with
duplicate TIDs exist in the list, it cannot select the correct pdp context.
So, the TID value should be unique.
The GTP TX packet path looks up the pdp context by ms_addr. If pdp contexts with
duplicate ms_addr values exist in the list, it cannot select the correct pdp context.
So, the ms_addr value should be unique.
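
The uniqueness rule can be illustrated with a small table that refuses
duplicates on insertion. This is a user-space sketch with hypothetical types
and a linear scan; it does not mirror the module's hash tables.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PDP 8

/* Hypothetical, reduced pdp context: only the two lookup keys. */
struct pdp_ctx {
	uint64_t tid;		/* key used on the RX path */
	uint32_t ms_addr;	/* key used on the TX path */
};

struct pdp_table {
	struct pdp_ctx ctx[MAX_PDP];
	size_t count;
};

static int pdp_add(struct pdp_table *t, uint64_t tid, uint32_t ms_addr)
{
	/* Reject the new context if either key is already in use. */
	for (size_t i = 0; i < t->count; i++) {
		if (t->ctx[i].tid == tid || t->ctx[i].ms_addr == ms_addr)
			return -EEXIST;
	}
	if (t->count == MAX_PDP)
		return -ENOSPC;

	t->ctx[t->count].tid = tid;
	t->ctx[t->count].ms_addr = ms_addr;
	t->count++;
	return 0;
}
```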
Fixes: 459aa660eb ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
After the recent fix in commit 1899bb3251 ("bonding: fix state
transition issue in link monitoring"), the active-backup mode with
miimon initially comes up fine, but after a link failure, both members
transition into backup state.
Use the following steps to reproduce the scenario (eth1 and eth2 are the
slaves of the bond):
ip link set eth1 up
ip link set eth2 down
sleep 1
ip link set eth2 up
ip link set eth1 down
cat /sys/class/net/eth1/bonding_slave/state
cat /sys/class/net/eth2/bonding_slave/state
Fixes: 1899bb3251 ("bonding: fix state transition issue in link monitoring")
CC: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Manish Chopra says:
====================
bnx2x: bug fixes
This series has two driver changes, one to fix some unexpected
hardware behaviour caused during parity error recovery in the
presence of SR-IOV VFs and another one to fix resource
management in the driver among the PFs configured on an engine.
Please consider applying it to "net".
V1->V2:
=======
Fix the compilation errors reported by kbuild test robot
on the patch #1 with CONFIG_BNX2X_SRIOV=n
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
The driver doesn't correctly calculate the total number of PFs configured
on a given engine. This messed up resource allocation for the PFs
loaded on that engine, leading the driver to configure resources
(like vlan filters etc.) beyond the per-engine limit,
which ended up with asserts from the firmware.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
A parity error from the hardware will cause the PF to lose the state
of its VFs due to the PF's internal reload and the hardware reset following
the parity error. Restrict any configuration request from the VFs after
the parity error, as it could cause unexpected hardware behavior. The only way
for VFs to recover is to trigger FLR on the VFs and reload them.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Without the common part of the driver, the new file fails to link:
drivers/net/ethernet/ti/cpsw_new.o: In function `cpsw_probe':
cpsw_new.c:(.text+0x312c): undefined reference to `ti_cm_get_macid'
Use the same Makefile hack as before, and build cpsw-common.o for
any driver that needs it.
Fixes: ed3525eda4 ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
The new driver misses a dependency:
drivers/net/ethernet/ti/cpsw_new.o: In function `cpsw_rx_handler':
cpsw_new.c:(.text+0x259c): undefined reference to `__page_pool_put_page'
cpsw_new.c:(.text+0x25d0): undefined reference to `page_pool_alloc_pages'
drivers/net/ethernet/ti/cpsw_priv.o: In function `cpsw_fill_rx_channels':
cpsw_priv.c:(.text+0x22d8): undefined reference to `page_pool_alloc_pages'
cpsw_priv.c:(.text+0x2420): undefined reference to `__page_pool_put_page'
drivers/net/ethernet/ti/cpsw_priv.o: In function `cpsw_create_xdp_rxqs':
cpsw_priv.c:(.text+0x2624): undefined reference to `page_pool_create'
drivers/net/ethernet/ti/cpsw_priv.o: In function `cpsw_run_xdp':
cpsw_priv.c:(.text+0x2dc8): undefined reference to `__page_pool_put_page'
Other drivers use 'select' for PAGE_POOL, so do the same here.
Fixes: ed3525eda4 ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Host can provide send indirection table messages anytime after RSS is
enabled by calling rndis_filter_set_rss_param(). So the host provided
table values may be overwritten by the initialization in
rndis_set_subchannel().
To prevent this problem, move the tx_table initialization before calling
rndis_filter_set_rss_param().
Fixes: a6fb6aa3cf ("hv_netvsc: Set tx_table to equal weight after subchannels open")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
phylink requires the MAC to report when its link status changes when
operating in inband modes. Failure to report link status changes
means that phylink has no idea when the link events happen, which
results in either the network interface's carrier remaining up or
remaining permanently down.
For example, with a fiber module, if the interface is brought up and
link is initially established, taking the link down at the far end
will cut the optical power. The SFP module's LOS asserts, we
deactivate the link, and the network interface reports no carrier.
When the far end is brought back up, the SFP module's LOS deasserts,
but the MAC may be slower to establish link. If this happens (which
in my tests is a certainty) then phylink never hears that the MAC
has established link with the far end, and the network interface is
stuck reporting no carrier. This means the interface is
non-functional.
Avoiding the link interrupt when we have phylink is basically not
an option, so remove the !port->phylink from the test.
Fixes: 4bb0432628 ("net: mvpp2: phylink support")
Tested-by: Sven Auhagen <sven.auhagen@voleatech.de>
Tested-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet says:
====================
tcp: take care of empty skbs in write queue
We understood recently that TCP sockets could have an empty
skb at the tail of the write queue, leading to various problems.
This patch series :
1) Make sure we do not send an empty packet since this
was unintended and causing crashes in old kernels.
2) Change tcp_write_queue_empty() to not be fooled by
the presence of an empty skb.
3) Fix a bug that could trigger suboptimal epoll()
application behavior under memory pressure.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
At the time commit ce5ec44099 ("tcp: ensure epoll edge trigger
wakeup when write queue is empty") was added to the kernel,
we still had a single write queue, combining rtx and write queues.
Once we moved the rtx queue into a separate rb-tree, testing
if sk_write_queue is empty has been suboptimal.
Indeed, if we have packets in the rtx queue, we probably want
to delay the EPOLLOUT generation at the time incoming packets
will free them, making room, but more importantly avoiding
flooding application with EPOLLOUT events.
Solution is to use tcp_rtx_and_write_queues_empty() helper.
Fixes: 75c119afe1 ("tcp: implement rb-tree based retransmit queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Due to how tcp_sendmsg() is implemented, we can have an empty
skb at the tail of the write queue.
Most [1] tcp_write_queue_empty() callers want to know if there is
anything to send (payload and/or FIN).
Instead of checking if the sk_write_queue is empty, we need
to test if tp->write_seq == tp->snd_nxt.
[1] tcp_send_fin() was the only caller that expected to
see if an skb was in the write queue, I have changed the code
to reuse the tcp_write_queue_tail() result.
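
The sequence-number test can be sketched as follows. The struct is a
hypothetical reduction of the TCP socket state to the two fields named
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduction of struct tcp_sock to the two fields used here. */
struct tcp_seq_sketch {
	uint32_t write_seq;	/* end of the data queued by the application */
	uint32_t snd_nxt;	/* next sequence number to be sent */
};

/* True when nothing (payload or FIN) is waiting to be sent. */
static bool write_queue_empty(const struct tcp_seq_sketch *tp)
{
	return tp->write_seq == tp->snd_nxt;
}
```

An empty skb at the tail of the queue does not advance write_seq, so it
cannot make this test report pending data.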
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Backport of commit fdfc5c8594 ("tcp: remove empty skb from
write queue in error cases") in linux-4.14 stable triggered
various bugs. One of them has been fixed in commit ba2ddb43f270
("tcp: Don't dequeue SYN/FIN-segments from write-queue"), but
we still have crashes on some occasions.
Root-cause is that when tcp_sendmsg() has allocated a fresh
skb and could not append a fragment before being blocked
in sk_stream_wait_memory(), tcp_write_xmit() might be called
and decide to send this fresh and empty skb.
Sending an empty packet is not only silly, it might have caused
many issues we had in the past with tp->packets_out being
out of sync.
Fixes: c65f7f00c5 ("[TCP]: Simplify SKB data portion allocation with NETIF_F_SG.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Christoph Paasch <cpaasch@apple.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Jason Baron <jbaron@akamai.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Michal Kubecek and Firo Yang did a very nice analysis of crashes
happening in __inet_lookup_established().
Since a TCP socket can go from TCP_ESTABLISHED to TCP_LISTEN
(via a close()/socket()/listen() cycle) without a RCU grace period,
I should not have changed listeners linkage in their hash table.
They must use the nulls protocol (Documentation/RCU/rculist_nulls.txt),
so that a lookup can detect that a socket in a hash list was moved to
another one.
Since we added code in commit d296ba60d8 ("soreuseport: Resolve
merge conflict for v4/v6 ordering fix"), we have to add
hlist_nulls_add_tail_rcu() helper.
Fixes: 3b24d854cb ("tcp/dccp: do not touch listener sk_refcnt under synflood")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Reported-by: Firo Yang <firo.yang@suse.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Link: https://lore.kernel.org/netdev/20191120083919.GH27852@unicorn.suse.cz/
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
This conditional is missing a bang, with the intent
being to break when the retry count reaches zero.
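
The intended shape of such a bounded wait is sketched below. The device stub
and wait_ready() are hypothetical and only show where the negation belongs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device stub: stays busy for the first `ready_after` polls. */
struct dev_stub {
	int polls;
	int ready_after;
};

static bool dev_poll_busy(struct dev_stub *d)
{
	d->polls++;
	return d->polls <= d->ready_after;
}

/* Returns 0 once the device is ready, -1 if the retry budget runs out. */
static int wait_ready(struct dev_stub *d, int retry)
{
	assert(retry > 0);

	while (dev_poll_busy(d)) {
		/* Without the '!', this would give up while retries remain. */
		if (!--retry)
			return -1;
	}
	return 0;
}
```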
Fixes: 476d96ca9c ("ibmvnic: Bound waits for device queries")
Suggested-by: Juliet Kim <julietk@linux.vnet.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
In commit 4b1373de73 ("net: ipv6: addr: perform strict checks also for
doit handlers") we added strict checks for inet6_rtm_getaddr(). But we did
the invalid header values check before checking if NETLINK_F_STRICT_CHK
is set. This may break backwards compatibility if users already set
ifm->ifa_prefixlen, ifm->ifa_flags, ifm->ifa_scope in their netlink code.
I didn't move the nlmsg_len check because I think it is a valid check.
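
The intended order of the checks can be sketched like this. The struct and
validate_getaddr_hdr() are hypothetical; `strict` stands for whether the
socket requested NETLINK_F_STRICT_CHK behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical reduction of struct ifaddrmsg to the fields named above. */
struct ifaddr_hdr_sketch {
	unsigned char ifa_prefixlen;
	unsigned char ifa_flags;
	unsigned char ifa_scope;
};

static int validate_getaddr_hdr(const struct ifaddr_hdr_sketch *ifm, bool strict)
{
	/* Legacy callers may leave arbitrary values in these fields. */
	if (!strict)
		return 0;

	/* Only strict callers are held to zeroed header fields. */
	if (ifm->ifa_prefixlen || ifm->ifa_flags || ifm->ifa_scope)
		return -EINVAL;
	return 0;
}
```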
Reported-by: Jianlin Shi <jishi@redhat.com>
Fixes: 4b1373de73 ("net: ipv6: addr: perform strict checks also for doit handlers")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Without I2C, we get a link failure:
drivers/ptp/ptp_clockmatrix.o: In function `idtcm_xfer.isra.3':
ptp_clockmatrix.c:(.text+0xcc): undefined reference to `i2c_transfer'
drivers/ptp/ptp_clockmatrix.o: In function `idtcm_driver_init':
ptp_clockmatrix.c:(.init.text+0x14): undefined reference to `i2c_register_driver'
drivers/ptp/ptp_clockmatrix.o: In function `idtcm_driver_exit':
ptp_clockmatrix.c:(.exit.text+0x10): undefined reference to `i2c_del_driver'
Fixes: 3a6ba7dc77 ("ptp: Add a ptp clock driver for IDT ClockMatrix.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
After executing "ethtool -C eth0 rx-usecs-irq 0", the box becomes
unresponsive, likely due to interrupt livelock. It appears that
a minimum clamp value for the irq timer is computed, but is never
applied.
Fix by applying the corrected clamp value.
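
The pattern of the fix is sketched below with hypothetical names. The lower
bound of 1 is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

static uint16_t clamp_u16(uint16_t v, uint16_t lo, uint16_t hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical stand-in for the request sent to the firmware. */
struct coal_req_sketch {
	uint16_t irq_timer;
};

static void set_irq_timer(struct coal_req_sketch *req,
			  uint16_t requested, uint16_t max)
{
	uint16_t val = clamp_u16(requested, 1, max);

	/* Store the clamped value, not the raw request. */
	req->irq_timer = val;
}
```

A request of 0 then ends up as 1 in this model, so the timer is never
disabled outright.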
Fixes: 74706afa71 ("bnxt_en: Update interrupt coalescing logic.")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
I no longer work at Savoir-faire Linux but even though MAINTAINERS is
up-to-date, some emails are still sent to my old email address.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Add myself and Sean as maintainers for rmnet driver.
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>