linux_dsm_epyc7002/net/core
Willy Tarreau 48119ffba8 net: linkwatch: fix failure to restore device state across suspend/resume
[ Upstream commit 6922110d152e56d7569616b45a1f02876cf3eb9f ]

After migrating my laptop from 4.19-LTS to 5.4-LTS a while ago I noticed
that my Ethernet port to which a bond and a VLAN interface are attached
appeared to remain up after resuming from suspend with the cable unplugged
(and that problem still persists with 5.10-LTS).

It happens that the following happens:

  - the network driver (e1000e here) prepares to suspend, calls e1000e_down()
    which calls netif_carrier_off() to signal that the link is going down.
  - netif_carrier_off() adds a link_watch event to the list of events for
    this device
  - the device is completely stopped.
  - the machine suspends
  - the cable is unplugged and the machine brought to another location
  - the machine is resumed
  - the queued linkwatch events are processed for the device
  - the device doesn't yet have the __LINK_STATE_PRESENT bit and its events
    are silently dropped
  - the device is resumed with its link down
  - the upper VLAN and bond interfaces are never notified that the link had
    been turned down and remain up
  - the only way to provoke a change is to physically connect the machine
    to a port and possibly unplug it.

The state after resume looks like this:
  $ ip -br li | egrep 'bond|eth'
  bond0            UP             e8:6a:64:64:64:64 <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP>
  eth0             DOWN           e8:6a:64:64:64:64 <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP>
  eth0.2@eth0      UP             e8:6a:64:64:64:64 <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP>

Placing an explicit call to netdev_state_change() either in the suspend
or the resume code in the NIC driver worked around this but the solution
is not satisfying.

The issue in fact really is in link_watch that loses events while it
ought not to. It happens that the test for the device being present was
added by commit 124eee3f69 ("net: linkwatch: add check for netdevice
being present to linkwatch_do_dev") in 4.20 to avoid an access to
devices that are not present.

Instead of dropping events, this patch proceeds slightly differently by
postponing their handling so that they happen after the device is fully
resumed.

Fixes: 124eee3f69 ("net: linkwatch: add check for netdevice being present to linkwatch_do_dev")
Link: https://lists.openwall.net/netdev/2018/03/15/62
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/r/20210809160628.22623-1-w@1wt.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05 18:54:25 +02:00
..
bpf_sk_storage.c bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON 2020-09-25 13:58:01 -07:00
datagram.c udp: fix skb_copy_and_csum_datagram with odd segment sizes 2021-02-17 11:02:28 +01:00
datagram.h net/core: Allow the compiler to verify declaration and definition consistency 2019-03-27 13:49:44 -07:00
dev_addr_lists.c net: core: add nested_level variable in net_device 2020-09-28 15:00:15 -07:00
dev_ioctl.c net: fix dev_ifsioc_locked() race condition 2021-03-07 12:34:07 +01:00
dev.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
devlink.c devlink: Correct VIRTUAL port to not have phys_port attributes 2021-06-10 13:39:16 +02:00
drop_monitor.c drop_monitor: Perform cleanup upon probe registration failure 2021-03-30 14:31:57 +02:00
dst_cache.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
dst.c net, bpf: Fix ip6ip6 crash with collect_md populated skbs 2021-03-30 14:32:05 +02:00
failover.c failover: allow name change on IFF_UP slave interfaces 2019-04-10 22:12:26 -07:00
fib_notifier.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
fib_rules.c fib: Return the correct errno code 2021-06-18 10:00:06 +02:00
filter.c bpf: Do not change gso_size during bpf_skb_change_proto() 2021-07-14 16:56:27 +02:00
flow_dissector.c flow_dissector: Fix out-of-bounds warning in __skb_flow_bpf_to_target() 2021-05-19 10:12:57 +02:00
flow_offload.c net: flow_offload: Fix memory leak for indirect flow block 2020-12-09 16:08:33 -08:00
gen_estimator.c net_sched: gen_estimator: support large ewma log 2021-01-27 11:55:23 +01:00
gen_stats.c docs: networking: convert gen_stats.txt to ReST 2020-04-28 14:39:46 -07:00
gro_cells.c gro_cells: reduce number of synchronize_net() calls 2020-11-25 11:28:12 -08:00
hwbm.c net: hwbm: Make the hwbm_pool lock a mutex 2019-06-09 19:40:10 -07:00
link_watch.c net: linkwatch: fix failure to restore device state across suspend/resume 2024-07-05 18:54:25 +02:00
lwt_bpf.c lwt_bpf: Replace preempt_disable() with migrate_disable() 2020-12-07 11:53:40 -08:00
lwtunnel.c net: ipv6: add rpl sr tunnel 2020-03-29 22:30:57 -07:00
Makefile ethtool: move to its own directory 2019-12-12 17:07:05 -08:00
neighbour.c neighbour: allow NUD_NOARP entries to be forced GCed 2021-06-10 13:39:29 +02:00
net_namespace.c net: make get_net_ns return error if NET_NS is disabled 2021-06-23 14:42:44 +02:00
net-procfs.c net-sysfs: add backlog len and CPU id to softnet data 2020-09-21 13:56:37 -07:00
net-sysfs.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
net-sysfs.h net-sysfs: add netdev_change_owner() 2020-02-26 20:07:25 -08:00
net-traces.c page_pool: add tracepoints for page_pool with details need by XDP 2019-06-19 11:23:13 -04:00
netclassid_cgroup.c cgroup, netclassid: remove double cond_resched 2020-04-21 15:44:30 -07:00
netevent.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
netpoll.c net: Have netpoll bring-up DSA management interface 2020-11-18 11:04:11 -08:00
netprio_cgroup.c netprio_cgroup: Fix unlimited memory leak of v2 cgroups 2020-05-09 20:59:21 -07:00
page_pool.c mm: fix struct page layout on 32-bit systems 2021-05-19 10:13:17 +02:00
pktgen.c pktgen: fix misuse of BUG_ON() in pktgen_thread_worker() 2021-03-07 12:34:09 +01:00
ptp_classifier.c ptp: Add generic ptp v2 header parsing function 2020-08-19 16:07:49 -07:00
request_sock.c tcp: add rcu protection around tp->fastopen_rsk 2019-10-13 10:13:08 -07:00
rtnetlink.c rtnetlink: Fix regression in bridge VLAN configuration 2021-06-23 14:42:42 +02:00
scm.c fs: Add receive_fd() wrapper for __receive_fd() 2020-07-13 11:03:44 -07:00
secure_seq.c crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h 2020-05-08 15:32:17 +10:00
skbuff.c net: Fix zero-copy head len calculation. 2024-07-05 18:07:38 +02:00
skmsg.c skmsg: Make sk_psock_destroy() static 2024-07-05 18:05:02 +02:00
sock_diag.c bpf, net: Rework cookie generator as per-cpu one 2020-09-30 11:50:35 -07:00
sock_map.c net, sockmap: Don't call bpf_prog_put() on NULL pointer 2020-10-15 21:05:23 +02:00
sock_reuseport.c udp: Prevent reuseport_select_sock from reading uninitialized socks 2021-01-23 16:03:59 +01:00
sock.c init: add dsm gpl source 2024-07-05 18:00:04 +02:00
stream.c tcp: make sure EPOLLOUT wont be missed 2019-08-19 13:07:43 -07:00
sysctl_net_core.c net: add option to not create fall-back tunnels in root-ns as well 2020-08-28 06:52:44 -07:00
timestamping.c net: Introduce a new MII time stamping interface. 2019-12-25 19:51:33 -08:00
tso.c net: tso: add UDP segmentation support 2020-06-18 20:46:23 -07:00
utils.c net: Fix skb->csum update in inet_proto_csum_replace16(). 2020-01-24 20:54:30 +01:00
xdp.c xdp: fix xdp_return_frame() kernel BUG throw for page_pool memory model 2021-04-14 08:42:09 +02:00