linux_dsm_epyc7002/net/ipv4
Eric Dumazet 3c8fc87820 inet: frags: rework rhashtable dismantle
syszbot found an interesting use-after-free [1] happening
while IPv4 fragment rhashtable was destroyed at netns dismantle.

While no insertions can possibly happen at the time a dismantling
netns is destroying this rhashtable, timers can still fire and
attempt to remove elements from this rhashtable.

This is forbidden, since rhashtable_free_and_destroy() has
no synchronization against concurrent inserts and deletes.

Add a new fqdir->dead flag so that timers do not attempt
a rhashtable_remove_fast() operation.

We also have to respect an RCU grace period before starting
the rhashtable_free_and_destroy() from process context,
thus we use rcu_work infrastructure.

This is a refinement of a prior rough attempt to fix this bug :
https://marc.info/?l=linux-netdev&m=153845936820900&w=2

Since the rhashtable cleanup is now deferred to a work queue,
netns dismantles should be slightly faster.

[1]
BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:194 [inline]
BUG: KASAN: use-after-free in rhashtable_last_table+0x162/0x180 lib/rhashtable.c:212
Read of size 8 at addr ffff8880a6497b70 by task kworker/0:0/5

CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.2.0-rc1+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events rht_deferred_worker
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188
 __kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
 kasan_report+0x12/0x20 mm/kasan/common.c:614
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
 __read_once_size include/linux/compiler.h:194 [inline]
 rhashtable_last_table+0x162/0x180 lib/rhashtable.c:212
 rht_deferred_worker+0x111/0x2030 lib/rhashtable.c:411
 process_one_work+0x989/0x1790 kernel/workqueue.c:2269
 worker_thread+0x98/0xe40 kernel/workqueue.c:2415
 kthread+0x354/0x420 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

Allocated by task 32687:
 save_stack+0x23/0x90 mm/kasan/common.c:71
 set_track mm/kasan/common.c:79 [inline]
 __kasan_kmalloc mm/kasan/common.c:489 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:462
 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:503
 __do_kmalloc_node mm/slab.c:3620 [inline]
 __kmalloc_node+0x4e/0x70 mm/slab.c:3627
 kmalloc_node include/linux/slab.h:590 [inline]
 kvmalloc_node+0x68/0x100 mm/util.c:431
 kvmalloc include/linux/mm.h:637 [inline]
 kvzalloc include/linux/mm.h:645 [inline]
 bucket_table_alloc+0x90/0x480 lib/rhashtable.c:178
 rhashtable_init+0x3f4/0x7b0 lib/rhashtable.c:1057
 inet_frags_init_net include/net/inet_frag.h:109 [inline]
 ipv4_frags_init_net+0x182/0x410 net/ipv4/ip_fragment.c:683
 ops_init+0xb3/0x410 net/core/net_namespace.c:130
 setup_net+0x2d3/0x740 net/core/net_namespace.c:316
 copy_net_ns+0x1df/0x340 net/core/net_namespace.c:439
 create_new_namespaces+0x400/0x7b0 kernel/nsproxy.c:107
 unshare_nsproxy_namespaces+0xc2/0x200 kernel/nsproxy.c:206
 ksys_unshare+0x440/0x980 kernel/fork.c:2692
 __do_sys_unshare kernel/fork.c:2760 [inline]
 __se_sys_unshare kernel/fork.c:2758 [inline]
 __x64_sys_unshare+0x31/0x40 kernel/fork.c:2758
 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 7:
 save_stack+0x23/0x90 mm/kasan/common.c:71
 set_track mm/kasan/common.c:79 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:451
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:459
 __cache_free mm/slab.c:3432 [inline]
 kfree+0xcf/0x220 mm/slab.c:3755
 kvfree+0x61/0x70 mm/util.c:460
 bucket_table_free+0x69/0x150 lib/rhashtable.c:108
 rhashtable_free_and_destroy+0x165/0x8b0 lib/rhashtable.c:1155
 inet_frags_exit_net+0x3d/0x50 net/ipv4/inet_fragment.c:152
 ipv4_frags_exit_net+0x73/0x90 net/ipv4/ip_fragment.c:695
 ops_exit_list.isra.0+0xaa/0x150 net/core/net_namespace.c:154
 cleanup_net+0x3fb/0x960 net/core/net_namespace.c:553
 process_one_work+0x989/0x1790 kernel/workqueue.c:2269
 worker_thread+0x98/0xe40 kernel/workqueue.c:2415
 kthread+0x354/0x420 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

The buggy address belongs to the object at ffff8880a6497b40
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 48 bytes inside of
 1024-byte region [ffff8880a6497b40, ffff8880a6497f40)
The buggy address belongs to the page:
page:ffffea0002992580 refcount:1 mapcount:0 mapping:ffff8880aa400ac0 index:0xffff8880a64964c0 compound_mapcount: 0
flags: 0x1fffc0000010200(slab|head)
raw: 01fffc0000010200 ffffea0002916e88 ffffea000218fe08 ffff8880aa400ac0
raw: ffff8880a64964c0 ffff8880a6496040 0000000100000005 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8880a6497a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880a6497a80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff8880a6497b00: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
                                                             ^
 ffff8880a6497b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880a6497c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 648700f76b ("inet: frags: use rhashtables for reassembly units")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-26 14:08:05 -07:00
..
bpfilter SPDX update for 5.2-rc2, round 1 2019-05-21 12:33:38 -07:00
netfilter treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 2019-05-21 11:28:45 +02:00
af_inet.c net: rework SIOCGSTAMP ioctl handling 2019-04-19 14:07:40 -07:00
ah4.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
arp.c net: Evict neighbor entries on carrier down 2018-10-12 09:47:39 -07:00
cipso_ipv4.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 2019-05-21 11:28:45 +02:00
datagram.c ipv4: Allow sending multicast packets on specific i/f using VRF socket 2018-10-02 22:28:17 -07:00
devinet.c netlink: make validation more configurable for future strictness 2019-04-27 17:07:21 -04:00
esp4_offload.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-05-02 22:14:21 -04:00
esp4.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
fib_frontend.c net: Set strict_start_type for routes and rules 2019-05-22 17:50:24 -07:00
fib_lookup.h ipv4: Add fib_nh_common to fib_result 2019-04-03 21:50:20 -07:00
fib_notifier.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
fib_rules.c ipv4: fib_rules: Fix possible infinite loop in fib_empty_table 2018-12-30 12:57:04 -08:00
fib_semantics.c ipv4: Rename and export nh_update_mtu 2019-05-22 17:48:44 -07:00
fib_trie.c ipv4: Add function to send route updates 2019-05-22 17:48:44 -07:00
fou.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
gre_demux.c net: ip_gre: use erspan key field for tunnel lookup 2019-01-22 11:52:17 -08:00
gre_offload.c Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net 2018-07-03 10:29:26 +09:00
icmp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-03-02 12:54:35 -08:00
igmp.c net: remove unneeded switch fall-through 2019-02-21 13:48:00 -08:00
inet_connection_sock.c ipv4: Prepare rtable for IPv6 gateway 2019-04-08 15:22:40 -07:00
inet_diag.c inet_diag: fix reporting cgroup classid and fallback to priority 2019-02-12 13:35:57 -05:00
inet_fragment.c inet: frags: rework rhashtable dismantle 2019-05-26 14:08:05 -07:00
inet_hashtables.c net: dccp: fix kernel crash on module load 2018-12-24 15:27:56 -08:00
inet_timewait_sock.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
inetpeer.c net: ipv4: use a dedicated counter for icmp_v4 redirect packets 2019-02-08 21:50:15 -08:00
ip_forward.c ipv4: Prepare rtable for IPv6 gateway 2019-04-08 15:22:40 -07:00
ip_fragment.c net: dynamically allocate fqdir structures 2019-05-26 14:08:05 -07:00
ip_gre.c net: ip_gre: fix possible use-after-free in erspan_rcv 2019-04-08 16:16:47 -07:00
ip_input.c net: use indirect calls helpers at early demux stage 2019-05-05 10:38:04 -07:00
ip_options.c vrf: check accept_source_route on the original netdevice 2019-04-01 10:44:58 -07:00
ip_output.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
ip_sockglue.c ip: on queued skb use skb_header_pointer instead of pskb_may_pull 2019-01-10 09:27:20 -05:00
ip_tunnel_core.c netlink: make validation more configurable for future strictness 2019-04-27 17:07:21 -04:00
ip_tunnel.c iptunnel: NULL pointer deref for ip_md_tunnel_xmit 2019-03-06 10:43:06 -08:00
ip_vti.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-05-02 22:14:21 -04:00
ipcomp.c net-ipv4: remove 2 always zero parameters from ipv4_redirect() 2018-09-26 20:30:55 -07:00
ipconfig.c ipconfig: add carrier_timeout kernel parameter 2019-02-01 15:24:13 -08:00
ipip.c ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit 2019-01-26 09:43:03 -08:00
ipmr_base.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-05-07 17:22:09 -07:00
ipmr.c netlink: make validation more configurable for future strictness 2019-04-27 17:07:21 -04:00
Kconfig treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
Makefile xfrm: make xfrm modes builtin 2019-04-08 09:15:17 +02:00
metrics.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
netfilter.c netfilter: ipv4: remove useless export_symbol 2019-01-28 11:32:58 +01:00
netlink.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
ping.c net: Treat sock->sk_drops as an unsigned int when printing 2019-05-19 10:31:10 -07:00
proc.c net: dynamically allocate fqdir structures 2019-05-26 14:08:05 -07:00
protocol.c fou, fou6: ICMP error handlers for FoU and GUE 2018-11-08 17:13:08 -08:00
raw_diag.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
raw.c net: Treat sock->sk_drops as an unsigned int when printing 2019-05-19 10:31:10 -07:00
route.c ipv4: Move exception bucket to nh_common 2019-05-05 00:47:16 -07:00
syncookies.c tcp: free request sock directly upon TFO or syncookies error 2019-03-19 14:13:01 -07:00
sysctl_net_ipv4.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-04-25 23:52:29 -04:00
tcp_bbr.c tcp_bbr: adapt cwnd based on ack aggregation estimation 2019-01-24 22:27:27 -08:00
tcp_bic.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_bpf.c bpf, tcp: correctly handle DONT_WAIT flags and timeo == 0 2019-05-16 01:36:13 +02:00
tcp_cdg.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_cong.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
tcp_cubic.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_dctcp.c dctcp: more accurate tracking of packets delivery 2019-04-11 21:31:03 -07:00
tcp_dctcp.h tcp: refactor DCTCP ECN ACK handling 2018-10-10 22:26:00 -07:00
tcp_diag.c net: sock: replace sk_state_load with inet_sk_state_load and remove sk_state_store 2017-12-20 14:00:25 -05:00
tcp_fastopen.c tcp: pause Fast Open globally after third consecutive timeout 2017-12-13 15:51:12 -05:00
tcp_highspeed.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_htcp.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_hybla.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_illinois.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_input.c tcp: fix retrans timestamp on passive Fast Open 2019-05-14 15:17:49 -07:00
tcp_ipv4.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-05-02 22:14:21 -04:00
tcp_lp.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_metrics.c tcp: refactor setting the initial congestion window 2019-05-01 11:47:54 -04:00
tcp_minisocks.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
tcp_nv.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_offload.c net: use indirect call wrappers at GRO transport layer 2018-12-15 13:23:02 -08:00
tcp_output.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
tcp_rate.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
tcp_recovery.c tcp: introduce tcp_skb_timestamp_us() helper 2018-09-21 19:37:59 -07:00
tcp_scalable.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_timer.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
tcp_ulp.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
tcp_vegas.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_vegas.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
tcp_veno.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_westwood.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_yeah.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp.c tcp: do not recycle cloned skbs 2019-05-15 09:22:41 -07:00
tunnel4.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
udp_diag.c net: diag: document swapped src/dst in udp_dump_one. 2018-10-28 19:27:21 -07:00
udp_impl.h udp: add missing rehash callback to udplite 2019-01-17 15:01:08 -08:00
udp_offload.c udp: fix GRO packet of death 2019-05-01 22:29:56 -04:00
udp_tunnel.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
udp.c net: Treat sock->sk_drops as an unsigned int when printing 2019-05-19 10:31:10 -07:00
udplite.c udp: add missing rehash callback to udplite 2019-01-17 15:01:08 -08:00
xfrm4_input.c xfrm: reset transport header back to network header after all input transforms ahave been applied 2018-09-04 10:26:30 +02:00
xfrm4_output.c xfrm: store xfrm_mode directly, not its address 2019-04-08 09:15:28 +02:00
xfrm4_policy.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2019-04-30 09:26:13 -04:00
xfrm4_protocol.c xfrm: remove unneeded export_symbols 2019-04-23 07:42:20 +02:00
xfrm4_state.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
xfrm4_tunnel.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00