linux_dsm_epyc7002/net/ipv4
Jakub Kicinski a8585fdf42 net: ip: avoid OOM kills with large UDP sends over loopback
[ Upstream commit 6d123b81ac615072a8525c13c6c41b695270a15d ]

Dave observed number of machines hitting OOM on the UDP send
path. The workload seems to be sending large UDP packets over
loopback. Since loopback has MTU of 64k kernel will try to
allocate an skb with up to 64k of head space. This has a good
chance of failing under memory pressure. What's worse if
the message length is <32k the allocation may trigger an
OOM killer.

This is entirely avoidable, we can use an skb with page frags.

af_unix solves a similar problem by limiting the head
length to SKB_MAX_ALLOC. This seems like a good and simple
approach. It means that UDP messages > 16kB will now
use fragments if underlying device supports SG, if extra
allocator pressure causes regressions in real workloads
we can switch to trying the large allocation first and
falling back.

v4: pre-calculate all the additions to alloclen so
    we can be sure it won't go over order-2

Reported-by: Dave Jones <dsj@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19 09:44:53 +02:00
..
bpfilter
netfilter netfilter: arp_tables: add pre_exit hook for table unregister 2021-04-21 13:00:56 +02:00
af_inet.c inet: annotate data race in inet_send_prepare() and inet_dgram_connect() 2021-06-30 08:47:20 -04:00
ah4.c xfrm: Use actual socket sk instead of skb socket for xfrm_output_resume 2021-04-14 08:42:05 +02:00
arp.c net: Exempt multicast addresses from five-second neighbor lifetime 2020-11-13 14:24:39 -08:00
bpf_tcp_ca.c
cipso_ipv4.c net: ipv4: fix memory leak in netlbl_cipsov4_add_std 2021-06-23 14:42:41 +02:00
datagram.c
devinet.c net: ipv4: Remove unneed BUG() function 2021-06-30 08:47:20 -04:00
esp4_offload.c xfrm: Provide private skb extensions for segmented and hw offloaded ESP packets 2021-04-14 08:42:07 +02:00
esp4.c xfrm: xfrm_state_mtu should return at least 1280 for ipv6 2021-07-14 16:56:14 +02:00
fib_frontend.c net/ipv4: swap flow ports when validating source 2021-07-14 16:56:25 +02:00
fib_lookup.h
fib_notifier.c
fib_rules.c
fib_semantics.c
fib_trie.c
fou.c
gre_demux.c erspan: fix version 1 check in gre_parse_header() 2021-01-12 20:18:12 +01:00
gre_offload.c
icmp.c icmp: don't send out ICMP messages with a source address of 0.0.0.0 2021-06-23 14:42:47 +02:00
igmp.c net: ipv4: fix memory leak in ip_mc_add1_src 2021-06-23 14:42:46 +02:00
inet_connection_sock.c tcp: relookup sock for RST+ACK packets handled by obsolete req sock 2021-03-30 14:31:59 +02:00
inet_diag.c inet_diag: Fix error path to cancel the meseage in inet_req_diag_fill() 2020-11-17 16:08:36 -08:00
inet_fragment.c
inet_hashtables.c tcp: fix race condition when creating child sockets from syncookies 2020-11-23 16:32:33 -08:00
inet_timewait_sock.c
inetpeer.c
ip_forward.c
ip_fragment.c
ip_gre.c
ip_input.c
ip_options.c
ip_output.c net: ip: avoid OOM kills with large UDP sends over loopback 2021-07-19 09:44:53 +02:00
ip_sockglue.c
ip_tunnel_core.c tunnels: Fix off-by-one in lower MTU bounds for ICMP/ICMPv6 replies 2020-11-09 15:39:39 -08:00
ip_tunnel.c net: always use icmp{,v6}_ndo_send from ndo_start_xmit 2021-03-17 17:06:12 +01:00
ip_vti.c net: always use icmp{,v6}_ndo_send from ndo_start_xmit 2021-03-17 17:06:12 +01:00
ipcomp.c
ipconfig.c net: ipconfig: Don't override command-line hostnames or domains 2021-06-18 10:00:05 +02:00
ipip.c
ipmr_base.c
ipmr.c
Kconfig
Makefile
metrics.c
netfilter.c netfilter: use actual socket sk rather than skb sk when routing harder 2020-10-30 12:57:39 +01:00
netlink.c
nexthop.c nexthop: Do not flush blackhole nexthops when loopback goes down 2021-03-17 17:06:14 +01:00
ping.c ping: Check return value of function 'ping_queue_rcv_skb' 2021-06-30 08:47:20 -04:00
proc.c
protocol.c
raw_diag.c
raw.c
route.c net: lwtunnel: handle MTU calculation in forwading 2021-07-14 16:56:32 +02:00
syncookies.c net: Update window_clamp if SOCK_RCVBUF is set 2020-11-10 17:42:35 -08:00
sysctl_net_ipv4.c net: Make tcp_allowed_congestion_control readonly in non-init netns 2021-04-21 13:00:57 +02:00
tcp_bbr.c tcp: only postpone PROBE_RTT if RTT is < current min_rtt estimate 2020-11-17 11:03:22 -08:00
tcp_bic.c
tcp_bpf.c bpf, sockmap: Ensure SO_RCVBUF memory is observed on ingress redirect 2020-11-18 00:12:34 +01:00
tcp_cdg.c
tcp_cong.c net: Only allow init netns to set default tcp cong to a restricted algo 2021-05-14 09:50:46 +02:00
tcp_cubic.c
tcp_dctcp.c
tcp_dctcp.h
tcp_diag.c
tcp_fastopen.c
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c net: tcp better handling of reordering then loss cases 2021-07-19 09:44:45 +02:00
tcp_ipv4.c tcp: Fix potential use-after-free due to double kfree() 2021-01-27 11:55:28 +01:00
tcp_lp.c
tcp_metrics.c
tcp_minisocks.c tcp: relookup sock for RST+ACK packets handled by obsolete req sock 2021-03-30 14:31:59 +02:00
tcp_nv.c
tcp_offload.c
tcp_output.c tcp: make TCP_USER_TIMEOUT accurate for zero window probes 2021-02-03 23:28:51 +01:00
tcp_rate.c
tcp_recovery.c tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN 2021-02-03 23:28:52 +01:00
tcp_scalable.c
tcp_timer.c tcp: make TCP_USER_TIMEOUT accurate for zero window probes 2021-02-03 23:28:51 +01:00
tcp_ulp.c
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tcp.c tcp: add sanity tests to TCP_QUEUE_SEQ 2021-03-17 17:06:11 +01:00
tunnel4.c
udp_bpf.c
udp_diag.c
udp_impl.h
udp_offload.c net: Fix gro aggregation for udp encaps with zero csum 2021-03-17 17:06:11 +01:00
udp_tunnel_core.c
udp_tunnel_nic.c
udp_tunnel_stub.c
udp.c udp: fix race between close() and udp_abort() 2021-06-23 14:42:42 +02:00
udplite.c
xfrm4_input.c
xfrm4_output.c
xfrm4_policy.c
xfrm4_protocol.c
xfrm4_state.c
xfrm4_tunnel.c