linux_dsm_epyc7002/net/ipv4
Holger Eitzenberger a452ce345d net: Fix memory leak if TPROXY used with TCP early demux
I see a memory leak when using a transparent HTTP proxy using TPROXY
together with TCP early demux and Kernel v3.8.13.15 (Ubuntu stable):

unreferenced object 0xffff88008cba4a40 (size 1696):
  comm "softirq", pid 0, jiffies 4294944115 (age 8907.520s)
  hex dump (first 32 bytes):
    0a e0 20 6a 40 04 1b 37 92 be 32 e2 e8 b4 00 00  .. j@..7..2.....
    02 00 07 01 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff810b710a>] kmem_cache_alloc+0xad/0xb9
    [<ffffffff81270185>] sk_prot_alloc+0x29/0xc5
    [<ffffffff812702cf>] sk_clone_lock+0x14/0x283
    [<ffffffff812aaf3a>] inet_csk_clone_lock+0xf/0x7b
    [<ffffffff8129a893>] netlink_broadcast+0x14/0x16
    [<ffffffff812c1573>] tcp_create_openreq_child+0x1b/0x4c3
    [<ffffffff812c033e>] tcp_v4_syn_recv_sock+0x38/0x25d
    [<ffffffff812c13e4>] tcp_check_req+0x25c/0x3d0
    [<ffffffff812bf87a>] tcp_v4_do_rcv+0x287/0x40e
    [<ffffffff812a08a7>] ip_route_input_noref+0x843/0xa55
    [<ffffffff812bfeca>] tcp_v4_rcv+0x4c9/0x725
    [<ffffffff812a26f4>] ip_local_deliver_finish+0xe9/0x154
    [<ffffffff8127a927>] __netif_receive_skb+0x4b2/0x514
    [<ffffffff8127aa77>] process_backlog+0xee/0x1c5
    [<ffffffff8127c949>] net_rx_action+0xa7/0x200
    [<ffffffff81209d86>] add_interrupt_randomness+0x39/0x157

But there are many more, resulting in the machine going OOM after some
days.

From looking at the TPROXY code, and with help from Florian, I see
that the memory leak is introduced in tcp_v4_early_demux():

  void tcp_v4_early_demux(struct sk_buff *skb)
  {
    /* ... */

    iph = ip_hdr(skb);
    th = tcp_hdr(skb);

    if (th->doff < sizeof(struct tcphdr) / 4)
        return;

    sk = __inet_lookup_established(dev_net(skb->dev), &tcp_hashinfo,
                       iph->saddr, th->source,
                       iph->daddr, ntohs(th->dest),
                       skb->skb_iif);
    if (sk) {
        skb->sk = sk;

where the socket is assigned unconditionally to skb->sk, also bumping
the refcnt on it.  This is problematic, because in our case the skb
has already a socket assigned in the TPROXY target.  This then results
in the leak I see.

The very same issue seems to be with IPv6, but haven't tested.

Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27 16:22:11 -08:00
..
netfilter netfilter: nf_tables: fix error path in the init functions 2014-01-09 23:25:48 +01:00
af_inet.c ipv4: introduce hardened ip_no_pmtu_disc mode 2014-01-13 11:22:55 -08:00
ah4.c ipv4: properly refresh rtable entries on pmtu/redirect events 2013-06-03 00:07:42 -07:00
arp.c ipv4: arp: update neighbour address when a gratuitous arp is received and arp_accept is set 2014-01-02 00:08:38 -05:00
cipso_ipv4.c ipv4: ERROR: code indent should use tabs where possible 2013-12-26 13:43:21 -05:00
datagram.c net: Remove FLOWI_FLAG_CAN_SLEEP 2013-12-06 07:24:39 +01:00
devinet.c net/ipv4: queue work on power efficient wq 2014-01-22 21:57:05 -08:00
esp4.c net: esp{4,6}: get rid of struct esp_data 2013-10-29 06:39:42 +01:00
fib_frontend.c fib_frontend: fix possible NULL pointer dereference 2014-01-24 15:51:26 -08:00
fib_lookup.h ipv4: make fib_detect_death static 2013-12-28 17:01:46 -05:00
fib_rules.c inet: fix NULL pointer Oops in fib(6)_rule_suppress 2013-12-10 17:54:23 -05:00
fib_semantics.c ipv4: make fib_detect_death static 2013-12-28 17:01:46 -05:00
fib_trie.c seq_file: remove "%n" usage from seq_file users 2013-11-15 09:32:20 +09:00
gre_demux.c gre_offload: statically build GRE offloading support 2014-01-06 20:28:34 -05:00
gre_offload.c net/ipv4: don't use module_init in non-modular gre_offload 2014-01-16 16:08:27 -08:00
icmp.c ipv4: introduce hardened ip_no_pmtu_disc mode 2014-01-13 11:22:55 -08:00
igmp.c net: replace macros net_random and net_srandom with direct calls to prandom 2014-01-14 15:15:25 -08:00
inet_connection_sock.c net: replace macros net_random and net_srandom with direct calls to prandom 2014-01-14 15:15:25 -08:00
inet_diag.c inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets 2014-01-13 22:35:46 -08:00
inet_fragment.c inet: remove old fragmentation hash initializing 2013-10-23 17:01:41 -04:00
inet_hashtables.c inet: convert inet_ehash_secret and ipv6_hash_secret to net_get_random_once 2013-10-19 19:45:35 -04:00
inet_lro.c lro: remove dead code 2013-12-29 16:34:25 -05:00
inet_timewait_sock.c tcp/dccp: remove twchain 2013-10-08 23:19:24 -04:00
inetpeer.c ipv4: remove unused function 2013-12-28 17:03:20 -05:00
ip_forward.c ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing 2014-01-13 11:22:54 -08:00
ip_fragment.c net: Add utility functions to clear rxhash 2013-12-17 16:36:21 -05:00
ip_gre.c ipv4: be friend with drop monitor 2014-01-18 23:08:02 -08:00
ip_input.c net: Fix memory leak if TPROXY used with TCP early demux 2014-01-27 16:22:11 -08:00
ip_options.c ipv4: switch and case should be at the same indent 2014-01-02 03:30:36 -05:00
ip_output.c ipv4: register igmp_notifier even when !CONFIG_PROC_FS 2014-01-14 15:03:33 -08:00
ip_sockglue.c ipv6: make IPV6_RECVPKTINFO work for ipv4 datagrams 2014-01-19 19:53:18 -08:00
ip_tunnel_core.c net: Add utility functions to clear rxhash 2013-12-17 16:36:21 -05:00
ip_tunnel.c net: ipv4: Use PTR_ERR_OR_ZERO 2014-01-27 12:57:31 -08:00
ip_vti.c ipv4: be friend with drop monitor 2014-01-18 23:08:02 -08:00
ipcomp.c ipv4: properly refresh rtable entries on pmtu/redirect events 2013-06-03 00:07:42 -07:00
ipconfig.c ipconfig: add informative timeout messages while waiting for carrier 2013-04-02 14:35:33 -04:00
ipip.c ipv4: be friend with drop monitor 2014-01-18 23:08:02 -08:00
ipmr.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-01-18 00:55:41 -08:00
Kconfig net: neighbour: Remove CONFIG_ARPD 2013-09-03 21:41:43 -04:00
Makefile gre_offload: statically build GRE offloading support 2014-01-06 20:28:34 -05:00
netfilter.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
ping.c ipv6: protect protocols not handling ipv4 from v4 connection/bind attempts 2014-01-21 16:59:19 -08:00
proc.c ipv4: spaces required around that '=' 2014-01-02 03:30:36 -05:00
protocol.c net: remove outdated comment for ipv4 and ipv6 protocol handler 2013-11-28 18:47:51 -05:00
raw.c net: add build-time checks for msg->msg_name size 2014-01-18 23:04:16 -08:00
route.c ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing 2014-01-13 11:22:54 -08:00
syncookies.c ipv4: fix checkpatch error "space prohibited" 2013-12-26 13:43:21 -05:00
sysctl_net_ipv4.c ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing 2014-01-13 11:22:54 -08:00
tcp_bic.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_cong.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_cubic.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_diag.c inet_diag: Rename inet_diag_req into inet_diag_req_v2 2012-01-11 12:56:06 -08:00
tcp_fastopen.c tcp: enable sockets to use MSG_FASTOPEN by default 2013-11-04 19:57:47 -05:00
tcp_highspeed.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_htcp.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_hybla.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_illinois.c remove extra definitions of U32_MAX 2014-01-23 16:36:55 -08:00
tcp_input.c tcp: make local functions static 2013-12-29 16:34:24 -05:00
tcp_ipv4.c tcp: delete redundant calls of tcp_mtup_init() 2014-01-21 16:52:31 -08:00
tcp_lp.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_memcontrol.c tcp_memcontrol: Cleanup/fix cg_proto->memory_pressure handling. 2013-12-05 21:01:01 -05:00
tcp_metrics.c tcp: metrics: Handle v6/v4-mapped sockets in tcp-metrics 2014-01-23 12:48:28 -08:00
tcp_minisocks.c ipv6: tcp: fix flowlabel value in ACK messages send from TIME_WAIT 2014-01-17 17:56:33 -08:00
tcp_offload.c tcp: do not export tcp_gso_segment() and tcp_gro_receive() 2014-01-14 18:53:48 -08:00
tcp_output.c tcp: make local functions static 2013-12-29 16:34:24 -05:00
tcp_probe.c ipv4: ERROR: do not initialise globals to 0 or NULL 2013-12-26 13:43:21 -05:00
tcp_scalable.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_timer.c tcp: temporarily disable Fast Open on SYN timeout 2013-10-29 22:50:41 -04:00
tcp_vegas.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_vegas.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
tcp_veno.c tcp: properly handle stretch acks in slow start 2013-11-04 19:57:59 -05:00
tcp_westwood.c tcp: refactor F-RTO 2013-03-21 11:47:50 -04:00
tcp_yeah.c ipv4: ipv4: Cleanup the comments in tcp_yeah.c 2013-12-26 13:43:55 -05:00
tcp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-01-25 11:17:34 -08:00
tunnel4.c net: Convert printks to pr_<level> 2012-03-11 23:42:51 -07:00
udp_diag.c netlink: rename ssk to sk in struct netlink_skb_params 2013-04-19 14:57:56 -04:00
udp_impl.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
udp_offload.c net/udp_offload: Handle static checker complaints 2014-01-23 12:59:08 -08:00
udp.c net: add build-time checks for msg->msg_name size 2014-01-18 23:04:16 -08:00
udplite.c net: ipv4: Standardize prefixes for message logging 2012-03-12 17:05:21 -07:00
xfrm4_input.c net: Add skb_unclone() helper function. 2013-02-15 15:10:37 -05:00
xfrm4_mode_beet.c ipv4: ERROR: code indent should use tabs where possible 2013-12-26 13:43:21 -05:00
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2013-09-30 15:24:57 -04:00
xfrm4_output.c xfrm: revert ipv4 mtu determination to dst_mtu 2013-08-26 12:40:53 +02:00
xfrm4_policy.c xfrm: Fix null pointer dereference when decoding sessions 2013-11-01 07:08:46 +01:00
xfrm4_state.c inet: make no_pmtu_disc per namespace and kill ipv4_config 2013-12-18 16:58:20 -05:00
xfrm4_tunnel.c sit: add IPv4 over IPv4 support 2013-05-31 17:19:05 -07:00