linux_dsm_epyc7002/net/ipv6
Holger Eitzenberger a452ce345d net: Fix memory leak if TPROXY used with TCP early demux
I see a memory leak when using a transparent HTTP proxy using TPROXY
together with TCP early demux and Kernel v3.8.13.15 (Ubuntu stable):

unreferenced object 0xffff88008cba4a40 (size 1696):
  comm "softirq", pid 0, jiffies 4294944115 (age 8907.520s)
  hex dump (first 32 bytes):
    0a e0 20 6a 40 04 1b 37 92 be 32 e2 e8 b4 00 00  .. j@..7..2.....
    02 00 07 01 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff810b710a>] kmem_cache_alloc+0xad/0xb9
    [<ffffffff81270185>] sk_prot_alloc+0x29/0xc5
    [<ffffffff812702cf>] sk_clone_lock+0x14/0x283
    [<ffffffff812aaf3a>] inet_csk_clone_lock+0xf/0x7b
    [<ffffffff8129a893>] netlink_broadcast+0x14/0x16
    [<ffffffff812c1573>] tcp_create_openreq_child+0x1b/0x4c3
    [<ffffffff812c033e>] tcp_v4_syn_recv_sock+0x38/0x25d
    [<ffffffff812c13e4>] tcp_check_req+0x25c/0x3d0
    [<ffffffff812bf87a>] tcp_v4_do_rcv+0x287/0x40e
    [<ffffffff812a08a7>] ip_route_input_noref+0x843/0xa55
    [<ffffffff812bfeca>] tcp_v4_rcv+0x4c9/0x725
    [<ffffffff812a26f4>] ip_local_deliver_finish+0xe9/0x154
    [<ffffffff8127a927>] __netif_receive_skb+0x4b2/0x514
    [<ffffffff8127aa77>] process_backlog+0xee/0x1c5
    [<ffffffff8127c949>] net_rx_action+0xa7/0x200
    [<ffffffff81209d86>] add_interrupt_randomness+0x39/0x157

But there are many more, resulting in the machine going OOM after some
days.

From looking at the TPROXY code, and with help from Florian, I see
that the memory leak is introduced in tcp_v4_early_demux():

  void tcp_v4_early_demux(struct sk_buff *skb)
  {
    /* ... */

    iph = ip_hdr(skb);
    th = tcp_hdr(skb);

    if (th->doff < sizeof(struct tcphdr) / 4)
        return;

    sk = __inet_lookup_established(dev_net(skb->dev), &tcp_hashinfo,
                       iph->saddr, th->source,
                       iph->daddr, ntohs(th->dest),
                       skb->skb_iif);
    if (sk) {
        skb->sk = sk;

where the socket is assigned unconditionally to skb->sk, also bumping
the refcnt on it.  This is problematic, because in our case the skb
has already a socket assigned in the TPROXY target.  This then results
in the leak I see.

The very same issue seems to be with IPv6, but haven't tested.

Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27 16:22:11 -08:00
..
netfilter netfilter: nf_tables: fix error path in the init functions 2014-01-09 23:25:48 +01:00
addrconf_core.c ipv6: move in6_dev_finish_destroy() into core kernel 2013-08-31 22:30:00 -04:00
addrconf.c ipv6: reallocate addrconf router for ipv6 address when lo device up 2014-01-24 15:59:38 -08:00
addrlabel.c ipv6: fix null pointer dereference in __ip6addrlbl_add 2013-09-04 14:14:53 -04:00
af_inet6.c ipv6: add flowlabel_consistency sysctl 2014-01-19 17:12:31 -08:00
ah6.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
anycast.c ipv6: enable anycast addresses as source addresses for datagrams 2014-01-22 21:57:05 -08:00
datagram.c ipv6: enable anycast addresses as source addresses for datagrams 2014-01-22 21:57:05 -08:00
esp6.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
exthdrs_core.c ipv6: Correct comparisons and calculations using skb->tail and skb-transport_header 2013-05-28 23:49:07 -07:00
exthdrs_offload.c
exthdrs.c ipv6/exthdrs: accept tlv which includes only padding 2013-09-11 15:52:27 -04:00
fib6_rules.c ipv6: move IPV6_TCLASS_SHIFT into ipv6.h and define a helper 2014-01-15 15:53:18 -08:00
icmp.c ipv6: enable anycast addresses as source addresses in ICMPv6 error messages 2014-01-21 16:53:23 -08:00
inet6_connection_sock.c net: Remove FLOWI_FLAG_CAN_SLEEP 2013-12-06 07:24:39 +01:00
inet6_hashtables.c inet: convert inet_ehash_secret and ipv6_hash_secret to net_get_random_once 2013-10-19 19:45:35 -04:00
ip6_checksum.c
ip6_fib.c ipv6: remove prune parameter for fib6_clean_all 2014-01-02 03:30:35 -05:00
ip6_flowlabel.c ipv6: add flowlabel_consistency sysctl 2014-01-19 17:12:31 -08:00
ip6_gre.c ipv6: move IPV6_TCLASS_SHIFT into ipv6.h and define a helper 2014-01-15 15:53:18 -08:00
ip6_icmp.c
ip6_input.c net: Fix memory leak if TPROXY used with TCP early demux 2014-01-27 16:22:11 -08:00
ip6_offload.c net-gre-gro: Add GRE support to the GRO stack 2014-01-07 16:21:31 -05:00
ip6_offload.h
ip6_output.c ipv6: introduce ip6_dst_mtu_forward and protect forwarding path with it 2014-01-13 11:22:54 -08:00
ip6_tunnel.c ipv6: move IPV6_TCLASS_SHIFT into ipv6.h and define a helper 2014-01-15 15:53:18 -08:00
ip6_vti.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-01-14 14:42:42 -08:00
ip6mr.c net: avoid reference counter overflows on fib_rules in multicast forwarding 2014-01-14 17:37:25 -08:00
ipcomp6.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
ipv6_sockglue.c ipv6: make IPV6_RECVPKTINFO work for ipv4 datagrams 2014-01-19 19:53:18 -08:00
Kconfig ipv6: Remove privacy config option. 2013-10-28 20:07:50 -04:00
Makefile ipv6: Add support for IPsec virtual tunnel interfaces 2013-10-10 12:00:01 +02:00
mcast.c ipv6: send Change Status Report after DAD is completed 2014-01-17 18:12:29 -08:00
mip6.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
ndisc.c neigh: use tbl->family to distinguish ipv4 from ipv6 2013-12-09 20:56:12 -05:00
netfilter.c
output_core.c ipv6: move ip6_local_out into core kernel 2013-08-31 22:30:00 -04:00
ping.c ipv6: protect protocols not handling ipv4 from v4 connection/bind attempts 2014-01-21 16:59:19 -08:00
proc.c net: add SNMP counters tracking incoming ECN bits 2013-08-08 22:24:59 -07:00
protocol.c net: remove outdated comment for ipv4 and ipv6 protocol handler 2013-11-28 18:47:51 -05:00
raw.c ipv6: protect protocols not handling ipv4 from v4 connection/bind attempts 2014-01-21 16:59:19 -08:00
reassembly.c ipv6: split inet6_hash_frag for netfilter and initialize secrets with net_get_random_once 2013-10-23 17:01:40 -04:00
route.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-01-06 17:37:45 -05:00
sit.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-01-06 17:37:45 -05:00
syncookies.c net: Remove FLOWI_FLAG_CAN_SLEEP 2013-12-06 07:24:39 +01:00
sysctl_net_ipv6.c ipv6: add flowlabel_consistency sysctl 2014-01-19 17:12:31 -08:00
tcp_ipv6.c tcp: delete redundant calls of tcp_mtup_init() 2014-01-21 16:52:31 -08:00
tcpv6_offload.c net-gro: Prepare GRO stack for the upcoming tunneling support 2013-12-12 13:47:53 -05:00
tunnel6.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
udp_impl.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
udp_offload.c ipv6: fix headroom calculation in udp6_ufo_fragment 2013-11-05 22:09:53 -05:00
udp.c ipv6: make IPV6_RECVPKTINFO work for ipv4 datagrams 2014-01-19 19:53:18 -08:00
udplite.c
xfrm6_input.c
xfrm6_mode_beet.c
xfrm6_mode_ro.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
xfrm6_mode_transport.c
xfrm6_mode_tunnel.c ipv6: Add a receive path hook for vti6 in xfrm6_mode_tunnel. 2013-10-09 13:16:36 +02:00
xfrm6_output.c xfrm: revert ipv4 mtu determination to dst_mtu 2013-08-26 12:40:53 +02:00
xfrm6_policy.c xfrm: Fix null pointer dereference when decoding sessions 2013-11-01 07:08:46 +01:00
xfrm6_state.c xfrm: make local error reporting more robust 2013-08-14 13:07:12 +02:00
xfrm6_tunnel.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00