linux_dsm_epyc7002/net/ipv4
Yuchung Cheng b9f64820fb tcp: track data delivery rate for a TCP connection
This patch generates data delivery rate (throughput) samples on a
per-ACK basis. These rate samples can be used by congestion control
modules, and specifically will be used by TCP BBR in later patches in
this series.

Key state:

tp->delivered: Tracks the total number of data packets (original or not)
	       delivered so far. This is an already-existing field.

tp->delivered_mstamp: the last time tp->delivered was updated.

Algorithm:

A rate sample is calculated as (d1 - d0)/(t1 - t0) on a per-ACK basis:

  d1: the current tp->delivered after processing the ACK
  t1: the current time after processing the ACK

  d0: the prior tp->delivered when the acked skb was transmitted
  t0: the prior tp->delivered_mstamp when the acked skb was transmitted

When an skb is transmitted, we snapshot d0 and t0 in its control
block in tcp_rate_skb_sent().

When an ACK arrives, it may SACK and ACK some skbs. For each SACKed
or ACKed skb, tcp_rate_skb_delivered() updates the rate_sample struct
to reflect the latest (d0, t0).

Finally, tcp_rate_gen() generates a rate sample by storing
(d1 - d0) in rs->delivered and (t1 - t0) in rs->interval_us.

One caveat: if an skb was sent with no packets in flight, then
tp->delivered_mstamp may be either invalid (if the connection is
starting) or outdated (if the connection was idle). In that case,
we'll re-stamp tp->delivered_mstamp.

At first glance it seems t0 should always be the time when an skb was
transmitted, but actually this could over-estimate the rate due to
phase mismatch between transmit and ACK events. To track the delivery
rate, we ensure that if packets are in flight then t0 and and t1 are
times at which packets were marked delivered.

If the initial and final RTTs are different then one may be corrupted
by some sort of noise. The noise we see most often is sending gaps
caused by delayed, compressed, or stretched acks. This either affects
both RTTs equally or artificially reduces the final RTT. We approach
this by recording the info we need to compute the initial RTT
(duration of the "send phase" of the window) when we recorded the
associated inflight. Then, for a filter to avoid bandwidth
overestimates, we generalize the per-sample bandwidth computation
from:

    bw = delivered / ack_phase_rtt

to the following:

    bw = delivered / max(send_phase_rtt, ack_phase_rtt)

In large-scale experiments, this filtering approach incorporating
send_phase_rtt is effective at avoiding bandwidth overestimates due to
ACK compression or stretched ACKs.

Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:23:00 -04:00
..
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-09-12 15:52:44 -07:00
af_inet.c gso: Support partial splitting at the frag_list pointer 2016-09-19 20:59:34 -04:00
ah4.c ah4: Fix error return in ah_input(). 2015-08-25 13:38:50 -07:00
arp.c net: rename NET_{ADD|INC}_STATS_BH() 2016-04-27 22:48:24 -04:00
cipso_ipv4.c Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/selinux into next 2016-07-07 10:15:34 +10:00
datagram.c
devinet.c netconf: add a notif when settings are created 2016-09-01 15:18:08 -07:00
esp4.c esp: Fix ESN generation under UDP encapsulation 2016-06-23 11:52:00 -04:00
fib_frontend.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-09-12 15:52:44 -07:00
fib_lookup.h ipv4: consider TOS in fib_select_default 2015-07-24 22:46:11 -07:00
fib_rules.c net: flow: Add l3mdev flow update 2016-09-10 23:12:51 -07:00
fib_semantics.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-09-12 15:52:44 -07:00
fib_trie.c ipv4: fix value of ->nlmsg_flags reported in RTM_NEWROUTE events 2016-09-09 16:50:23 -07:00
fou.c fou: make nla_policy const 2016-09-01 14:09:00 -07:00
gre_demux.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-06-30 05:03:36 -04:00
gre_offload.c gso: Support partial splitting at the frag_list pointer 2016-09-19 20:59:34 -04:00
icmp.c net: icmp: rename ICMPMSGIN_INC_STATS_BH() 2016-04-27 22:48:23 -04:00
igmp.c net/multicast: should not send source list records when have filter mode change 2016-08-08 16:04:39 -07:00
inet_connection_sock.c timers, net/ipv4/inet: Initialize connection request timers as pinned 2016-07-07 10:35:06 +02:00
inet_diag.c net: inet: diag: expose the socket mark to privileged processes. 2016-09-08 16:13:09 -07:00
inet_fragment.c net: disable fragment reassembly if high_thresh is zero 2016-06-05 22:56:42 -04:00
inet_hashtables.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-04 00:52:29 -04:00
inet_timewait_sock.c timers, net/ipv4/inet: Initialize connection request timers as pinned 2016-07-07 10:35:06 +02:00
inetpeer.c net: Add helper function to compare inetpeer addresses 2015-08-28 13:32:36 -07:00
ip_forward.c net/ipv4: Introduce IPSKB_FRAG_SEGS bit to inet_skb_parm.flags 2016-07-19 16:40:22 -07:00
ip_fragment.c net: rename IP_INC_STATS_BH() 2016-04-27 22:48:23 -04:00
ip_gre.c net/ip_tunnels: Introduce tunnel_id_to_key32() and key32_to_tunnel_id() 2016-09-10 20:53:55 -07:00
ip_input.c net: original ingress device index in PKTINFO 2016-05-11 19:31:40 -04:00
ip_options.c net: ipv4: Convert IP network timestamps to be y2038 safe 2016-03-01 17:18:44 -05:00
ip_output.c net: l3mdev: remove redundant calls 2016-09-10 23:12:52 -07:00
ip_sockglue.c ipv4: accept u8 in IP_TOS ancillary data 2016-09-08 17:45:57 -07:00
ip_tunnel_core.c ip_tunnel: do not clear l4 hashes 2016-09-09 19:33:11 -07:00
ip_tunnel.c ip_tunnel: add collect_md mode to IPIP tunnel 2016-09-17 10:13:07 -04:00
ip_vti.c vti: flush x-netns xfrm cache when vti interface is removed 2016-08-09 12:57:49 -07:00
ipcomp.c
ipconfig.c net: ipconfig: Fix NULL pointer dereference on RARP/BOOTP/DHCP timeout 2016-08-22 21:04:41 -07:00
ipip.c ip_tunnel: add collect_md mode to IPIP tunnel 2016-09-17 10:13:07 -04:00
ipmr.c net: ipmr/ip6mr: update lastuse on entry change 2016-07-26 15:18:31 -07:00
Kconfig tcp: add NV congestion control 2016-06-10 23:07:49 -07:00
Makefile tcp: track data delivery rate for a TCP connection 2016-09-21 00:23:00 -04:00
netfilter.c ipv4: Pass struct net into ip_route_me_harder 2015-09-29 20:21:32 +02:00
ping.c sock: enable timestamping using control messages 2016-04-04 15:50:30 -04:00
proc.c tcp: md5: add LINUX_MIB_TCPMD5FAILURE counter 2016-08-25 16:43:11 -07:00
protocol.c
raw.c net: ipv4: Remove l3mdev_get_saddr 2016-09-10 23:12:53 -07:00
route.c net: l3mdev: remove redundant calls 2016-09-10 23:12:52 -07:00
syncookies.c net: rename NET_{ADD|INC}_STATS_BH() 2016-04-27 22:48:24 -04:00
sysctl_net_ipv4.c ipv4: Fix non-initialized TTL when CONFIG_SYSCTL=n 2016-05-23 14:32:06 -07:00
tcp_bic.c tcp: replace cnt & rtt with struct in pkts_acked() 2016-05-11 14:43:19 -04:00
tcp_cdg.c tcp: cdg: rename struct minmax in tcp_cdg.c to avoid a naming conflict 2016-09-21 00:22:59 -04:00
tcp_cong.c tcp: remove tcp_ecn_make_synack() socket argument 2015-09-25 13:00:38 -07:00
tcp_cubic.c tcp: replace cnt & rtt with struct in pkts_acked() 2016-05-11 14:43:19 -04:00
tcp_dctcp.c tcp: return sizeof tcp_dctcp_info in dctcp_get_info() 2016-06-14 23:46:30 -07:00
tcp_diag.c net: diag: Fix refcnt leak in error path destroying socket 2016-08-23 23:11:36 -07:00
tcp_fastopen.c tcp: fastopen: avoid negative sk_forward_alloc 2016-09-08 16:08:10 -07:00
tcp_highspeed.c
tcp_htcp.c tcp: replace cnt & rtt with struct in pkts_acked() 2016-05-11 14:43:19 -04:00
tcp_hybla.c
tcp_illinois.c tcp: replace cnt & rtt with struct in pkts_acked() 2016-05-11 14:43:19 -04:00
tcp_input.c tcp: track data delivery rate for a TCP connection 2016-09-21 00:23:00 -04:00
tcp_ipv4.c tcp: use an RB tree for ooo receive queue 2016-09-08 17:25:58 -07:00
tcp_lp.c tcp: replace cnt & rtt with struct in pkts_acked() 2016-05-11 14:43:19 -04:00
tcp_metrics.c tcp: make nla_policy const 2016-09-01 14:09:01 -07:00
tcp_minisocks.c tcp: use windowed min filter library for TCP min_rtt estimation 2016-09-21 00:22:59 -04:00
tcp_nv.c tcp: add NV congestion control 2016-06-10 23:07:49 -07:00
tcp_offload.c gso: Support partial splitting at the frag_list pointer 2016-09-19 20:59:34 -04:00
tcp_output.c tcp: track data delivery rate for a TCP connection 2016-09-21 00:23:00 -04:00
tcp_probe.c net: ipv4: tcp_probe: Replace timespec with timespec64 2016-03-01 17:18:44 -05:00
tcp_rate.c tcp: track data delivery rate for a TCP connection 2016-09-21 00:23:00 -04:00
tcp_recovery.c tcp: do not assume TCP code is non preemptible 2016-05-02 17:02:25 -04:00
tcp_scalable.c
tcp_timer.c tcp_timer.c: Add kernel-doc function descriptions 2016-07-15 23:18:14 -07:00
tcp_vegas.c tcp: replace cnt & rtt with struct in pkts_acked() 2016-05-11 14:43:19 -04:00
tcp_vegas.h tcp: replace cnt & rtt with struct in pkts_acked() 2016-05-11 14:43:19 -04:00
tcp_veno.c tcp: replace cnt & rtt with struct in pkts_acked() 2016-05-11 14:43:19 -04:00
tcp_westwood.c tcp: replace cnt & rtt with struct in pkts_acked() 2016-05-11 14:43:19 -04:00
tcp_yeah.c tcp: cwnd does not increase in TCP YeAH 2016-09-08 17:16:12 -07:00
tcp.c tcp: switch back to proper tcp_skb_cb size check in tcp_init() 2016-09-21 00:23:00 -04:00
tunnel4.c tunnels: correct conditional build of MPLS and IPv6 2016-07-11 13:27:06 -07:00
udp_diag.c net: inet: diag: expose the socket mark to privileged processes. 2016-09-08 16:13:09 -07:00
udp_impl.h
udp_offload.c gso: Support partial splitting at the frag_list pointer 2016-09-19 20:59:34 -04:00
udp_tunnel.c net: Remove deprecated tunnel specific UDP offload functions 2016-06-17 20:23:32 -07:00
udp.c net: ipv4: Remove l3mdev_get_saddr 2016-09-10 23:12:53 -07:00
udplite.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-08-30 00:54:02 -04:00
xfrm4_input.c netfilter: Pass net into okfn 2015-09-17 17:18:37 -07:00
xfrm4_mode_beet.c
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c
xfrm4_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-24 06:54:12 -07:00
xfrm4_policy.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-09-12 15:52:44 -07:00
xfrm4_protocol.c
xfrm4_state.c
xfrm4_tunnel.c