linux_dsm_epyc7002/include/net
Nandita Dukkipati 6ba8a3b19e tcp: Tail loss probe (TLP)
This patch series implement the Tail loss probe (TLP) algorithm described
in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The
first patch implements the basic algorithm.

TLP's goal is to reduce tail latency of short transactions. It achieves
this by converting retransmission timeouts (RTOs) occuring due
to tail losses (losses at end of transactions) into fast recovery.
TLP transmits one packet in two round-trips when a connection is in
Open state and isn't receiving any ACKs. The transmitted packet, aka
loss probe, can be either new or a retransmission. When there is tail
loss, the ACK from a loss probe triggers FACK/early-retransmit based
fast recovery, thus avoiding a costly RTO. In the absence of loss,
there is no change in the connection state.

PTO stands for probe timeout. It is a timer event indicating
that an ACK is overdue and triggers a loss probe packet. The PTO value
is set to max(2*SRTT, 10ms) and is adjusted to account for delayed
ACK timer when there is only one oustanding packet.

TLP Algorithm

On transmission of new data in Open state:
  -> packets_out > 1: schedule PTO in max(2*SRTT, 10ms).
  -> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms)
  -> PTO = min(PTO, RTO)

Conditions for scheduling PTO:
  -> Connection is in Open state.
  -> Connection is either cwnd limited or no new data to send.
  -> Number of probes per tail loss episode is limited to one.
  -> Connection is SACK enabled.

When PTO fires:
  new_segment_exists:
    -> transmit new segment.
    -> packets_out++. cwnd remains same.

  no_new_packet:
    -> retransmit the last segment.
       Its ACK triggers FACK or early retransmit based recovery.

ACK path:
  -> rearm RTO at start of ACK processing.
  -> reschedule PTO if need be.

In addition, the patch includes a small variation to the Early Retransmit
(ER) algorithm, such that ER and TLP together can in principle recover any
N-degree of tail loss through fast recovery. TLP is controlled by the same
sysctl as ER, tcp_early_retrans sysctl.
tcp_early_retrans==0; disables TLP and ER.
		 ==1; enables RFC5827 ER.
		 ==2; delayed ER.
		 ==3; TLP and delayed ER. [DEFAULT]
		 ==4; TLP only.

The TLP patch series have been extensively tested on Google Web servers.
It is most effective for short Web trasactions, where it reduced RTOs by 15%
and improved HTTP response time (average by 6%, 99th percentile by 10%).
The transmitted probes account for <0.5% of the overall transmissions.

Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-12 08:30:34 -04:00
..
9p 9p: turn fid->dlist into hlist 2013-02-27 22:51:08 -05:00
bluetooth Bluetooth: Keep track of UUID type upon addition 2013-02-01 15:50:17 -02:00
caif caif-hsi: Remove use of module parameters 2012-06-25 16:44:12 -07:00
irda treewide: Replace incomming with incoming in all comments and strings 2013-01-03 16:15:49 +01:00
iucv af_iucv: add shutdown for HS transport 2012-03-07 22:52:24 -08:00
netfilter netfilter: nf_ct_helper: better logging for dropped packets 2013-02-19 02:48:05 +01:00
netns xfrm: make gc_thresh configurable in all namespaces 2013-02-06 11:36:29 +01:00
nfc NFC: Initial Secure Element API 2013-01-10 00:51:54 +01:00
phonet net: remove my future former mail address 2012-06-17 16:29:38 -07:00
sctp hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
tc_act
act_api.h act_police: move struct tcf_police to act_police.c 2013-02-12 18:59:45 -05:00
addrconf.h ipv4: introduce address lifetime 2013-01-29 13:59:57 -05:00
af_ieee802154.h
af_rxrpc.h
af_unix.h unix: Remove unused field from unix_sock 2012-10-21 20:37:06 -04:00
ah.h
arp.h net: Dont use ifindices in hash fns 2012-08-09 16:18:06 -07:00
atmclip.h atm: clip: Use device neigh support on top of "arp_tbl". 2011-11-30 18:51:03 -05:00
ax25.h hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ax88796.h
cfg80211-wext.h cfg80211: remove unused wext handler exports 2011-08-08 14:26:29 -04:00
cfg80211.h Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2013-02-18 13:47:13 -05:00
checksum.h net: core: add function for incremental IPv6 pseudo header checksum updates 2012-08-30 03:00:16 +02:00
cipso_ipv4.h cipso: handle CIPSO options correctly when NetLabel is disabled 2012-06-01 14:18:29 -04:00
cls_cgroup.h net: Update args to dummy sock_update_classid(). 2012-10-26 05:07:00 -04:00
codel.h codel: refine one condition to avoid a nul rec_inv_sqrt 2012-08-10 16:52:54 -07:00
compat.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
datalink.h
dcbevent.h dcb: Add stub routines for !CONFIG_DCB 2011-10-06 15:49:51 -04:00
dcbnl.h net/dcb: Add an optional max rate attribute 2012-04-05 05:08:04 -04:00
dn_dev.h
dn_fib.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
dn_neigh.h
dn_nsp.h
dn_route.h decnet: use correct RCU API to deref sk_dst_cache field 2013-01-28 00:15:27 -05:00
dn.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
dsa.h dsa: Include linux/if_ether.h to fix build error 2011-12-01 11:41:06 -05:00
dsfield.h ipv6: Optimize ipv6_change_dsfield(). 2013-01-09 23:59:53 -08:00
dst_ops.h net: Fix warnings in dst_ops.h 2012-07-19 10:43:03 -07:00
dst.h ipv6: fix race condition regarding dst->expires and dst->from. 2013-02-20 15:11:45 -05:00
esp.h
ethoc.h
fib_rules.h ipv4: Elide fib_validate_source() completely when possible. 2012-06-29 01:36:36 -07:00
flow_keys.h flow_dissector: use a 64bit load/store 2011-11-29 13:17:03 -05:00
flow.h ipv4: Add FLOWI_FLAG_KNOWN_NH 2012-10-08 17:42:36 -04:00
garp.h
gen_stats.h
genetlink.h netlink: Rename pid to portid to avoid confusion 2012-09-10 15:30:41 -04:00
gre.h
gro_cells.h gro: Fix kcalloc argument order 2013-01-27 22:46:33 -05:00
icmp.h ipv4: fix error handling in icmp_protocol. 2013-02-22 15:10:18 -05:00
ieee80211_radiotap.h mac80211: support (partial) VHT radiotap information 2012-11-27 11:56:18 +01:00
ieee802154_netdev.h mac802154: declare reduced mlme operations 2012-05-16 15:16:56 -04:00
ieee802154.h 6LoWPAN: add fragmentation support 2011-11-14 00:19:42 -05:00
if_inet6.h net: delete all instances of special processing for token ring 2012-05-15 20:14:35 -04:00
inet6_connection_sock.h ipv6: Add helper inet6_csk_update_pmtu(). 2012-07-16 03:44:56 -07:00
inet6_hashtables.h ipv6: use a stronger hash for tcp 2013-02-21 18:15:58 -05:00
inet_common.h net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN) 2012-07-19 11:02:03 -07:00
inet_connection_sock.h tcp: Tail loss probe (TLP) 2013-03-12 08:30:34 -04:00
inet_ecn.h tunnel: drop packet if ECN present with not-ECT 2012-09-27 18:12:37 -04:00
inet_frag.h net: fix possible deadlock in sum_frag_mem_limit 2013-02-22 15:10:19 -05:00
inet_hashtables.h hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
inet_sock.h ipv6: use a stronger hash for tcp 2013-02-21 18:15:58 -05:00
inet_timewait_sock.h hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
inetpeer.h ipv4: Maintain redirect and PMTU info in struct rtable again. 2012-07-10 22:40:14 -07:00
ip6_checksum.h ipv6: move csum_ipv6_magic() and udp6_csum_init() into static library 2013-01-08 17:56:10 -08:00
ip6_fib.h ipv6: fix race condition regarding dst->expires and dst->from. 2013-02-20 15:11:45 -05:00
ip6_route.h ipv6: Remove unused neigh argument for icmp6_dst_alloc() and its callers. 2013-01-18 14:41:13 -05:00
ip6_tunnel.h ipv6: introduce ip6tunnel_xmit() helper 2013-03-10 16:53:34 -04:00
ip_fib.h ipv4: add a fib_type to fib_info 2012-10-04 13:58:26 -04:00
ip_vs.h ipvs: fix build errors related to config option combinations 2012-10-23 09:23:40 +09:00
ip.h ipv4: Add a socket release callback for datagram sockets 2013-01-21 14:17:05 -05:00
ipcomp.h
ipconfig.h
ipip.h tunnel: use iptunnel_xmit() again 2013-03-10 03:05:44 -04:00
ipv6.h ipv6: introdcue __ipv6_addr_needs_scope_id and ipv6_iface_scope_id helper functions 2013-03-08 12:29:22 -05:00
ipx.h
iw_handler.h
lapb.h lapb: Neaten debugging 2012-05-17 18:45:20 -04:00
lib80211.h include: replace linux/module.h with "struct module" wherever possible 2011-10-31 19:32:32 -04:00
llc_c_ac.h
llc_c_ev.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
llc_c_st.h
llc_conn.h
llc_if.h
llc_pdu.h net: delete all instances of special processing for token ring 2012-05-15 20:14:35 -04:00
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
llc.h llc: Remove stray reference to sysctl_llc_station_ack_timeout. 2012-09-17 13:13:24 -04:00
mac80211.h mac80211: stop modifying HT SMPS capability 2013-02-15 09:41:41 +01:00
mac802154.h mac802154: add wpan device-class support 2012-06-26 21:06:11 -07:00
mip6.h
mld.h
mrp.h net/802: Implement Multiple Registration Protocol (MRP) 2013-02-10 20:37:22 -05:00
ndisc.h ndisc: Move ndisc_opt_addr_space() to include/net/ndisc.h. 2013-01-21 13:33:14 -05:00
neighbour.h net neighbour, decnet: Ensure to align device private data on preferred alignment. 2013-02-11 00:21:44 -05:00
net_namespace.h proc: Usable inode numbers for the namespace file descriptors. 2012-11-20 04:19:49 -08:00
net_ratelimit.h
netdma.h
netevent.h ipv6 netevent: Remove old_neigh from netevent_redirect. 2013-01-14 15:04:59 -05:00
netlabel.h userns: Convert the audit loginuid to be a kuid 2012-09-17 18:08:54 -07:00
netlink.h netlink: Rename pid to portid to avoid confusion 2012-09-10 15:30:41 -04:00
netprio_cgroup.h netprio_cgroup: use cgroup->id instead of cgroup_netprio_state->prioidx 2012-11-22 07:32:47 -08:00
netrom.h hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
nexthop.h
nl802154.h
p8022.h
ping.h
pkt_cls.h pkt_sched: namespace aware act_mirred 2013-01-14 15:09:36 -05:00
pkt_sched.h sch_api: introduce qdisc_watchdog_schedule_ns() 2013-02-12 18:59:45 -05:00
protocol.h net: Remove code duplication between offload structures 2012-11-15 17:39:51 -05:00
psnap.h
raw.h
rawv6.h ipv6: bool/const conversions phase2 2012-05-19 01:08:16 -04:00
red.h net_sched: red: Make minor corrections to comments 2012-04-16 23:53:11 -04:00
regulatory.h regulatory: use RCU to protect last_request 2013-01-03 13:01:30 +01:00
request_sock.h tcp: better retrans tracking for defer-accept 2012-11-03 14:45:00 -04:00
rose.h
route.h ipv4: avoid a test in ip_rt_put() 2012-11-03 14:59:04 -04:00
rtnetlink.h rtnelink: remove unused parameter from rtnl_create_link(). 2012-11-30 12:24:40 -05:00
sch_generic.h hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
scm.h net: Remove unnecessary NULL check in scm_destroy(). 2012-09-24 15:52:33 -04:00
secure_seq.h tcp: add const qualifiers where possible 2011-10-21 05:22:42 -04:00
slhc_vj.h
snmp.h net: avoid reloads in SNMP_UPD_PO_STATS 2012-08-06 13:40:47 -07:00
sock.h hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
stp.h
tcp_memcontrol.h cgroup: pass struct mem_cgroup instead of struct cgroup to socket memcg 2012-04-10 10:04:07 -07:00
tcp_states.h
tcp.h tcp: Tail loss probe (TLP) 2013-03-12 08:30:34 -04:00
timewait_sock.h [PATCH] tcp: Cache inetpeer in timewait socket, and only when necessary. 2012-06-09 14:56:12 -07:00
transp_v6.h ipv6: rename datagram_send_ctl and datagram_recv_ctl 2013-01-31 13:53:08 -05:00
udp.h net/ipv6/udp: UDP encapsulation: introduce encap_rcv hook into IPv6 2012-04-28 22:21:51 -04:00
udplite.h net: ipv4: Standardize prefixes for message logging 2012-03-12 17:05:21 -07:00
wext.h
wimax.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
wpan-phy.h mac802154: monitor device support 2012-05-16 15:17:08 -04:00
x25.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
x25device.h
xfrm.h Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2013-02-14 13:29:20 -05:00