linux_dsm_epyc7002/net/ipv4
Marcelo Leitner 1f37bf87aa tcp: zero retrans_stamp if all retrans were acked
Ueki Kohei reported that when we are using NewReno with connections that
have a very low traffic, we may timeout the connection too early if a
second loss occurs after the first one was successfully acked but no
data was transfered later. Below is his description of it:

When SACK is disabled, and a socket suffers multiple separate TCP
retransmissions, that socket's ETIMEDOUT value is calculated from the
time of the *first* retransmission instead of the *latest*
retransmission.

This happens because the tcp_sock's retrans_stamp is set once then never
cleared.

Take the following connection:

                      Linux                    remote-machine
                        |                           |
         send#1---->(*1)|--------> data#1 --------->|
                  |     |                           |
                 RTO    :                           :
                  |     |                           |
                 ---(*2)|----> data#1(retrans) ---->|
                  | (*3)|<---------- ACK <----------|
                  |     |                           |
                  |     :                           :
                  |     :                           :
                  |     :                           :
                16 minutes (or more)                :
                  |     :                           :
                  |     :                           :
                  |     :                           :
                  |     |                           |
         send#2---->(*4)|--------> data#2 --------->|
                  |     |                           |
                 RTO    :                           :
                  |     |                           |
                 ---(*5)|----> data#2(retrans) ---->|
                  |     |                           |
                  |     |                           |
                RTO*2   :                           :
                  |     |                           |
                  |     |                           |
      ETIMEDOUT<----(*6)|                           |

(*1) One data packet sent.
(*2) Because no ACK packet is received, the packet is retransmitted.
(*3) The ACK packet is received. The transmitted packet is acknowledged.

At this point the first "retransmission event" has passed and been
recovered from. Any future retransmission is a completely new "event".

(*4) After 16 minutes (to correspond with retries2=15), a new data
packet is sent. Note: No data is transmitted between (*3) and (*4).

The socket's timeout SHOULD be calculated from this point in time, but
instead it's calculated from the prior "event" 16 minutes ago.

(*5) Because no ACK packet is received, the packet is retransmitted.
(*6) At the time of the 2nd retransmission, the socket returns
ETIMEDOUT.

Therefore, now we clear retrans_stamp as soon as all data during the
loss window is fully acked.

Reported-by: Ueki Kohei
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05 16:59:49 -05:00
..
netfilter netfilter: nf_reject_ipv4: split nf_send_reset() in smaller functions 2014-10-31 12:49:05 +01:00
af_inet.c net: gso: use feature flag argument in all protocol gso handlers 2014-10-20 12:38:12 -04:00
ah4.c ipsec: Remove obsolete MAX_AH_AUTH_LEN 2014-09-18 10:54:36 +02:00
arp.c arp: Do not perturb drop profiles with ignored ARP packets 2014-09-28 17:30:35 -04:00
cipso_ipv4.c cipso: add __init to cipso_v4_cache_init 2014-10-01 15:46:20 -04:00
datagram.c net: Save TX flow hash in sock and set in skbuf on xmit 2014-07-07 21:14:21 -07:00
devinet.c ipv4: fail early when creating netdev named all or default 2014-07-29 11:43:50 -07:00
esp4.c esp4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
fib_frontend.c ipv4: Restore accept_local behaviour in fib_validate_source() 2014-08-22 12:23:10 -07:00
fib_lookup.h ipv4: make fib_detect_death static 2013-12-28 17:01:46 -05:00
fib_rules.c inet: fix NULL pointer Oops in fib(6)_rule_suppress 2013-12-10 17:54:23 -05:00
fib_semantics.c ipv4: fix nexthop attlen check in fib_nh_match 2014-10-14 15:59:37 -04:00
fib_trie.c list: fix order of arguments for hlist_add_after(_rcu) 2014-08-06 18:01:24 -07:00
fou.c ipv4: fix a potential use after free in fou.c 2014-10-17 23:45:26 -04:00
geneve.c geneve: Unregister pernet subsys on module unload. 2014-11-05 15:00:51 -05:00
gre_demux.c net: Fix GRE RX to use skb_transport_header for GRE header offset 2014-09-08 15:23:05 -07:00
gre_offload.c gre: Use inner mac length when computing tunnel length 2014-10-30 19:51:56 -04:00
icmp.c icmp: add a global rate limitation 2014-09-23 12:47:38 -04:00
igmp.c ipv4: igmp: fix v3 general query drop monitor false positive 2014-10-06 17:14:54 -04:00
inet_connection_sock.c ipv4: make ip_local_reserved_ports per netns 2014-05-14 15:31:45 -04:00
inet_diag.c inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets 2014-01-13 22:35:46 -08:00
inet_fragment.c inet: frags: remove the WARN_ON from inet_evict_bucket 2014-10-29 15:21:30 -04:00
inet_hashtables.c net: use reciprocal_scale() helper 2014-08-23 12:21:21 -07:00
inet_lro.c lro: remove dead code 2013-12-29 16:34:25 -05:00
inet_timewait_sock.c tcp/dccp: remove twchain 2013-10-08 23:19:24 -04:00
inetpeer.c inet: remove dead inetpeer sequence code 2014-09-08 16:42:42 -07:00
ip_forward.c net: rename local_df to ignore_df 2014-05-12 14:03:41 -04:00
ip_fragment.c inet: frags: add __init to ip4_frags_ctl_register 2014-10-01 15:46:19 -04:00
ip_gre.c net: better IFF_XMIT_DST_RELEASE support 2014-10-07 13:22:11 -04:00
ip_input.c net: Fix memory leak if TPROXY used with TCP early demux 2014-01-27 16:22:11 -08:00
ip_options.c ipv4: rename ip_options_echo to __ip_options_echo() 2014-09-28 16:35:42 -04:00
ip_output.c net: make skb_gso_segment error handling more robust 2014-10-20 12:38:13 -04:00
ip_sockglue.c ipv4: rcu cleanup in ip_ra_control() 2014-09-09 20:10:44 -07:00
ip_tunnel_core.c ipv4: fix a potential use after free in ip_tunnel_core.c 2014-10-17 23:45:26 -04:00
ip_tunnel.c ip_tunnel: Add GUE support 2014-10-03 16:53:33 -07:00
ip_vti.c net: better IFF_XMIT_DST_RELEASE support 2014-10-07 13:22:11 -04:00
ipcomp.c ipcomp4: Use the IPsec protocol multiplexer API 2014-02-25 07:04:17 +01:00
ipconfig.c ipconfig: Use time_before 2014-08-22 12:23:11 -07:00
ipip.c net: better IFF_XMIT_DST_RELEASE support 2014-10-07 13:22:11 -04:00
ipmr.c net: set name_assign_type in alloc_netdev() 2014-07-15 16:12:48 -07:00
Kconfig openvswitch: fix a compilation error when CONFIG_INET is not setW! 2014-10-07 00:10:49 -04:00
Makefile net: Add Geneve tunneling protocol driver 2014-10-06 00:32:20 -04:00
netfilter.c netfilter: remove double colon 2014-02-19 11:41:25 +01:00
ping.c net/ipv4: bind ip_nonlocal_bind to current netns 2014-09-09 11:27:09 -07:00
proc.c inet: frag: don't account number of fragment queues 2014-07-27 22:34:36 -07:00
protocol.c net: Export inet_offloads and inet6_offloads 2014-09-19 17:15:31 -04:00
raw.c ipv4: Make IP_MULTICAST_ALL and IP_MSFILTER work on raw sockets 2014-07-23 15:13:26 -07:00
route.c ipv4: Do not cache routing failures due to disabled forwarding. 2014-10-30 19:20:40 -04:00
syncookies.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-10-18 09:31:37 -07:00
sysctl_net_ipv4.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-10-08 21:40:54 -04:00
tcp_bic.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_cong.c tcp: Change tcp_slow_start function to return void 2014-09-30 17:09:16 -04:00
tcp_cubic.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_dctcp.c net: tcp: add DCTCP congestion control algorithm 2014-09-29 00:13:10 -04:00
tcp_diag.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_fastopen.c tcp: remove unnecessary assignment. 2014-09-29 12:31:12 -04:00
tcp_highspeed.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_htcp.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_hybla.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_illinois.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_input.c tcp: zero retrans_stamp if all retrans were acked 2014-11-05 16:59:49 -05:00
tcp_ipv4.c net: fix saving TX flow hash in sock for outgoing connections 2014-10-22 16:14:29 -04:00
tcp_lp.c tcp: remove in_flight parameter from cong_avoid() methods 2014-05-03 19:23:07 -04:00
tcp_memcontrol.c percpu_counter: add @gfp to percpu_counter_init() 2014-09-08 09:51:29 +09:00
tcp_metrics.c tcp: don't allow syn packets without timestamps to pass tcp_tw_recycle logic 2014-08-14 14:38:54 -07:00
tcp_minisocks.c tcp: change TCP_ECN prefixes to lower case 2014-09-29 14:41:22 -04:00
tcp_offload.c net: Remove gso_send_check as an offload callback 2014-09-26 00:22:47 -04:00
tcp_output.c net: skb_fclone_busy() needs to detect orphaned skb 2014-10-30 19:58:30 -04:00
tcp_probe.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_scalable.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_timer.c tcp: abort orphan sockets stalling on zero window probes 2014-10-01 16:27:52 -04:00
tcp_vegas.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_vegas.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
tcp_veno.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp_westwood.c net: tcp: split ack slow/fast events from cwnd_event 2014-09-29 00:13:10 -04:00
tcp_yeah.c tcp: whitespace fixes 2014-09-01 18:12:45 -07:00
tcp.c tcp: md5: do not use alloc_percpu() 2014-10-25 16:10:04 -04:00
tunnel4.c
udp_diag.c netlink: rename ssk to sk in struct netlink_skb_params 2013-04-19 14:57:56 -04:00
udp_impl.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
udp_offload.c net: gso: use feature flag argument in all protocol gso handlers 2014-10-20 12:38:12 -04:00
udp_tunnel.c udp-tunnel: Add a few more UDP tunnel APIs 2014-09-19 15:57:15 -04:00
udp.c net: merge cases where sock_efree and sock_edemux are the same function 2014-09-05 17:43:45 -07:00
udplite.c net: Eliminate no_check from protosw 2014-05-23 16:28:53 -04:00
xfrm4_input.c xfrm4: Add IPsec protocol multiplexer 2014-02-25 07:04:16 +01:00
xfrm4_mode_beet.c ipv4: ERROR: code indent should use tabs where possible 2013-12-26 13:43:21 -05:00
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
xfrm4_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-05-24 00:32:30 -04:00
xfrm4_policy.c xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly 2014-03-14 07:28:07 +01:00
xfrm4_protocol.c xfrm4: Remove duplicate semicolon 2014-06-30 07:49:47 +02:00
xfrm4_state.c inet: make no_pmtu_disc per namespace and kill ipv4_config 2013-12-18 16:58:20 -05:00
xfrm4_tunnel.c sit: add IPv4 over IPv4 support 2013-05-31 17:19:05 -07:00