linux_dsm_epyc7002/net/tipc
Jon Paul Maloy 10724cc7bb tipc: redesign connection-level flow control
There are two flow control mechanisms in TIPC; one at link level that
handles network congestion, burst control, and retransmission, and one
at connection level which' only remaining task is to prevent overflow
in the receiving socket buffer. In TIPC, the latter task has to be
solved end-to-end because messages can not be thrown away once they
have been accepted and delivered upwards from the link layer, i.e, we
can never permit the receive buffer to overflow.

Currently, this algorithm is message based. A counter in the receiving
socket keeps track of number of consumed messages, and sends a dedicated
acknowledge message back to the sender for each 256 consumed message.
A counter at the sending end keeps track of the sent, not yet
acknowledged messages, and blocks the sender if this number ever reaches
512 unacknowledged messages. When the missing acknowledge arrives, the
socket is then woken up for renewed transmission. This works well for
keeping the message flow running, as it almost never happens that a
sender socket is blocked this way.

A problem with the current mechanism is that it potentially is very
memory consuming. Since we don't distinguish between small and large
messages, we have to dimension the socket receive buffer according
to a worst-case of both. I.e., the window size must be chosen large
enough to sustain a reasonable throughput even for the smallest
messages, while we must still consider a scenario where all messages
are of maximum size. Hence, the current fix window size of 512 messages
and a maximum message size of 66k results in a receive buffer of 66 MB
when truesize(66k) = 131k is taken into account. It is possible to do
much better.

This commit introduces an algorithm where we instead use 1024-byte
blocks as base unit. This unit, always rounded upwards from the
actual message size, is used when we advertise windows as well as when
we count and acknowledge transmitted data. The advertised window is
based on the configured receive buffer size in such a way that even
the worst-case truesize/msgsize ratio always is covered. Since the
smallest possible message size (from a flow control viewpoint) now is
1024 bytes, we can safely assume this ratio to be less than four, which
is the value we are now using.

This way, we have been able to reduce the default receive buffer size
from 66 MB to 2 MB with maintained performance.

In order to keep this solution backwards compatible, we introduce a
new capability bit in the discovery protocol, and use this throughout
the message sending/reception path to always select the right unit.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 15:51:16 -04:00
..
addr.c tipc: simplify include dependencies 2015-05-14 12:24:45 -04:00
addr.h tipc: simplify include dependencies 2015-05-14 12:24:45 -04:00
bcast.c tipc: remove pre-allocated message header in link struct 2016-03-06 23:01:20 -05:00
bcast.h tipc: remove pre-allocated message header in link struct 2016-03-06 23:01:20 -05:00
bearer.c tipc: stricter filtering of packets in bearer layer 2016-04-07 17:00:13 -04:00
bearer.h tipc: remove remnants of old broadcast code 2016-04-13 17:49:11 -04:00
core.c tipc: redesign connection-level flow control 2016-05-03 15:51:16 -04:00
core.h tipc: make dist queue pernet 2016-04-11 15:22:20 -04:00
discover.c tipc: eliminate buffer leak in bearer layer 2016-04-07 17:00:13 -04:00
discover.h tipc: eliminate buffer leak in bearer layer 2016-04-07 17:00:13 -04:00
eth_media.c tipc: make media address offset a common define 2015-02-27 18:18:48 -05:00
ib_media.c tipc: rename media/msg related definitions 2015-02-27 18:18:48 -05:00
Kconfig tipc: add ip/udp media type 2015-03-05 22:08:42 -05:00
link.c tipc: fix stale links after re-enabling bearer 2016-04-24 14:35:07 -04:00
link.h tipc: let first message on link be a state message 2016-04-15 16:09:06 -04:00
Makefile tipc: add ip/udp media type 2015-03-05 22:08:42 -05:00
msg.c tipc: let broadcast packet reception use new link receive function 2015-10-24 06:56:37 -07:00
msg.h tipc: redesign connection-level flow control 2016-05-03 15:51:16 -04:00
name_distr.c tipc: purge deferred updates from dead nodes 2016-04-11 15:22:20 -04:00
name_distr.h tipc: reduce code dependency between binding table and node layer 2015-11-20 14:06:10 -05:00
name_table.c tipc: move netlink policies to netlink.c 2016-03-07 14:56:41 -05:00
name_table.h tipc: convert legacy nl name table dump to nl compat 2015-02-09 13:20:48 -08:00
net.c tipc: move netlink policies to netlink.c 2016-03-07 14:56:41 -05:00
net.h tipc: make tipc node table aware of net namespace 2015-01-12 16:24:32 -05:00
netlink_compat.c tipc: fix null deref crash in compat config path 2016-02-25 17:04:48 -05:00
netlink.c tipc: move netlink policies to netlink.c 2016-03-07 14:56:41 -05:00
netlink.h tipc: move netlink policies to netlink.c 2016-03-07 14:56:41 -05:00
node.c tipc: propagate peer node capabilities to socket layer 2016-05-03 15:51:15 -04:00
node.h tipc: redesign connection-level flow control 2016-05-03 15:51:16 -04:00
server.c tipc: fix a race condition leading to subscriber refcnt bug 2016-04-14 16:46:46 -04:00
server.h tipc: fix a race condition leading to subscriber refcnt bug 2016-04-14 16:46:46 -04:00
socket.c tipc: redesign connection-level flow control 2016-05-03 15:51:16 -04:00
socket.h tipc: redesign connection-level flow control 2016-05-03 15:51:16 -04:00
subscr.c tipc: remove an unnecessary NULL check 2016-04-28 16:54:12 -04:00
subscr.h tipc: remove struct tipc_name_seq from struct tipc_subscription 2016-02-06 03:40:43 -05:00
sysctl.c tipc: add name distributor resiliency queue 2014-09-01 17:51:48 -07:00
udp_media.c tipc: make sure IPv6 header fits in skb headroom 2016-03-14 12:23:12 -04:00