linux_dsm_epyc7002/net/sctp
Daniel Borkmann 2061dcd6bf net: sctp: fix race for one-to-many sockets in sendmsg's auto associate
I.e. one-to-many sockets in SCTP are not required to explicitly
call into connect(2) or sctp_connectx(2) prior to data exchange.
Instead, they can directly invoke sendmsg(2) and the SCTP stack
will automatically trigger connection establishment through 4WHS
via sctp_primitive_ASSOCIATE(). However, this in its current
implementation is racy: INIT is being sent out immediately (as
it cannot be bundled anyway) and the rest of the DATA chunks are
queued up for later xmit when connection is established, meaning
sendmsg(2) will return successfully. This behaviour can result
in an undesired side-effect that the kernel made the application
think the data has already been transmitted, although none of it
has actually left the machine, worst case even after close(2)'ing
the socket.

Instead, when the association from client side has been shut down
e.g. first gracefully through SCTP_EOF and then close(2), the
client could afterwards still receive the server's INIT_ACK due
to a connection with higher latency. This INIT_ACK is then considered
out of the blue and hence responded with ABORT as there was no
alive assoc found anymore. This can be easily reproduced f.e.
with sctp_test application from lksctp. One way to fix this race
is to wait for the handshake to actually complete.

The fix defers waiting after sctp_primitive_ASSOCIATE() and
sctp_primitive_SEND() succeeded, so that DATA chunks cooked up
from sctp_sendmsg() have already been placed into the output
queue through the side-effect interpreter, and therefore can then
be bundeled together with COOKIE_ECHO control chunks.

strace from example application (shortened):

socket(PF_INET, SOCK_SEQPACKET, IPPROTO_SCTP) = 3
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
           msg_iov(0)=[], msg_controllen=48, {cmsg_len=48, cmsg_level=0x84 /* SOL_??? */, cmsg_type=, ...},
           msg_flags=0}, 0) = 0 // graceful shutdown for SOCK_SEQPACKET via SCTP_EOF
close(3) = 0

tcpdump before patch (fooling the application):

22:33:36.306142 IP 192.168.1.114.41462 > 192.168.1.115.8888: sctp (1) [INIT] [init tag: 3879023686] [rwnd: 106496] [OS: 10] [MIS: 65535] [init TSN: 3139201684]
22:33:36.316619 IP 192.168.1.115.8888 > 192.168.1.114.41462: sctp (1) [INIT ACK] [init tag: 3345394793] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 3380109591]
22:33:36.317600 IP 192.168.1.114.41462 > 192.168.1.115.8888: sctp (1) [ABORT]

tcpdump after patch:

14:28:58.884116 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [INIT] [init tag: 438593213] [rwnd: 106496] [OS: 10] [MIS: 65535] [init TSN: 3092969729]
14:28:58.888414 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [INIT ACK] [init tag: 381429855] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 2141904492]
14:28:58.888638 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [COOKIE ECHO] , (2) [DATA] (B)(E) [TSN: 3092969729] [...]
14:28:58.893278 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [COOKIE ACK] , (2) [SACK] [cum ack 3092969729] [a_rwnd 106491] [#gap acks 0] [#dup tsns 0]
14:28:58.893591 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [DATA] (B)(E) [TSN: 3092969730] [...]
14:28:59.096963 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SACK] [cum ack 3092969730] [a_rwnd 106496] [#gap acks 0] [#dup tsns 0]
14:28:59.097086 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [DATA] (B)(E) [TSN: 3092969731] [...] , (2) [DATA] (B)(E) [TSN: 3092969732] [...]
14:28:59.103218 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SACK] [cum ack 3092969732] [a_rwnd 106486] [#gap acks 0] [#dup tsns 0]
14:28:59.103330 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [SHUTDOWN]
14:28:59.107793 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SHUTDOWN ACK]
14:28:59.107890 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [SHUTDOWN COMPLETE]

Looks like this bug is from the pre-git history museum. ;)

Fixes: 08707d5482df ("lksctp-2_5_31-0_5_1.patch")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-17 23:52:20 -05:00
..
associola.c net: sctp: fix panic on duplicate ASCONF chunks 2014-10-14 12:46:22 -04:00
auth.c net: sctp: fix memory leak in auth key management 2014-11-11 15:19:11 -05:00
bind_addr.c sctp: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
chunk.c switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter 2014-11-24 05:16:40 -05:00
debug.c sctp: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
endpointola.c net: sctp: migrate most recently used transport to ktime 2014-06-11 12:23:17 -07:00
input.c sctp: Change sctp to implement csum_levels 2014-08-29 20:41:11 -07:00
inqueue.c net: sctp: fix remote memory pressure from excessive queueing 2014-10-14 12:46:22 -04:00
ipv6.c sctp: Fixup v4mapped behaviour to comply with Sock API 2014-07-31 21:49:06 -07:00
Kconfig net: sctp: get rid of SCTP_DBG_TSNS entirely 2013-07-02 00:08:03 -07:00
Makefile net: sctp: Inline the functions from command.c 2014-07-08 14:38:48 -07:00
objcnt.c sctp: fix checkpatch errors with (foo*)|foo * bar|foo* bar 2013-12-26 13:47:47 -05:00
output.c net: sctp: use MAX_HEADER for headroom reserve in output path 2014-12-09 13:24:03 -05:00
outqueue.c net: sctp: Rename SCTP_XMIT_NAGLE_DELAY to SCTP_XMIT_DELAY 2014-07-22 13:32:11 -07:00
primitive.c sctp: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
probe.c sctp: loading sctp when load sctp_probe 2013-12-16 20:04:27 -05:00
proc.c sctp: replace seq_printf with seq_puts 2014-10-30 19:40:16 -04:00
protocol.c Merge branch 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2014-10-10 07:26:02 -04:00
sm_make_chunk.c switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter 2014-11-24 05:16:40 -05:00
sm_sideeffect.c net: sctp: Don't transition to PF state when transport has exhausted 'Path.Max.Retrans'. 2014-04-27 23:41:14 -04:00
sm_statefuns.c net: sctp: fix remote memory pressure from excessive queueing 2014-10-14 12:46:22 -04:00
sm_statetable.c sctp: fix checkpatch errors with indent 2013-12-26 13:47:48 -05:00
socket.c net: sctp: fix race for one-to-many sockets in sendmsg's auto associate 2015-01-17 23:52:20 -05:00
ssnmap.c sctp: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
sysctl.c net: sctp: only warn in proc_sctp_do_alpha_beta if write 2014-07-02 18:44:07 -07:00
transport.c sctp: Fixup v4mapped behaviour to comply with Sock API 2014-07-31 21:49:06 -07:00
tsnmap.c sctp: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
ulpevent.c sctp: Fixup v4mapped behaviour to comply with Sock API 2014-07-31 21:49:06 -07:00
ulpqueue.c net: introduce SO_INCOMING_CPU 2014-11-11 13:00:06 -05:00