linux_dsm_epyc7002/net/sctp
Daniel Borkmann 1be9a950c6 net: sctp: inherit auth_capable on INIT collisions
Jason reported an oops caused by SCTP on his ARM machine with
SCTP authentication enabled:

Internal error: Oops: 17 [#1] ARM
CPU: 0 PID: 104 Comm: sctp-test Not tainted 3.13.0-68744-g3632f30c9b20-dirty #1
task: c6eefa40 ti: c6f52000 task.ti: c6f52000
PC is at sctp_auth_calculate_hmac+0xc4/0x10c
LR is at sg_init_table+0x20/0x38
pc : [<c024bb80>]    lr : [<c00f32dc>]    psr: 40000013
sp : c6f538e8  ip : 00000000  fp : c6f53924
r10: c6f50d80  r9 : 00000000  r8 : 00010000
r7 : 00000000  r6 : c7be4000  r5 : 00000000  r4 : c6f56254
r3 : c00c8170  r2 : 00000001  r1 : 00000008  r0 : c6f1e660
Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 0005397f  Table: 06f28000  DAC: 00000015
Process sctp-test (pid: 104, stack limit = 0xc6f521c0)
Stack: (0xc6f538e8 to 0xc6f54000)
[...]
Backtrace:
[<c024babc>] (sctp_auth_calculate_hmac+0x0/0x10c) from [<c0249af8>] (sctp_packet_transmit+0x33c/0x5c8)
[<c02497bc>] (sctp_packet_transmit+0x0/0x5c8) from [<c023e96c>] (sctp_outq_flush+0x7fc/0x844)
[<c023e170>] (sctp_outq_flush+0x0/0x844) from [<c023ef78>] (sctp_outq_uncork+0x24/0x28)
[<c023ef54>] (sctp_outq_uncork+0x0/0x28) from [<c0234364>] (sctp_side_effects+0x1134/0x1220)
[<c0233230>] (sctp_side_effects+0x0/0x1220) from [<c02330b0>] (sctp_do_sm+0xac/0xd4)
[<c0233004>] (sctp_do_sm+0x0/0xd4) from [<c023675c>] (sctp_assoc_bh_rcv+0x118/0x160)
[<c0236644>] (sctp_assoc_bh_rcv+0x0/0x160) from [<c023d5bc>] (sctp_inq_push+0x6c/0x74)
[<c023d550>] (sctp_inq_push+0x0/0x74) from [<c024a6b0>] (sctp_rcv+0x7d8/0x888)

While we already had various kind of bugs in that area
ec0223ec48 ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if
we/peer is AUTH capable") and b14878ccb7 ("net: sctp: cache
auth_enable per endpoint"), this one is a bit of a different
kind.

Giving a bit more background on why SCTP authentication is
needed can be found in RFC4895:

  SCTP uses 32-bit verification tags to protect itself against
  blind attackers. These values are not changed during the
  lifetime of an SCTP association.

  Looking at new SCTP extensions, there is the need to have a
  method of proving that an SCTP chunk(s) was really sent by
  the original peer that started the association and not by a
  malicious attacker.

To cause this bug, we're triggering an INIT collision between
peers; normal SCTP handshake where both sides intent to
authenticate packets contains RANDOM; CHUNKS; HMAC-ALGO
parameters that are being negotiated among peers:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------

RFC4895 says that each endpoint therefore knows its own random
number and the peer's random number *after* the association
has been established. The local and peer's random number along
with the shared key are then part of the secret used for
calculating the HMAC in the AUTH chunk.

Now, in our scenario, we have 2 threads with 1 non-blocking
SEQ_PACKET socket each, setting up common shared SCTP_AUTH_KEY
and SCTP_AUTH_ACTIVE_KEY properly, and each of them calling
sctp_bindx(3), listen(2) and connect(2) against each other,
thus the handshake looks similar to this, e.g.:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  <--------- INIT[RANDOM; CHUNKS; HMAC-ALGO] -----------
  -------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] -------->
  ...

Since such collisions can also happen with verification tags,
the RFC4895 for AUTH rather vaguely says under section 6.1:

  In case of INIT collision, the rules governing the handling
  of this Random Number follow the same pattern as those for
  the Verification Tag, as explained in Section 5.2.4 of
  RFC 2960 [5]. Therefore, each endpoint knows its own Random
  Number and the peer's Random Number after the association
  has been established.

In RFC2960, section 5.2.4, we're eventually hitting Action B:

  B) In this case, both sides may be attempting to start an
     association at about the same time but the peer endpoint
     started its INIT after responding to the local endpoint's
     INIT. Thus it may have picked a new Verification Tag not
     being aware of the previous Tag it had sent this endpoint.
     The endpoint should stay in or enter the ESTABLISHED
     state but it MUST update its peer's Verification Tag from
     the State Cookie, stop any init or cookie timers that may
     running and send a COOKIE ACK.

In other words, the handling of the Random parameter is the
same as behavior for the Verification Tag as described in
Action B of section 5.2.4.

Looking at the code, we exactly hit the sctp_sf_do_dupcook_b()
case which triggers an SCTP_CMD_UPDATE_ASSOC command to the
side effect interpreter, and in fact it properly copies over
peer_{random, hmacs, chunks} parameters from the newly created
association to update the existing one.

Also, the old asoc_shared_key is being released and based on
the new params, sctp_auth_asoc_init_active_key() updated.
However, the issue observed in this case is that the previous
asoc->peer.auth_capable was 0, and has *not* been updated, so
that instead of creating a new secret, we're doing an early
return from the function sctp_auth_asoc_init_active_key()
leaving asoc->asoc_shared_key as NULL. However, we now have to
authenticate chunks from the updated chunk list (e.g. COOKIE-ACK).

That in fact causes the server side when responding with ...

  <------------------ AUTH; COOKIE-ACK -----------------

... to trigger a NULL pointer dereference, since in
sctp_packet_transmit(), it discovers that an AUTH chunk is
being queued for xmit, and thus it calls sctp_auth_calculate_hmac().

Since the asoc->active_key_id is still inherited from the
endpoint, and the same as encoded into the chunk, it uses
asoc->asoc_shared_key, which is still NULL, as an asoc_key
and dereferences it in ...

  crypto_hash_setkey(desc.tfm, &asoc_key->data[0], asoc_key->len)

... causing an oops. All this happens because sctp_make_cookie_ack()
called with the *new* association has the peer.auth_capable=1
and therefore marks the chunk with auth=1 after checking
sctp_auth_send_cid(), but it is *actually* sent later on over
the then *updated* association's transport that didn't initialize
its shared key due to peer.auth_capable=0. Since control chunks
in that case are not sent by the temporary association which
are scheduled for deletion, they are issued for xmit via
SCTP_CMD_REPLY in the interpreter with the context of the
*updated* association. peer.auth_capable was 0 in the updated
association (which went from COOKIE_WAIT into ESTABLISHED state),
since all previous processing that performed sctp_process_init()
was being done on temporary associations, that we eventually
throw away each time.

The correct fix is to update to the new peer.auth_capable
value as well in the collision case via sctp_assoc_update(),
so that in case the collision migrated from 0 -> 1,
sctp_auth_asoc_init_active_key() can properly recalculate
the secret. This therefore fixes the observed server panic.

Fixes: 730fc3d05c ("[SCTP]: Implete SCTP-AUTH parameter processing")
Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-22 19:56:58 -07:00
..
associola.c net: sctp: inherit auth_capable on INIT collisions 2014-07-22 19:56:58 -07:00
auth.c net: sctp: cache auth_enable per endpoint 2014-04-18 18:32:00 -04:00
bind_addr.c sctp: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
chunk.c sctp: fix checkpatch errors with space required or prohibited 2013-12-26 13:47:47 -05:00
command.c sctp: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
debug.c sctp: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
endpointola.c net: sctp: migrate most recently used transport to ktime 2014-06-11 12:23:17 -07:00
input.c sctp: remove macros sctp_bh_[un]lock_sock 2014-01-21 18:41:36 -08:00
inqueue.c sctp: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
ipv6.c net: Eliminate no_check from protosw 2014-05-23 16:28:53 -04:00
Kconfig net: sctp: get rid of SCTP_DBG_TSNS entirely 2013-07-02 00:08:03 -07:00
Makefile sctp: implement sctp association probing module 2010-04-30 22:41:09 -04:00
objcnt.c sctp: fix checkpatch errors with (foo*)|foo * bar|foo* bar 2013-12-26 13:47:47 -05:00
output.c net: rename local_df to ignore_df 2014-05-12 14:03:41 -04:00
outqueue.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-01-06 17:37:45 -05:00
primitive.c sctp: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
probe.c sctp: loading sctp when load sctp_probe 2013-12-16 20:04:27 -05:00
proc.c snmp: fix some left over of snmp stats 2014-05-14 15:33:47 -04:00
protocol.c net: Eliminate no_check from protosw 2014-05-23 16:28:53 -04:00
sm_make_chunk.c ktime: add ktime_after and ktime_before helper 2014-06-11 12:23:17 -07:00
sm_sideeffect.c net: sctp: Don't transition to PF state when transport has exhausted 'Path.Max.Retrans'. 2014-04-27 23:41:14 -04:00
sm_statefuns.c net: sctp: cache auth_enable per endpoint 2014-04-18 18:32:00 -04:00
sm_statetable.c sctp: fix checkpatch errors with indent 2013-12-26 13:47:48 -05:00
socket.c net: Split sk_no_check into sk_no_check_{rx,tx} 2014-05-23 16:28:53 -04:00
ssnmap.c sctp: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
sysctl.c net: sctp: check proc_dointvec result in proc_sctp_do_auth 2014-06-19 21:30:19 -07:00
transport.c net: sctp: migrate most recently used transport to ktime 2014-06-11 12:23:17 -07:00
tsnmap.c sctp: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
ulpevent.c net: sctp: fix information leaks in ulpevent layer 2014-07-14 14:18:56 -07:00
ulpqueue.c sctp: add support for busy polling to sctp protocol 2014-04-20 18:18:55 -04:00