linux_dsm_epyc7002/drivers/infiniband/sw/rdmavt
Michael J. Ruhl d757c60eca IB/rdmavt: Fix concurrency panics in QP post_send and modify to error
The RC/UC code path can go through a software loopback. In this code path
the sending thread manipulates the receive side QP directly.

If two threads work on the QP receive side at the same time (for example,
post_send via loopback and modify_qp to an error state), the QP's receive
state can be corrupted.

(post_send via loopback)
  set r_sge
  loop
     update r_sge
(modify_qp)
     take r_lock
     update r_sge <---- r_sge is now incorrect
(post_send)
     update r_sge <---- crash, etc.
     ...

This can lead to one of the two following crashes:

 BUG: unable to handle kernel NULL pointer dereference at (null)
  IP:  hfi1_copy_sge+0xf1/0x2e0 [hfi1]
  PGD 8000001fe6a57067 PUD 1fd9e0c067 PMD 0
 Call Trace:
  ruc_loopback+0x49b/0xbc0 [hfi1]
  hfi1_do_send+0x38e/0x3e0 [hfi1]
  _hfi1_do_send+0x1e/0x20 [hfi1]
  process_one_work+0x17f/0x440
  worker_thread+0x126/0x3c0
  kthread+0xd1/0xe0
  ret_from_fork_nospec_begin+0x21/0x21

or:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
  IP:  rvt_clear_mr_refs+0x45/0x370 [rdmavt]
  PGD 80000006ae5eb067 PUD ef15d0067 PMD 0
 Call Trace:
  rvt_error_qp+0xaa/0x240 [rdmavt]
  rvt_modify_qp+0x47f/0xaa0 [rdmavt]
  ib_security_modify_qp+0x8f/0x400 [ib_core]
  ib_modify_qp_with_udata+0x44/0x70 [ib_core]
  modify_qp.isra.23+0x1eb/0x2b0 [ib_uverbs]
  ib_uverbs_modify_qp+0xaa/0xf0 [ib_uverbs]
  ib_uverbs_write+0x272/0x430 [ib_uverbs]
  vfs_write+0xc0/0x1f0
  SyS_write+0x7f/0xf0
  system_call_fastpath+0x1c/0x21

Fix by using the appropriate locking on the receiving QP.
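
The locking idea can be sketched in user space. The following is a minimal illustration, not the actual patch: the struct and function names (demo_qp, demo_sge, demo_loopback_copy, demo_modify_to_error) are hypothetical, and a pthread mutex stands in for the kernel's r_lock. The point it shows is that both the loopback copy and the transition to the error state take the receiver's lock, and that the loopback path re-checks the QP state only after acquiring it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a receive-side SGE. */
struct demo_sge {
    char *vaddr;    /* next destination byte, NULL once torn down */
    size_t length;  /* bytes still available in the buffer */
};

/* Hypothetical, simplified stand-in for the receiving QP. */
struct demo_qp {
    pthread_mutex_t r_lock; /* assumption: mutex in place of the kernel lock */
    bool in_error;          /* set when the QP is moved to the error state */
    struct demo_sge r_sge;  /* receive-side state protected by r_lock */
    size_t copied;          /* total bytes delivered through loopback */
};

void demo_qp_init(struct demo_qp *qp, char *buf, size_t len)
{
    pthread_mutex_init(&qp->r_lock, NULL);
    qp->in_error = false;
    qp->r_sge.vaddr = buf;
    qp->r_sge.length = len;
    qp->copied = 0;
}

/*
 * Loopback send path: every read and update of r_sge happens while the
 * receiver's r_lock is held. Returns the number of bytes copied, or 0 if
 * the QP has already been moved to the error state.
 */
size_t demo_loopback_copy(struct demo_qp *rqp, const char *src, size_t len)
{
    size_t n = 0;

    pthread_mutex_lock(&rqp->r_lock);
    if (!rqp->in_error && rqp->r_sge.vaddr != NULL) {
        n = len < rqp->r_sge.length ? len : rqp->r_sge.length;
        memcpy(rqp->r_sge.vaddr, src, n);
        rqp->r_sge.vaddr += n;
        rqp->r_sge.length -= n;
        rqp->copied += n;
    }
    pthread_mutex_unlock(&rqp->r_lock);
    return n;
}

/*
 * modify_qp-to-error path: tears down r_sge under the same lock, so a
 * concurrent loopback copy sees either the old or the new state, never
 * a half-updated one.
 */
void demo_modify_to_error(struct demo_qp *qp)
{
    pthread_mutex_lock(&qp->r_lock);
    qp->in_error = true;
    qp->r_sge.vaddr = NULL;
    qp->r_sge.length = 0;
    pthread_mutex_unlock(&qp->r_lock);
}
```

In this sketch, a copy that starts after demo_modify_to_error() returns copies nothing instead of dereferencing a cleared pointer. Holding the lock for the whole copy is the simplest choice for an illustration; the granularity used by the real fix is not shown on this page.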

Fixes: 1570346153 ("IB/{hfi1, qib, rdmavt}: Move ruc_loopback to rdmavt")
Cc: <stable@vger.kernel.org> #v4.9+
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-04 15:47:23 -04:00
ah.c RDMA: Mark if destroy address handle is in a sleepable context 2018-12-19 16:28:03 -07:00
ah.h RDMA: Mark if destroy address handle is in a sleepable context 2018-12-19 16:28:03 -07:00
cq.c IB/{hfi1, rdmavt, qib}: Implement CQ completion vector support 2018-05-09 15:53:30 -04:00
cq.h IB/{hfi1, rdmavt, qib}: Implement CQ completion vector support 2018-05-09 15:53:30 -04:00
Kconfig IB/{hfi1, qib, rdmavt}: Move copy SGE logic into rdmavt 2018-10-03 16:38:28 -06:00
mad.c RDMA: Mark if destroy address handle is in a sleepable context 2018-12-19 16:28:03 -07:00
mad.h
Makefile
mcast.c
mcast.h
mmap.c
mmap.h
mr.c RDMA/rdmavt: Adapt to handle non-uniform sizes on umem SGEs 2019-02-13 09:00:43 -07:00
mr.h
pd.c RDMA: Handle PD allocations by IB/core 2019-02-08 16:51:04 -07:00
pd.h RDMA: Handle PD allocations by IB/core 2019-02-08 16:51:04 -07:00
qp.c IB/rdmavt: Fix concurrency panics in QP post_send and modify to error 2019-03-04 15:47:23 -04:00
qp.h IB/{hfi1, qib, rdmavt}: Move copy SGE logic into rdmavt 2018-10-03 16:38:28 -06:00
rc.c IB/hfi: Move RC functions into a header file 2019-02-05 17:51:09 -05:00
srq.c IB/{hw,sw}: Remove 'uobject->context' dependency in object creation APIs 2019-02-15 15:38:38 -07:00
srq.h
trace_cq.h IB/rdmavt: Add wc_flags and wc_immdata to cq entry trace 2019-01-18 13:48:19 -07:00
trace_mr.h
trace_qp.h
trace_rc.h
trace_rvt.h
trace_tx.h IB/{hfi1, qib, rdmavt}: Move send completion logic to rdmavt 2018-10-03 16:38:28 -06:00
trace.c
trace.h
vt.c RDMA: Handle ucontext allocations by IB/core 2019-02-22 14:11:37 -07:00
vt.h