linux_dsm_epyc7002/drivers/infiniband/hw/hfi1
Mike Marciniszyn cc78076af1 IB/hfi1: Correct tid qp rcd to match verbs context
The qp priv rcd pointer doesn't match the context being used for verbs
causing issues when 9B and kdeth packets are processed by different
receive contexts and hence different CPUs.

When running on different CPUs the following panic can occur:

 WARNING: CPU: 3 PID: 2584 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
 list_del corruption. prev->next should be ffff9a7ac31f7a30, but was ffff9a7c3bc89230
 CPU: 3 PID: 2584 Comm: z_wr_iss Kdump: loaded Tainted: P           OE  ------------   3.10.0-862.2.3.el7_lustre.x86_64 #1
 Call Trace:
  <IRQ>  [<ffffffffb7b0d78e>] dump_stack+0x19/0x1b
  [<ffffffffb74916d8>] __warn+0xd8/0x100
  [<ffffffffb749175f>] warn_slowpath_fmt+0x5f/0x80
  [<ffffffffb7768671>] __list_del_entry+0xa1/0xd0
  [<ffffffffc0c7a945>] process_rcv_qp_work+0xb5/0x160 [hfi1]
  [<ffffffffc0c7bc2b>] handle_receive_interrupt_nodma_rtail+0x20b/0x2b0 [hfi1]
  [<ffffffffc0c70683>] receive_context_interrupt+0x23/0x40 [hfi1]
  [<ffffffffb7540a94>] __handle_irq_event_percpu+0x44/0x1c0
  [<ffffffffb7540c42>] handle_irq_event_percpu+0x32/0x80
  [<ffffffffb7540ccc>] handle_irq_event+0x3c/0x60
  [<ffffffffb7543a1f>] handle_edge_irq+0x7f/0x150
  [<ffffffffb742d504>] handle_irq+0xe4/0x1a0
  [<ffffffffb7b23f7d>] do_IRQ+0x4d/0xf0
  [<ffffffffb7b16362>] common_interrupt+0x162/0x162
  <EOI>  [<ffffffffb775a326>] ? memcpy+0x6/0x110
  [<ffffffffc109210d>] ? abd_copy_from_buf_off_cb+0x1d/0x30 [zfs]
  [<ffffffffc10920f0>] ? abd_copy_to_buf_off_cb+0x30/0x30 [zfs]
  [<ffffffffc1093257>] abd_iterate_func+0x97/0x120 [zfs]
  [<ffffffffc10934d9>] abd_copy_from_buf_off+0x39/0x60 [zfs]
  [<ffffffffc109b828>] arc_write_ready+0x178/0x300 [zfs]
  [<ffffffffb7b11032>] ? mutex_lock+0x12/0x2f
  [<ffffffffb7b11032>] ? mutex_lock+0x12/0x2f
  [<ffffffffc1164d05>] zio_ready+0x65/0x3d0 [zfs]
  [<ffffffffc04d725e>] ? tsd_get_by_thread+0x2e/0x50 [spl]
  [<ffffffffc04d1318>] ? taskq_member+0x18/0x30 [spl]
  [<ffffffffc115ef22>] zio_execute+0xa2/0x100 [zfs]
  [<ffffffffc04d1d2c>] taskq_thread+0x2ac/0x4f0 [spl]
  [<ffffffffb74cee80>] ? wake_up_state+0x20/0x20
  [<ffffffffc115ee80>] ? zio_taskq_member.isra.7.constprop.10+0x80/0x80 [zfs]
  [<ffffffffc04d1a80>] ? taskq_thread_spawn+0x60/0x60 [spl]
  [<ffffffffb74bae31>] kthread+0xd1/0xe0
  [<ffffffffb74bad60>] ? insert_kthread_work+0x40/0x40
  [<ffffffffb7b1f5f7>] ret_from_fork_nospec_begin+0x21/0x21
  [<ffffffffb74bad60>] ? insert_kthread_work+0x40/0x40

Fix by reading the map entry in the same manner as the hardware so that
the kdeth and verbs contexts match.

Cc: <stable@vger.kernel.org>
Fixes: 5190f052a3 ("IB/hfi1: Allow the driver to initialize QP priv struct")
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-11 17:06:45 -03:00
..
affinity.c mm: replace all open encodings for NUMA_NO_NODE 2019-03-05 21:07:14 -08:00
affinity.h IB/{hfi1, rdmavt, qib}: Implement CQ completion vector support 2018-05-09 15:53:30 -04:00
aspm.h IB/hfi1: Convert timers to use timer_setup() 2017-10-18 11:48:19 -04:00
chip_registers.h IB/hfi1: Add selected Rcv counters 2019-04-24 11:48:10 -03:00
chip.c IB/hfi1: Correct tid qp rcd to match verbs context 2019-06-11 17:06:45 -03:00
chip.h IB/hfi1: Correct tid qp rcd to match verbs context 2019-06-11 17:06:45 -03:00
common.h IB/hfi1: Remove reference to RHF.VCRCErr 2019-04-24 11:48:11 -03:00
debugfs.c IB/hfi1: Add debugfs to control expansion ROM write protect 2019-04-24 11:26:41 -03:00
debugfs.h infiniband: hfi1: drop crazy DEBUGFS_SEQ_FILE_CREATE() macro 2019-01-24 09:22:29 -07:00
device.c infiniband: utilize the new cdev_set_parent function 2017-03-21 06:44:33 +01:00
device.h
driver.c IB/hfi1: Remove reference to RHF.VCRCErr 2019-04-24 11:48:11 -03:00
efivar.c IB/hfi1: Check upper-case EFI variables 2017-02-19 09:18:37 -05:00
efivar.h
eprom.c IB/hfi1: Check eeprom config partition validity 2017-09-27 11:10:36 -04:00
eprom.h IB/hfi1: Add ability to read platform config from the EPROM 2016-10-02 08:42:20 -04:00
exp_rcv.c IB/hfi1: Remove WARN_ON when freeing expected receive groups 2019-04-03 15:27:30 -03:00
exp_rcv.h IB/hfi1: Cleanup of exp_rcv 2018-05-24 09:39:25 -06:00
fault.c IB/hfi1: Validate fault injection opcode user input 2019-06-11 17:06:37 -03:00
fault.h IB/hfi1: Rework fault injection machinery 2018-05-09 15:53:30 -04:00
file_ops.c IB/hfi1: Remove overly conservative VM_EXEC flag check 2019-01-21 14:20:08 -07:00
firmware.c IB/hfi1: Fix infinite loop in 8051 command error path 2018-01-05 13:34:55 -05:00
hfi.h hfi1: Convert hfi1_unit_table to XArray 2019-04-01 13:27:35 -03:00
init.c IB/hfi1: Fix WQ_MEM_RECLAIM warning 2019-05-06 12:57:45 -03:00
intr.c IB/hfi1: Allow MgmtAllowed on B2B setups 2017-11-13 15:53:56 -05:00
iowait.c IB/hfi1: Prioritize the sending of ACK packets 2019-02-05 18:07:44 -05:00
iowait.h IB/hfi1: Prioritize the sending of ACK packets 2019-02-05 18:07:44 -05:00
Kconfig treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
mad.c RDMA: Mark if create address handle is in a sleepable context 2018-12-19 16:17:19 -07:00
mad.h IB/hfi1: Convert PortXmitWait/PortVLXmitWait counters to flit times 2018-02-01 15:43:30 -07:00
Makefile IB/hfi1: OPFN interface 2019-01-31 11:36:05 -05:00
mmu_rb.c mm/mmu_notifier: use structure for invalidate_range_start/end callback 2018-12-28 12:11:50 -08:00
mmu_rb.h IB/hfi1: Don't remove RB entry when not needed. 2017-06-27 16:56:33 -04:00
msix.c IB/hfi1: Rework the IRQ API to be more flexible 2018-09-01 08:13:38 -04:00
msix.h IB/hfi1: Make the MSIx resource allocation a bit more flexible 2018-09-01 08:13:38 -04:00
opa_compat.h IB/hfi1: Document phys port state bits not used in IB 2017-08-22 14:22:37 -04:00
opfn.c IB/hfi1: Add TID RDMA retry timer 2019-02-05 18:07:43 -05:00
opfn.h IB/hfi1: Make opfn.h self sufficient 2019-04-24 11:31:49 -03:00
pcie.c First merge window pull request 2018-10-26 07:38:19 -07:00
pio_copy.c IB/hfi1: Optimize pio_buf and send_context structs 2016-11-15 16:37:27 -05:00
pio.c drivers: Remove explicit invocations of mmiowb() 2019-04-08 12:01:02 +01:00
pio.h IB/hfi1: Reduce lock contention on iowait_lock for sdma and pio 2018-12-06 20:15:36 -07:00
platform.c IB/{hfi1, rdmavt}: Fix memory leak in hfi1_alloc_devdata() upon failure 2018-05-03 15:24:48 -04:00
platform.h IB/hfi1: Define platform_config_table_limits once 2016-12-11 15:29:42 -05:00
qp.c Merge HFI1 updates into k.o/for-next 2019-04-03 15:28:05 -03:00
qp.h IB/hfi1: Add the dual leg code 2019-02-05 18:07:44 -05:00
qsfp.c IB/{hfi1, rdmavt}: Fix memory leak in hfi1_alloc_devdata() upon failure 2018-05-03 15:24:48 -04:00
qsfp.h
rc.c IB/{rdmavt, qib, hfi1}: Use new routine to release reference counts 2019-04-24 11:31:49 -03:00
rc.h IB/hfi1: Delay the release of destination mr for TID RDMA WRITE DATA 2019-04-03 15:27:30 -03:00
ruc.c IB/{rdmavt, hfi1): Miscellaneous comment fixes 2019-04-24 11:31:48 -03:00
sdma_txreq.h IB/hfi1: Prioritize the sending of ACK packets 2019-02-05 18:07:44 -05:00
sdma.c IB/hfi1: Prioritize the sending of ACK packets 2019-02-05 18:07:44 -05:00
sdma.h IB/hfi1: Reduce lock contention on iowait_lock for sdma and pio 2018-12-06 20:15:36 -07:00
sysfs.c RDMA: Introduce and use rdma_device_to_ibdev() 2019-01-14 13:12:03 -07:00
tid_rdma.c IB/hfi1: Correct tid qp rcd to match verbs context 2019-06-11 17:06:45 -03:00
tid_rdma.h IB/hfi1: Unify the software PSN check for TID RDMA READ/WRITE 2019-04-03 15:27:30 -03:00
trace_ctxts.h treewide: remove large struct-pass-by-value from tracepoint arguments 2018-03-28 22:55:18 +02:00
trace_dbg.h IB/hfi1: Fix two format strings 2019-03-28 11:03:49 -03:00
trace_ibhdrs.h IB/hfi1: Add static trace for TID RDMA WRITE protocol 2019-02-05 18:07:44 -05:00
trace_iowait.h IB/hfi1: Add static trace for iowait 2018-09-30 19:21:12 -06:00
trace_misc.h IB/hfi1: Add traces for TID operations 2017-06-27 16:58:13 -04:00
trace_mmu.h IB/hif1: Remove static tracing from SDMA hot path 2017-08-28 19:12:27 -04:00
trace_rc.h IB/hfi1: Add static trace for TID RDMA READ protocol 2019-02-05 17:53:56 -05:00
trace_rx.h IB/hfi1: Add static trace for OPFN 2019-01-31 11:37:40 -05:00
trace_tid.h IB/hfi1: Unify the software PSN check for TID RDMA READ/WRITE 2019-04-03 15:27:30 -03:00
trace_tx.h IB/hfi1: Add static trace for TID RDMA WRITE protocol 2019-02-05 18:07:44 -05:00
trace.c IB/hfi1: Add static trace for TID RDMA WRITE protocol 2019-02-05 18:07:44 -05:00
trace.h IB/hfi1: Add static trace for OPFN 2019-01-31 11:37:40 -05:00
uc.c IB/hfi1: OPFN support discovery 2019-01-31 11:36:04 -05:00
ud.c Linux 5.0-rc5 2019-02-04 14:53:42 -07:00
user_exp_rcv.c IB/hfi1: Validate page aligned for a given virtual address 2019-05-29 12:56:05 -03:00
user_exp_rcv.h IB/hfi1: TID RDMA RcvArray programming and TID allocation 2019-02-05 17:53:55 -05:00
user_pages.c IB/hfi1: use the new FOLL_LONGTERM flag to get_user_pages_fast() 2019-05-14 09:47:46 -07:00
user_sdma.c IB/hfi1: Close PSM sdma_progress sleep window 2019-06-11 17:06:45 -03:00
user_sdma.h IB/hfi1: Close PSM sdma_progress sleep window 2019-06-11 17:06:45 -03:00
verbs_txreq.c IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values 2018-06-26 14:35:55 -06:00
verbs_txreq.h IB/hfi1: Prioritize the sending of ACK packets 2019-02-05 18:07:44 -05:00
verbs.c IB/{qib, hfi1, rdmavt}: Correct ibv_devinfo max_mr value 2019-05-29 12:56:05 -03:00
verbs.h IB/hfi1: Add running average for adaptive pio 2019-03-26 09:33:21 -03:00
vnic_main.c 5.2 Merge Window pull request 2019-05-09 09:02:46 -07:00
vnic_sdma.c IB/hfi1: Prioritize the sending of ACK packets 2019-02-05 18:07:44 -05:00
vnic.h IB/hfi1: Add support to receive 16B bypass packets 2017-08-22 14:22:37 -04:00