linux_dsm_epyc7002/drivers/infiniband/hw
Michael J. Ruhl 82a9792656 IB/hfi1: Re-order IRQ cleanup to address driver cleanup race
The pci_request_irq() interface always adds the IRQF_SHARED bit to
all IRQ requests.

When the kernel is built with the CONFIG_DEBUG_SHIRQ config flag and
the IRQF_SHARED bit is set, __free_irq() makes a call to the IRQ
handler. This call tests for a race between the IRQ cleanup and an
interrupt that arrives during the cleanup. The HFI driver should be
able to handle this race, but does not.

This race can cause traces that start with this footprint:

BUG: unable to handle kernel NULL pointer dereference at   (null)
Call Trace:
 <hfi1 irq handler>
 ...
 __free_irq+0x1b3/0x2d0
 free_irq+0x35/0x70
 pci_free_irq+0x1c/0x30
 clean_up_interrupts+0x53/0xf0 [hfi1]
 hfi1_start_cleanup+0x122/0x190 [hfi1]
 postinit_cleanup+0x1d/0x280 [hfi1]
 remove_one+0x233/0x250 [hfi1]
 pci_device_remove+0x39/0xc0

Export the IRQ cleanup function so it can be called from other modules.

Using the exported cleanup function:

  Re-order the driver cleanup code to clean up IRQ resources before
  other resources, eliminating the race.

  Re-order error path for init so that the race does not occur.
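
The ordering issue can be illustrated with a small userspace model. This
is only a sketch under stated assumptions, not the hfi1 code: struct
sim_dev and every sim_*() function are hypothetical names, and
sim_free_irq() merely imitates the extra handler call that __free_irq()
makes for shared IRQs under CONFIG_DEBUG_SHIRQ. The handler counts unsafe
runs instead of dereferencing a NULL pointer, so both orderings can be
run safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-device state; not the hfi1 structures. */
struct sim_dev {
	int *regs;          /* resource the handler dereferences */
	bool irq_requested; /* true while the handler is registered */
	int unsafe_calls;   /* handler runs that found regs == NULL */
};

/* Simulated handler: counts, rather than crashes on, a run without resources. */
static void sim_irq_handler(struct sim_dev *dd)
{
	if (!dd->regs) {
		dd->unsafe_calls++; /* a real handler would oops here */
		return;
	}
	dd->regs[0]++;
}

/* Imitates the extra handler call made while freeing a shared IRQ. */
static void sim_free_irq(struct sim_dev *dd)
{
	if (!dd->irq_requested)
		return;
	dd->irq_requested = false;
	sim_irq_handler(dd);
}

static void sim_free_resources(struct sim_dev *dd)
{
	free(dd->regs);
	dd->regs = NULL;
}

static int sim_init(struct sim_dev *dd)
{
	dd->regs = calloc(1, sizeof(*dd->regs));
	if (!dd->regs)
		return -1;
	dd->irq_requested = true;
	dd->unsafe_calls = 0;
	return 0;
}

/* Assumed racy order: other resources first, IRQ last. */
static int sim_cleanup_old(struct sim_dev *dd)
{
	sim_free_resources(dd);
	sim_free_irq(dd);
	return dd->unsafe_calls;
}

/* Re-ordered cleanup: IRQ first, then the resources the handler uses. */
static int sim_cleanup_new(struct sim_dev *dd)
{
	sim_free_irq(dd);
	assert(!dd->irq_requested); /* handler can no longer run */
	sim_free_resources(dd);
	return dd->unsafe_calls;
}
```

After sim_init(), sim_cleanup_old() reports one unsafe handler run, while
sim_cleanup_new() reports none, because the final handler call happens
while regs is still valid.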

Reduce the severity of the spurious error message for SDMA IRQs to info.

Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Reviewed-by: Patel Jay P <jay.p.patel@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-01 15:24:32 -07:00
bnxt_re RDMA/bnxt_re: Use common error handling code in bnxt_qplib_alloc_dpi_tbl() 2018-02-01 15:24:31 -07:00
cxgb3 Updates for 4.15 kernel merge window 2017-11-15 14:54:53 -08:00
cxgb4 iw_cxgb4: Change error/warn prints to pr_debug 2017-12-29 11:09:23 -07:00
hfi1 IB/hfi1: Re-order IRQ cleanup to address driver cleanup race 2018-02-01 15:24:32 -07:00
hns RDMA/hns: Fix misplaced call to hns_roce_cleanup_hem_table 2018-02-01 15:24:32 -07:00
i40iw i40iw: Free IEQ resources 2018-01-16 20:38:18 -07:00
mlx4 RDMA: Move enum ib_cq_creation_flags to uapi headers 2018-01-29 12:58:34 -07:00
mlx5 Linux 4.15 2018-01-30 09:30:00 -07:00
mthca IB/mthca: remove mthca_user.h 2018-01-28 14:07:16 -07:00
nes nes: Change accelerated flag to bool 2017-12-22 13:33:30 -07:00
ocrdma IB/ocrdma: Use zeroing memory allocator than allocator/memset 2018-01-02 11:20:13 -07:00
qedr RDMA/qedr: lower print level of flushed CQEs 2018-01-25 10:58:36 -05:00
qib IB/qib: remove qib_keys.c 2018-01-28 14:07:16 -07:00
usnic drivers: infiniband: remove duplicate includes 2017-12-22 09:39:35 -07:00
vmw_pvrdma RDMA/vmw_pvrdma: Use zeroing memory allocator than allocator/memset 2018-01-02 11:20:13 -07:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00