linux_dsm_epyc7002/drivers/scsi/mpt3sas
Sreekanth Reddy cc41f11a21 scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug
A general protection fault kernel panic is observed when the user performs
a soft (ordered) HBA unplug operation while IOs are running on drives
connected to the HBA.

When the user performs an ordered HBA removal, the kernel calls the PCI
device's .remove() callback, where the driver flushes out all the
outstanding SCSI IO commands with the DID_NO_CONNECT host byte and also
unmaps the sg buffers allocated for these IO commands.

However, in the ordered HBA removal case (unlike a real HBA hot removal),
the HBA device is still alive, so the HBA hardware keeps performing DMA
operations to those buffers in system memory after they have been unmapped
during the flush of the outstanding SCSI IO commands, and this leads to a
kernel panic.

Don't flush out the outstanding IOs from the .remove() path in the case of
ordered removal, since the HBA will still be alive in this case and can
complete the outstanding IOs. Flush out the outstanding IOs only in the
case of a 'physical HBA hot unplug', where there won't be any communication
with the HBA.

During shutdown it is also possible for the HBA hardware to perform DMA
operations on outstanding IO buffers that the driver has already completed
with DID_NO_CONNECT from .shutdown(). So the same fix is applied in the
shutdown path as well.

It is safe to drop the outstanding commands when the HBA is inaccessible,
such as when a permanent PCI failure happens, when the HBA is in a
non-operational state, or when someone does a real HBA hot unplug
operation. Since the driver knows that the HBA is inaccessible in these
cases, it can drop the outstanding commands instead of waiting for SCSI
error recovery to kick in and clear them.
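
The rule described above can be illustrated with a minimal, self-contained
sketch in C. This is not the driver's code: the struct, its fields, and
every sketch_* name are hypothetical stand-ins chosen for illustration, and
the real driver tracks this state differently. It only models the decision
"flush outstanding IOs only when the HBA cannot be reached".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-adapter state (not the driver's struct). */
struct sketch_hba {
    bool device_present;  /* false after a physical hot unplug */
    bool pci_error;       /* true after a permanent PCI failure */
    bool operational;     /* false when the HBA is non-operational */
    int outstanding_ios;  /* commands the hardware may still DMA into */
    int flushed_ios;      /* commands the driver completed itself */
};

/* The HBA counts as inaccessible in any of the three cases listed above. */
static bool sketch_hba_inaccessible(const struct sketch_hba *hba)
{
    assert(hba != NULL);
    return !hba->device_present || hba->pci_error || !hba->operational;
}

/* Models completing every outstanding command with DID_NO_CONNECT. */
static void sketch_flush_running_cmds(struct sketch_hba *hba)
{
    hba->flushed_ios += hba->outstanding_ios;
    hba->outstanding_ios = 0;
}

/*
 * Models the .remove()/.shutdown() decision: flush only when the hardware
 * can no longer touch the buffers. On an ordered removal the HBA is still
 * alive, so the outstanding IOs are left for the hardware to complete.
 */
static void sketch_remove_or_shutdown(struct sketch_hba *hba)
{
    if (sketch_hba_inaccessible(hba))
        sketch_flush_running_cmds(hba);
}
```

With a present, healthy, operational adapter (the soft unplug case),
sketch_remove_or_shutdown() leaves outstanding_ios untouched. With
device_present set to false (the physical hot unplug case), all outstanding
commands are moved to flushed_ios.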

Link: https://lore.kernel.org/r/1585302763-23007-1-git-send-email-sreekanth.reddy@broadcom.com
Fixes: c666d3be99 ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload")
Cc: stable@vger.kernel.org #v4.14.174+
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-03-31 22:02:37 -04:00
mpi scsi: mpt3sas: Update MPI Headers to v02.00.57 2020-01-02 22:23:16 -05:00
Kconfig scsi: mpt3sas: Irq poll to avoid CPU hard lockups 2019-03-18 17:16:43 -04:00
Makefile
mpt3sas_base.c block, scsi: final compat_ioctl cleanup 2020-01-10 00:14:46 -05:00
mpt3sas_base.h scsi: mpt3sas: Update drive version to 33.100.00.00 2020-01-02 22:23:17 -05:00
mpt3sas_config.c scsi: mpt3sas: Print function name in which cmd timed out 2020-01-02 22:23:17 -05:00
mpt3sas_ctl.c scsi: mpt3sas: Print function name in which cmd timed out 2020-01-02 22:23:17 -05:00
mpt3sas_ctl.h scsi: mpt3sas: Reuse diag buffer allocated at load time 2019-09-30 22:32:47 -04:00
mpt3sas_debug.h
mpt3sas_scsih.c scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug 2020-03-31 22:02:37 -04:00
mpt3sas_transport.c scsi: mpt3sas: Optimize mpt3sas driver logging 2020-01-02 22:23:17 -05:00
mpt3sas_trigger_diag.c scsi: mpt3sas: Display message before releasing diag buffer 2019-09-30 22:32:46 -04:00
mpt3sas_trigger_diag.h
mpt3sas_warpdrive.c scsi: mpt3sas: Convert uses of pr_<level> with MPT3SAS_FMT to ioc_<level> 2018-10-10 22:00:43 -04:00