Commit Graph

949184 Commits

Author SHA1 Message Date
Brian King
4b29cb6197 scsi: ibmvfc: Avoid link down on FS9100 canister reboot
When a canister on a FS9100, or similar storage, running in NPIV mode, is
rebooted, its WWPNs will fail over to another canister. When this occurs,
we see a WWPN going away from the fabric at one N-Port ID, and, a short
time later, the same WWPN appears at a different N-Port ID.  When the
canister is fully operational again, the WWPNs fail back to the original
canister. If there is any I/O outstanding to the target when this occurs,
it will result in the implicit logout the ibmvfc driver issues before
removing the rport to fail. When the WWPN then shows up at a different
N-Port ID, and we issue a PLOGI to it, the VIOS will see that it still has
a login for this WWPN at the old N-Port ID, which results in the VIOS
simulating a link down / link up sequence to the client, in order to get
the VIOS and client LPAR in sync.

The patch below improves the way we handle this scenario so as to avoid the
link bounce, which affects all targets under the virtual host adapter. The
change is to utilize the Move Login MAD, which will work even when I/O is
outstanding to the target. The change only alters the target state machine
for the case where the implicit logout fails prior to deleting the rport.
If this implicit logout fails, we defer deleting the ibmvfc_target object
after calling fc_remote_port_delete. This enables us to later retry the
implicit logout after terminate_rport_io occurs, or to issue the Move Login
request if a WWPN shows up at a new N-Port ID prior to this occurring.

This has been tested by IBM's storage interoperability team on a FS9100,
forcing the failover to occur. With debug tracing enabled in the ibmvfc
driver, we confirmed the move login was sent in this scenario and confirmed
the link bounce no longer occurred.

[mkp: fix checkpatch warnings]

Link: https://lore.kernel.org/r/1599859706-8505-1-git-send-email-brking@linux.vnet.ibm.com
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 20:48:14 -04:00
Damien Le Moal
46c9d608f9 scsi: core: Update additional sense codes list
Add missing Additional Sense Codes listed in
http://www.t10.org/lists/asc-num.txt.

Link: https://lore.kernel.org/r/20200910074843.217661-3-damien.lemoal@wdc.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 20:28:06 -04:00
Damien Le Moal
342c81eeaa scsi: core: Clean up scsi_noretry_cmd()
No need for else after return.

Link: https://lore.kernel.org/r/20200910074843.217661-2-damien.lemoal@wdc.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 20:28:05 -04:00
Xiongfeng Wang
6d70cb3434 scsi: target: tcmu: Add missing newline when printing parameters
The tcmu 'global_max_data_area_mb' parameter in sysfs is missing a
newline. Add it.

Link: https://lore.kernel.org/r/1599132573-33818-1-git-send-email-wangxiongfeng2@huawei.com
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 20:19:13 -04:00
Daejun Park
782e2efb74 scsi: ufs: Fix NOP OUT timeout value
Boot occasionally fails with some Samsung low-power UFS devices. The reason
is that these devices have a little bit higher latency for NOP OUT
responses. This causes boot to fail because the NOP OUT command is issued
during initialization to check whether the device transport protocol is
ready or not. Increase NOP_OUT_TIMEOUT value from 30 to 50ms.

Link: https://lore.kernel.org/r/231786897.01599016081767.JavaMail.epsvc@epcpadp2
Acked-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 20:14:01 -04:00
Tomas Henzl
3d49f7426e scsi: mpt3sas: A small correction in _base_process_reply_queue
There is no need to compute modulo. A simple comparison is good enough.

Link: https://lore.kernel.org/r/20200911180057.14633-1-thenzl@redhat.com
Acked-by: sreekanth reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 18:37:00 -04:00
Tomas Henzl
45181eab8b scsi: mpt3sas: Fix sync irqs
_base_process_reply_queue() called from _base_interrupt() may schedule a
new irq poll. Fix this by calling synchronize_irq() first.

Also ensure that enable_irq() is called only when necessary to avoid
"Unbalanced enable for IRQ..." errors.

Link: https://lore.kernel.org/r/20200910142126.8147-1-thenzl@redhat.com
Fixes: 320e77acb3 ("scsi: mpt3sas: Irq poll to avoid CPU hard lockups")
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 18:36:50 -04:00
Sreekanth Reddy
f38c43a0e9 scsi: mpt3sas: Detect tampered Aero and Sea adapters
The driver will throw an error message when a tampered type controller
is detected. The intent is to avoid interacting with any firmware
which is not secured/signed by Broadcom. Any tampering on firmware
component will be detected by hardware and it will be communicated to
the driver to avoid any further interaction with that component.

[mkp: switched back to dev_err]

Link: https://lore.kernel.org/r/20200814130426.2741171-1-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 18:23:18 -04:00
Jason Yan
62aa501dc9 scsi: megaraid: Make smp_affinity_enable static
This addresses the following sparse warning:

drivers/scsi/megaraid/megaraid_sas_base.c:80:5: warning: symbol
'smp_affinity_enable' was not declared. Should it be static?

Link: https://lore.kernel.org/r/20200915083948.2826598-1-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 18:07:31 -04:00
Jing Xiangfeng
1c370903d1 scsi: target: Remove redundant assignment to variable 'ret'
The variable ret has been initialized with a value '0'. The assignment in
switch-case is redundant. Remove it.

Link: https://lore.kernel.org/r/20200914023207.113792-1-jingxiangfeng@huawei.com
Reviewed-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 18:06:00 -04:00
Julian Wiedmann
d251193d17 scsi: zfcp: Clarify access to erp_action in zfcp_fsf_req_complete()
While reviewing commit 936e6b85da ("scsi: zfcp: Fix panic on ERP timeout
for previously dismissed ERP action"), I stumbled over
zfcp_fsf_req_complete() and wondered whether it has similar issues wrt
concurrent modification of req->erp_action by
zfcp_erp_strategy_check_fsfreq().

But a closer look shows that both its two callers [zfcp_fsf_reqid_check(),
zfcp_fsf_req_dismiss_all()] remove the request from the adapter's req_list
under the req_list's lock.  Hence we can trust that if
zfcp_erp_strategy_check_fsfreq() concurrently looks up the corresponding
req_id, it won't find this request and is thus unable to modify it while
it's being processed by zfcp_fsf_req_complete().

Add a code comment that hopefully makes this easier for future readers, and
condense the two accesses to ->erp_action that made me trip over this code
path in the first place.

Link: https://lore.kernel.org/r/c500eac301fcbba5af942bbd200f2d6b14e46994.1599765652.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 18:01:58 -04:00
Julian Wiedmann
addf137296 scsi: zfcp: Use list_first_entry_or_null() in zfcp_erp_thread()
Use the right helper to avoid poking around in the list's internals.

Link: https://lore.kernel.org/r/ed669555c73aab95b29444c10066f492c0c43391.1599765652.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 18:01:57 -04:00
YueHaibing
3f4fee002b scsi: aic94xx: Remove unused inline function
There is no caller in tree. Remove it.

Link: https://lore.kernel.org/r/20200909135711.35728-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 17:54:09 -04:00
YueHaibing
a9e81c2922 scsi: libfc: Fix passing zero to 'PTR_ERR' warning
drivers/scsi/libfc/fc_disc.c:304
 fc_disc_error() warn: passing zero to 'PTR_ERR'

fp may be NULL in fc_disc_error(), use PTR_ERR_OR_ZERO to handle this.

Link: https://lore.kernel.org/r/20200909135432.36772-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 17:51:39 -04:00
Jason Yan
bff8b14b09 scsi: fnic: Remove unneeded semicolon
This addresses the following coccinelle warning:

drivers/scsi/fnic/fnic_main.c:446:2-3: Unneeded semicolon

Link: https://lore.kernel.org/r/20200911091057.2938685-1-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 17:35:00 -04:00
Jason Yan
94e476520e scsi: nsp32: Remove unneeded semicolon
This addresses the following coccinelle warning:

drivers/scsi/nsp32.c:1250:4-5: Unneeded semicolon
drivers/scsi/nsp32.c:1842:2-3: Unneeded semicolon

Link: https://lore.kernel.org/r/20200911091049.2938158-1-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 17:34:18 -04:00
Jason Yan
8d4089cdc3 scsi: sym53c8xx_2: Remove unneeded semicolon
This addresses the following coccinelle warning:

drivers/scsi/sym53c8xx_2/sym_fw.c:372:3-4: Unneeded semicolon
drivers/scsi/sym53c8xx_2/sym_fw.c:480:3-4: Unneeded semicolon
drivers/scsi/sym53c8xx_2/sym_fw.c:536:2-3: Unneeded semicolon

Link: https://lore.kernel.org/r/20200911091031.2937834-1-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 17:33:16 -04:00
Jason Yan
34eb5ccf35 scsi: qla2xxx: Remove unneeded variable 'rval'
This addresses the following coccinelle warning:

drivers/scsi/qla2xxx/qla_init.c:7112:5-9: Unneeded variable: "rval".
Return "QLA_SUCCESS" on line 7115

Link: https://lore.kernel.org/r/20200911091021.2937708-1-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 17:32:01 -04:00
Adrian Hunter
247f994459 scsi: ufs-pci: Add LTR support for Intel controllers
Intel host controllers support the setting of latency tolerance.
Accordingly, implement the PM QoS ->set_latency_tolerance() callback. The
raw register values are also exposed via debugfs.

Link: https://lore.kernel.org/r/20200827072030.24655-1-adrian.hunter@intel.com
Fixes: 8c09d75276 ("scsi: ufshdc-pci: Add Intel PCI IDs for EHL")
Fixes: 1ab27c9cf8 ("ufs: Add support for clock gating")
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Acked-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 16:11:47 -04:00
Martin K. Petersen
02f7415054 Merge branch '5.9/scsi-fixes' into 5.10/scsi-ufs
Resolve UFS discrepancies between fixes and queue.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-15 11:36:40 -04:00
Ye Bin
2de7649cff scsi: lpfc: Remove set but not used 'qp'
This addresses the following gcc warning with "make W=1":

not used [-Wunused-but-set-variable]
  struct lpfc_sli4_hdw_queue *qp;
                                ^

Link: https://lore.kernel.org/r/20200909082716.37787-1-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
drivers/scsi/lpfc/lpfc_debugfs.c: In function ‘lpfc_debugfs_hdwqstat_data’:
drivers/scsi/lpfc/lpfc_debugfs.c:1699:30: warning: variable ‘qp’ set but
2020-09-09 22:42:39 -04:00
Ye Bin
8b02fc756a scsi: gdth: Remove set but used 'cmd_index'
This addresses the following gcc warning with "make W=1":

drivers/scsi/gdth.c: In function ‘gdth_async_event’:
drivers/scsi/gdth.c:3010:9: warning: variable ‘cmd_index’ set but not
used [-Wunused-but-set-variable]
     int cmd_index;

Link: https://lore.kernel.org/r/20200909082627.101984-1-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:41:50 -04:00
Ye Bin
27216a9d85 scsi: pmcraid: Remove set but not used 'res'
This addresses the following gcc warning with "make W=1":

 drivers/scsi/pmcraid.c: In function ‘pmcraid_abort_cmd’:
 drivers/scsi/pmcraid.c:2863:33: warning: variable ‘res’ set but not
 used [-Wunused-but-set-variable]
   struct pmcraid_resource_entry *res;
                                    ^

Link: https://lore.kernel.org/r/20200909082627.101984-2-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:40:41 -04:00
Jason Yan
c8d67fbb60 scsi: qla1280: Remove set but not used variable in qla1280_status_entry()
This addresses the following gcc warning with "make W=1":

drivers/scsi/qla1280.c: In function ‘qla1280_status_entry’:
drivers/scsi/qla1280.c:3607:28: warning: variable ‘lun’ set but not used
[-Wunused-but-set-variable]
 3607 |  unsigned int bus, target, lun;
      |                            ^~~
drivers/scsi/qla1280.c:3607:20: warning: variable ‘target’ set but not
used [-Wunused-but-set-variable]
 3607 |  unsigned int bus, target, lun;
      |                    ^~~~~~
drivers/scsi/qla1280.c:3607:15: warning: variable ‘bus’ set but not used
[-Wunused-but-set-variable]
 3607 |  unsigned int bus, target, lun;
      |               ^~~

Link: https://lore.kernel.org/r/20200907074518.2326360-5-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:37:48 -04:00
Jason Yan
bf70bf28bf scsi: qla1280: Remove set but not used variable in qla1280_mailbox_command()
This addresses the following gcc warning with "make W=1":

drivers/scsi/qla1280.c: In function ‘qla1280_mailbox_command’:
drivers/scsi/qla1280.c:2430:11: warning: variable ‘data’ set but not
used [-Wunused-but-set-variable]
 2430 |  uint16_t data;
      |           ^~~~

Link: https://lore.kernel.org/r/20200907074518.2326360-4-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:37:47 -04:00
Jason Yan
9b0f9e59bc scsi: qla1280: Remove set but not used variable in qla1280_nvram_config()
This addresses the following gcc warning with "make W=1":

drivers/scsi/qla1280.c: In function ‘qla1280_nvram_config’:
drivers/scsi/qla1280.c:2188:36: warning: variable ‘ddma_conf’ set but
not used [-Wunused-but-set-variable]
 2188 |   uint16_t hwrev, cfg1, cdma_conf, ddma_conf;
      |                                    ^~~~~~~~~

Link: https://lore.kernel.org/r/20200907074518.2326360-3-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:37:47 -04:00
Jason Yan
3eedb4202d scsi: qla1280: Remove set but not used variable in qla1280_done()
This addresses the following gcc warning with "make W=1":

drivers/scsi/qla1280.c: In function ‘qla1280_done’:
drivers/scsi/qla1280.c:1244:19: warning: variable ‘lun’ set but not used
[-Wunused-but-set-variable]
 1244 |  int bus, target, lun;
      |                   ^~~

Link: https://lore.kernel.org/r/20200907074518.2326360-2-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:37:46 -04:00
Alim Akhtar
09fd5f0ddf scsi: ufs: Fix 'unmet direct dependencies' config warning
With !CONFIG_OF and SCSI_UFS_EXYNOS selected, the below warning is given:

WARNING: unmet direct dependencies detected for PHY_SAMSUNG_UFS
  Depends on [n]: OF [=n] && (ARCH_EXYNOS || COMPILE_TEST [=y])
  Selected by [y]:
  - SCSI_UFS_EXYNOS [=y] && SCSI_LOWLEVEL [=y] && SCSI [=y] && SCSI_UFSHCD_PLATFORM [=y] && (ARCH_EXYNOS || COMPILE_TEST [=y])

Fix it by removing PHY_SAMSUNG_UFS dependency.

Link: https://lore.kernel.org/r/20200721172021.28922-1-alim.akhtar@samsung.com
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:26:29 -04:00
Jing Xiangfeng
5e48a084f4 scsi: ibmvfc: Fix error return in ibmvfc_probe()
Fix to return error code PTR_ERR() from the error handling case instead of
0.

Link: https://lore.kernel.org/r/20200907083949.154251-1-jingxiangfeng@huawei.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:11:04 -04:00
Stanley Chu
71957b6112 scsi: ufs: ufs-mediatek: Fix build warnings with make W=1
Fix build warnings with make W=1 as below,

1.
>> drivers/scsi/ufs/ufs-mediatek.c:116:22: warning: format '%d' expects
>> argument of type 'int', but argument 4 has type 'long int'

2.
  CC [M]  drivers/scsi/ufs/ufs-mediatek.o
../drivers/scsi/ufs/ufs-mediatek.c:749: error: Cannot parse struct or union!

/** is used specifically with kernel-doc tool.
As a quick fix by removing dubious /** in the comment block of
struct ufs_hba_variant_ops ufs_hba_mtk_vops.

Link: https://lore.kernel.org/r/20200910013756.11385-1-stanley.chu@mediatek.com
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:08:46 -04:00
Daniel Wagner
31a3271ff1 scsi: qla2xxx: Handle incorrect entry_type entries
It was observed on an ISP8324 16Gb HBA with fw=8.08.203 (d0d5) in a
PowerPC64 machine that pkt->entry_type was MBX_IOCB_TYPE/0x39 with an
sp->type SRB_SCSI_CMD which is invalid and should not be possible.

Reading the entry_type from the crash dump shows the expected value of
STATUS_TYPE/0x03 but the call trace shows that qla24xx_mbx_iocb_entry() is
used.

Add a check to verify for consistency and reset the HBA if an invalid state
is reached. Obviously, this is only a workaround until the real problem is
solved.

Link: https://lore.kernel.org/r/20200908081516.8561-5-dwagner@suse.de
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:01:44 -04:00
Daniel Wagner
7d88d5dff9 scsi: qla2xxx: Log calling function name in qla2x00_get_sp_from_handle()
Commit 7c3df1320e ("[SCSI] qla2xxx: Code changes to support new dynamic
logging infrastructure.") removed the use of the func argument. Let's add
it back.

Link: https://lore.kernel.org/r/20200908081516.8561-4-dwagner@suse.de
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:01:44 -04:00
Daniel Wagner
622299f16f scsi: qla2xxx: Simplify return value logic in qla2x00_get_sp_from_handle()
Refactor qla2x00_get_sp_from_handle() to avoid the unnecessary goto if
early returns are used. With this we can also avoid preinitialzing the sp
pointer.

Link: https://lore.kernel.org/r/20200908081516.8561-3-dwagner@suse.de
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:01:43 -04:00
Daniel Wagner
c0014f9421 scsi: qla2xxx: Warn if done() or free() are called on an already freed srb
Emit a warning when ->done or ->free are called on an already freed
srb. There is a hidden use-after-free bug in the driver which corrupts
the srb memory pool which originates from the cleanup callbacks.

An extensive search didn't bring any lights on the real problem. The
initial fix was to set both pointers to NULL and try to catch invalid
accesses. But instead the memory corruption was gone and the driver
didn't crash. Since not all calling places check for NULL pointer, add
explicitly default handlers. With this we workaround the memory
corruption and add a debug help.

Link: https://lore.kernel.org/r/20200908081516.8561-2-dwagner@suse.de
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 22:01:42 -04:00
Dan Carpenter
244359c99f scsi: libsas: Fix error path in sas_notify_lldd_dev_found()
In sas_notify_lldd_dev_found(), if we can't allocate the necessary
resources, then it seems like the wrong thing to mark the device as found
and to increment the reference count.  None of the callers ever drop the
reference in that situation.

[mkp: tweaked commit desc based on feedback from John]

Link: https://lore.kernel.org/r/20200905125836.GF183976@mwanda
Fixes: 735f7d2fed ("[SCSI] libsas: fix domain_device leak")
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-09 21:25:02 -04:00
Saurav Kashyap
988100a7de scsi: qedf: Retry qed->probe during recovery
During recovery due to FCoE fn ramrod failure we wait for 2 sec and then
call qed->probe. If probe fails then retry max 10 times.

Link: https://lore.kernel.org/r/20200907121443.5150-8-jhasan@marvell.com
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 23:14:19 -04:00
Saurav Kashyap
55e049910e scsi: qedf: Add schedule_hw_err_handler callback for fan failure
On fan failure, disable the PCI function and initiate recovery for ramrod
failure.

Link: https://lore.kernel.org/r/20200907121443.5150-7-jhasan@marvell.com
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 23:14:19 -04:00
Saurav Kashyap
10aff62fab scsi: qedf: Return SUCCESS if stale rport is encountered
If SUCCESS is not returned, error handling will escalate. Return SUCCESS
similar to other conditions in this function.

Link: https://lore.kernel.org/r/20200907121443.5150-6-jhasan@marvell.com
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 23:14:18 -04:00
Javed Hasan
41715c6292 scsi: qedf: FDMI attributes correction
Correction in the FDMI attributes required for RHBA and RPA registration.

Link: https://lore.kernel.org/r/20200907121443.5150-5-jhasan@marvell.com
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 23:14:17 -04:00
Javed Hasan
f78f812626 scsi: qedf: Fix for the session’s E_D_TOV value
Firmware expects E_D_TOV field in connection offload parameters as “msec”.
Earlier incorrect value (100ms), was leading to abort from driver in the
case when data frames for read take more than 100ms from target side,
resulting in firmware reporting E_D_TOV expiration.

Link: https://lore.kernel.org/r/20200907121443.5150-4-jhasan@marvell.com
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 23:14:16 -04:00
Saurav Kashyap
31fc82d7fb scsi: qedf: Correct the comment in qedf_initiate_els
Correct the misleading comment in qedf_initiate_els().

Link: https://lore.kernel.org/r/20200907121443.5150-3-jhasan@marvell.com
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 23:14:16 -04:00
Javed Hasan
066664645d scsi: qedf: Change the debug parameter permission to read & write
Change the debug parameter permission to read & write.  Gives flexibility
to change the debug verbosity dynamically.

Link: https://lore.kernel.org/r/20200907121443.5150-2-jhasan@marvell.com
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 23:14:15 -04:00
Stanley Chu
e0f9f86262 scsi: ufs: ufs-mediatek: Add host reset mechanism
Add host reset mechanism to try to recover host-side errors during recovery
flow.

Link: https://lore.kernel.org/r/20200908064507.30774-5-stanley.chu@mediatek.com
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 22:49:55 -04:00
Stanley Chu
9a9ddb8a3a scsi: ufs: ufs-mediatek: Fix flag of unipro low-power mode
Forcibly leave UniPro low-power mode if UIC commands failed.  This makes
hba_enable_delay_us as correct (default) value for re-enabling the host.

At the same time, change type of parameter "lpm" in function
ufs_mtk_unipro_set_pm() to "bool".

Link: https://lore.kernel.org/r/20200908064507.30774-4-stanley.chu@mediatek.com
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 22:49:54 -04:00
Stanley Chu
a3e40b80dc scsi: ufs: ufs-mediatek: Fix HOST_PA_TACTIVATE quirk
Simply add HOST_PA_TACTIVATE quirk back since it was incorrectly removed
before.

Link: https://lore.kernel.org/r/20200908064507.30774-3-stanley.chu@mediatek.com
Fixes: 47d054580a ("scsi: ufs-mediatek: fix HOST_PA_TACTIVATE quirk for Samsung UFS Devices")
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 22:49:54 -04:00
Stanley Chu
30a90782c1 scsi: ufs: ufs-mediatek: Eliminate error message for unbound mphy
Some MediaTek platforms does not have to bind MPHY so users shall not see
any unnecessary logs. Simply remove logs for this case.

Link: https://lore.kernel.org/r/20200908064507.30774-2-stanley.chu@mediatek.com
Fixes: fc4983018f ("scsi: ufs-mediatek: Allow unbound mphy")
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 22:49:53 -04:00
Manish Rangankar
96a766a789 scsi: qedi: Add support for handling PCIe errors
The error recovery is handled by management firmware (MFW) with the help of
qed/qedi drivers. Upon detecting errors, driver informs MFW about this
event which in turn starts a recovery process. MFW sends ERROR_RECOVERY
notification to the driver which performs the required cleanup/recovery
from the driver side.

Link: https://lore.kernel.org/r/20200908095657.26821-9-mrangankar@marvell.com
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 22:40:25 -04:00
Manish Rangankar
f4ba4e55db scsi: qedi: Add firmware error recovery invocation support
Add support to initiate MFW process recovery for all the devices if storage
function receives the event first.

Also added fix for kernel test robot warning,

>> drivers/scsi/qedi/qedi_main.c:1119:6: warning: no previous prototype
>> for 'qedi_schedule_hw_err_handler' [-Wmissing-prototypes]

Link: https://lore.kernel.org/r/20200908095657.26821-8-mrangankar@marvell.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 22:40:25 -04:00
Nilesh Javali
4118879be3 scsi: qedi: Mark all connections for recovery on link down event
For short time cable pulls, the in-flight I/O to the firmware is never
cleaned up, resulting in the behaviour of stale I/O completion causing
list_del corruption and soft lockup of the system.

On link down event, mark all the connections for recovery, causing cleanup
of all the in-flight I/O immediately.

Link: https://lore.kernel.org/r/20200908095657.26821-7-mrangankar@marvell.com
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 22:40:23 -04:00
Manish Rangankar
5a2e69af16 scsi: qedi: Use snprintf instead of sprintf
Use snprintf to limit max number of bytes to the buffer.

Link: https://lore.kernel.org/r/20200908095657.26821-6-mrangankar@marvell.com
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-08 22:40:23 -04:00