linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-24 22:16:43 +07:00

Author	SHA1	Message	Date
James Smart	df3fe76658	scsi: lpfc: add RDF registration and Link Integrity FPIN logging This patch modifies lpfc to register for Link Integrity events via the use of an RDF ELS and to perform Link Integrity FPIN logging. Specifically, the driver was modified to: - Format and issue the RDF ELS immediately following SCR registration. This registers the ability of the driver to receive FPIN ELS. - Adds decoding of the FPIN els into the received descriptors, with logging of the Link Integrity event information. After decoding, the ELS is delivered to the scsi fc transport to be delivered to any user-space applications. - To aid in logging, simple helpers were added to create enum to name string lookup functions that utilize the initialization helpers from the fc_els.h header. - Note: base header definitions for the ELS's don't populate the descriptor payloads. As such, lpfc creates it's own version of the structures, using the base definitions (mostly headers) and additionally declaring the descriptors that will complete the population of the ELS. Link: https://lore.kernel.org/r/20200210173155.547-3-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-02-18 00:08:38 -05:00
James Smart	145e5a8a5c	scsi: lpfc: Copyright updates for 12.6.0.4 patches Update copyrights to 2020 for files modified in the 12.6.0.4 patch set. Link: https://lore.kernel.org/r/20200128002312.16346-13-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-02-10 22:46:56 -05:00
James Smart	6cde2e3e28	scsi: lpfc: Remove handler for obsolete ELS - Read Port Status (RPS) There was report of an odd "Fix me..." log message, which was tracked down to the lpfc_els_rcv_rps() routine. This was in handling of a very old and obsolete ELS - Read Port Status. The RPS ELS was defined in FC-LS-1, but deprecated in FC-LS-2, and removed from all later FC-LS revisions. It was replaced by the Read Diagnostic Parameters (RDP) ELS and the Link Error Status Block descriptor. There should be no support for the RSP ELS. Remove support from driver. Link: https://lore.kernel.org/r/20200128002312.16346-9-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-02-10 22:46:56 -05:00
James Smart	6c6d59e0fe	scsi: lpfc: fix: Coverity: lpfc_cmpl_els_rsp(): Null pointer dereferences Coverity reported the following: *** CID 101747: Null pointer dereferences (FORWARD_NULL) /drivers/scsi/lpfc/lpfc_els.c: 4439 in lpfc_cmpl_els_rsp() 4433 kfree(mp); 4434 } 4435 mempool_free(mbox, phba->mbox_mem_pool); 4436 } 4437 out: 4438 if (ndlp && NLP_CHK_NODE_ACT(ndlp)) { vvv CID 101747: Null pointer dereferences (FORWARD_NULL) vvv Dereferencing null pointer "shost". 4439 spin_lock_irq(shost->host_lock); 4440 ndlp->nlp_flag &= ~(NLP_ACC_REGLOGIN \| NLP_RM_DFLT_RPI); 4441 spin_unlock_irq(shost->host_lock); 4442 4443 /* If the node is not being used by another discovery thread, 4444 * and we are sending a reject, we are done with it. Fix by adding a check for non-null shost in line 4438. The scenario when shost is set to null is when ndlp is null. As such, the ndlp check present was sufficient. But better safe than sorry so add the shost check. Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 101747 ("Null pointer dereferences") Fixes: `2e0fef85e0` ("[SCSI] lpfc: NPIV: split ports") CC: James Bottomley <James.Bottomley@SteelEye.com> CC: "Gustavo A. R. Silva" <gustavo@embeddedor.com> CC: linux-next@vger.kernel.org Link: https://lore.kernel.org/r/20191111230401.12958-3-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-11-12 22:21:33 -05:00
James Smart	2332e6e475	scsi: lpfc: Fix unexpected error messages during RSCN handling During heavy RCN activity and log_verbose = 0 we see these messages: 2754 PRLI failure DID:521245 Status:x9/xb2c00, data: x0 0231 RSCN timeout Data: x0 x3 0230 Unexpected timeout, hba link state x5 This is due to delayed RSCN activity. Correct by avoiding the timeout thus the messages by restarting the discovery timeout whenever an rscn is received. Filter PRLI responses such that severity depends on whether expected for the configuration or not. For example, PRLI errors on a fabric will be informational (they are expected), but Point-to-Point errors are not necessarily expected so they are raised to an error level. Link: https://lore.kernel.org/r/20191105005708.7399-5-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-11-06 00:04:04 -05:00
James Smart	b4b3417cf6	scsi: lpfc: Add additional discovery log messages When debugging a recent discovery customer problem it was very hard to tell what was happening with the existing discovery log messages. To fully debug the issue additional log messages were necessary. Add or extend log messages so that sufficient information is present for debugging. Link: https://lore.kernel.org/r/20191018211832.7917-16-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:02:06 -04:00
James Smart	f84f8f93f0	scsi: lpfc: fix coverity error of dereference after null check Log message conditional upon vport being NULL dereferences vport to determine log verbose setting. Changed to use lpfc_print_log which uses phba to determine the active log verbose setting. Fixes: `43bfea1bff` ("scsi: lpfc: Fix coverity errors on NULL pointer checks") Link: https://lore.kernel.org/r/20191018211832.7917-8-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-10-24 21:02:05 -04:00
James Smart	d38b4a527f	scsi: lpfc: Fix spinlock_irq issues in lpfc_els_flush_cmd() While reviewing the CT behavior, issues with spinlock_irq were seen. The driver should be using spinlock_irqsave/irqrestore in the els flush routine. Changed to spinlock_irqsave/irqrestore. Link: https://lore.kernel.org/r/20190922035906.10977-15-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-30 22:07:10 -04:00
James Smart	15498dc1a5	scsi: lpfc: Fix list corruption in lpfc_sli_get_iocbq After study, it was determined there was a double free of a CT iocb during execution of lpfc_offline_prep and lpfc_offline. The prep routine issued an abort for some CT iocbs, but the aborts did not complete fast enough for a subsequent routine that waits for completion. Thus the driver proceeded to lpfc_offline, which releases any pending iocbs. Unfortunately, the completions for the aborts were then received which re-released the ct iocbs. Turns out the issue for why the aborts didn't complete fast enough was not their time on the wire/in the adapter. It was the lpfc_work_done routine, which requires the adapter state to be UP before it calls lpfc_sli_handle_slow_ring_event() to process the completions. The issue is the prep routine takes the link down as part of it's processing. To fix, the following was performed: - Prevent the offline routine from releasing iocbs that have had aborts issued on them. Defer to the abort completions. Also means the driver fully waits for the completions. Given this change, the recognition of "driver-generated" status which then releases the iocb is no longer valid. As such, the change made in the commit `296012285c` is reverted. As recognition of "driver-generated" status is no longer valid, this patch reverts the changes made in commit `296012285c` ("scsi: lpfc: Fix leak of ELS completions on adapter reset") - Modify lpfc_work_done to allow slow path completions so that the abort completions aren't ignored. - Updated the fdmi path to recognize a CT request that fails due to the port being unusable. This stops FDMI retries. FDMI will be restarted on next link up. Link: https://lore.kernel.org/r/20190922035906.10977-14-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-30 22:07:10 -04:00
James Smart	43bfea1bff	scsi: lpfc: Fix coverity errors on NULL pointer checks Coverity flagged several scenarios where checking of null pointer values wasn't consistent. Fix the code to that be consistent on checking. Link: https://lore.kernel.org/r/20190922035906.10977-12-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-09-30 22:07:10 -04:00
James Smart	0d8af09643	scsi: lpfc: Add NVMe sequence level error recovery support FC-NVMe-2 added support for sequence level error recovery in the FC-NVME protocol. This allows for the detection of errors and lost frames and immediate retransmission of data to avoid exchange termination, which escalates into NVMeoFC connection and association failures. A significant RAS improvement. The driver is modified to indicate support for SLER in the NVMe PRLI is issues and to check for support in the PRLI response. When both sides support it, the driver will set a bit in the WQE to enable the recovery behavior on the exchange. The adapter will take care of all detection and retransmission. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:12 -04:00
James Smart	e62245d923	scsi: lpfc: Add MDS driver loopback diagnostics support Added code to support driver loopback with MDS Diagnostics. This style of diagnostics passes frames from the fabric to the driver who then echo them back out the link. SEND_FRAME WQEs are used to transmit the frames. Added the SOF and EOF field location definitions for use by SEND_FRAME. Also ensure that enable_mds_diags is a RW parameter. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:12 -04:00
James Smart	3235066449	scsi: lpfc: Migrate to %px and %pf in kernel print calls In order to see real addresses, convert %p with %px for kernel addresses and replace %p with %pf for functions. While converting, standardize on "x%px" throughout (not %px or 0x%px). Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:11 -04:00
James Smart	d9f492a1a1	scsi: lpfc: Fix coverity warnings Running on Coverity produced the following errors: - coding style (indentation) - memset size mismatch errors note: comment cases where it is purposely a mismatch Fix the errors. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:11 -04:00
James Smart	a6d10f24a0	scsi: lpfc: Fix driver nvme rescan logging In situations where zoning is not being used, thus NVMe initiators see other NVMe initiators as well as NVMe targets, a link bounce on an initiator will cause the NVMe initiators to spew "6169" State Error messages. The driver is not qualifying whether the remote port is a NVMe targer or not before calling the lpfc_nvme_rescan_port(), which validates the role and prints the message if its only an NVMe initiator. Fix by the following: - Before calling lpfc_nvme_rescan_port() ensure that the node is a NVMe storage target or a NVMe discovery controller. - Clean up implementation of lpfc_nvme_rescan_port. remoteport pointer will always be NULL if a NVMe initiator only. But, grabbing of remoteport pointer should be done under lock to coincide with the registering of the remote port with the fc transport. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:10 -04:00
James Smart	6ede2ddd8b	scsi: lpfc: Fix FLOGI handling across multiple link up/down conditions It's possible for the driver to initiate an FLOGI and before it completes, another link down/up transition occurs requiring a new FLOGI. Currently, nothing is done to abort/noop the older FLOGI request to the adapter, so if this transition occurs and the FLOGI completion is received after the link down/up transition, the driver may erroneously act on the older FLOGI. In most cases, the adapter properly terminates/fails the FLOGI, but there is a timing condition where the FLOGI may complete on the wire prior to the transition, but the response may not be seen/processed by the driver before the driver sees the link transition. Fix by having the link down handler in the driver run through any outstanding ELS's and change the completion handler of the ELS so that it will be no-op'd and released. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-08-19 22:41:09 -04:00
Linus Torvalds	ba6d10ab80	SCSI misc on 20190709 This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs, mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the removal of the osst driver (I heard from Willem privately that he would like the driver removed because all his test hardware has failed). Plus number of minor changes, spelling fixes and other trivia. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXSTl4yYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdcxAQDCJVbd fPUX76/V1ldupunF97+3DTharxxbst+VnkOnCwD8D4c0KFFFOI9+F36cnMGCPegE fjy17dQLvsJ4GsidHy8= =aS5B -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs, mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the removal of the osst driver (I heard from Willem privately that he would like the driver removed because all his test hardware has failed). Plus number of minor changes, spelling fixes and other trivia. The big merge conflict this time around is the SPDX licence tags. Following discussion on linux-next, we believe our version to be more accurate than the one in the tree, so the resolution is to take our version for all the SPDX conflicts" Note on the SPDX license tag conversion conflicts: the SCSI tree had done its own SPDX conversion, which in some cases conflicted with the treewide ones done by Thomas & co. In almost all cases, the conflicts were purely syntactic: the SCSI tree used the old-style SPDX tags ("GPL-2.0" and "GPL-2.0+") while the treewide conversion had used the new-style ones ("GPL-2.0-only" and "GPL-2.0-or-later"). In these cases I picked the new-style one. In a few cases, the SPDX conversion was actually different, though. As explained by James above, and in more detail in a pre-pull-request thread: "The other problem is actually substantive: In the libsas code Luben Tuikov originally specified gpl 2.0 only by dint of stating: * This file is licensed under GPLv2. In all the libsas files, but then muddied the water by quoting GPLv2 verbatim (which includes the or later than language). So for these files Christoph did the conversion to v2 only SPDX tags and Thomas converted to v2 or later tags" So in those cases, where the spdx tag substantially mattered, I took the SCSI tree conversion of it, but then also took the opportunity to turn the old-style "GPL-2.0" into a new-style "GPL-2.0-only" tag. Similarly, when there were whitespace differences or other differences to the comments around the copyright notices, I took the version from the SCSI tree as being the more specific conversion. Finally, in the spdx conversions that had no conflicts (because the treewide ones hadn't been done for those files), I just took the SCSI tree version as-is, even if it was old-style. The old-style conversions are perfectly valid, even if the "-only" and "-or-later" versions are perhaps more descriptive. * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (185 commits) scsi: qla2xxx: move IO flush to the front of NVME rport unregistration scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race condition scsi: qla2xxx: on session delete, return nvme cmd scsi: qla2xxx: Fix kernel crash after disconnecting NVMe devices scsi: megaraid_sas: Update driver version to 07.710.06.00-rc1 scsi: megaraid_sas: Introduce various Aero performance modes scsi: megaraid_sas: Use high IOPS queues based on IO workload scsi: megaraid_sas: Set affinity for high IOPS reply queues scsi: megaraid_sas: Enable coalescing for high IOPS queues scsi: megaraid_sas: Add support for High IOPS queues scsi: megaraid_sas: Add support for MPI toolbox commands scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driver scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for only Ventura scsi: megaraid_sas: megaraid_sas: Add check for count returned by HOST_DEVICE_LIST DCMD scsi: megaraid_sas: Handle sequence JBOD map failure at driver level scsi: megaraid_sas: Don't send FPIO to RL Bypass queue scsi: megaraid_sas: In probe context, retry IOC INIT once if firmware is in fault scsi: megaraid_sas: Release Mutex lock before OCR in case of DCMD timeout scsi: megaraid_sas: Call disable_irq from process IRQ poll scsi: megaraid_sas: Remove few debug counters from IO path ...	2019-07-11 15:14:01 -07:00
James Smart	6f2589f478	lpfc: add support for translating an RSCN rcv into a discovery rescan This patch updates RSCN receive processing to check for the remote port being an NVME port, and if so, invoke the nvme_fc callback to rescan the remote port. The rescan will generate a discovery udev event. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>	2019-06-21 11:08:38 +02:00
James Smart	f60cb93bbf	lpfc: add support to generate RSCN events for nport This patch adds general RSCN support: - The ability to transmit an RSCN to the port on the other end of the link (regular port if pt2pt, or fabric controller if fabric). - And general recognition of an RSCN ELS when an ELS is received. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>	2019-06-21 11:08:37 +02:00
James Smart	f22bfe8d1c	scsi: lpfc: Fix PT2PT PLOGI collison stopping discovery Under heavy load the target stops responding, the drivers aborts timeout and we start recovery by logging out of the target, but the target is never logged into again. In a point-to-point scenario, there were battling PLOGI's. When we received a PLOGI request after having sent one, the driver cancels the processing of the original plogi. However, the completion path of the remaining plogi was coded to skip the reg_rpi that should be happening on the 2nd plogi. Correct by adding a simple pt2pt check such that the 2nd plogi isn't skipped and the reg_login occurs. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-06-18 19:46:21 -04:00
James Smart	c8cb261a07	scsi: lpfc: add check for loss of ndlp when sending RRQ There was a missing qualification of a valid ndlp structure when calling to send an RRQ for an abort. Add the check. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Tested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-05-13 20:32:50 -04:00
James Smart	1a61e5486a	scsi: lpfc: add support for posting FC events on FPIN reception This patch adds support to recognize FPIN ELS's that are received. When one is received, the fc transport will be called to handle the the FPIN. Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-08 21:29:16 -04:00
Bart Van Assche	b27cbd5549	scsi: lpfc: Remove set-but-not-used variables This patch does not change any functionality but avoids that the compiler complains about set-but-not-used variables when building with W=1. Cc: James Smart <james.smart@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-03 23:11:36 -04:00
Bart Van Assche	cd05c155d7	scsi: lpfc: Annotate switch/case fall-through This patch avoids that the compiler warns about missing fall-through annotation when building with W=1. Cc: James Smart <james.smart@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-04-03 23:11:35 -04:00
James Smart	0d041215f0	scsi: lpfc: Update 12.2.0.0 file copyrights to 2019 For files modified as part of 12.2.0.0 patches, update copyright to 2019 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:50 -05:00
James Smart	f6e8479052	scsi: lpfc: Fix default driver parameter collision for allowing NPIV support The conversion to enable SCSI and NVME fc4 support ran into an issue with NPIV support. With NVME, NPIV is not currently supported, but with SCSI it was. The driver reverted to its lowest setting meaning NPIV with SCSI was not allowed. Convert the NPIV checks and implementation so that SCSI can continue to allow NPIV support. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2019-02-05 22:29:50 -05:00
James Smart	719162bd5b	scsi: lpfc: Enable Management features for IF_TYPE=6 Addition of support for if_type=6 missed several checks for interface type, resulting in the failure of several key management features such as firmware dump and loopback testing. Correct the checks on the if_type so that both SLI4 IF_TYPE's 2 and 6 are supported. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-12 20:33:08 -05:00
Martin K. Petersen	2d1036aea4	Revert "scsi: lpfc: ls_rjt erroneus FLOGIs" This reverts commit `287aba2592`. We killed the bad firmware and this mod is no longer necessary. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-12 20:26:56 -05:00
James Smart	0a9e9687ac	scsi: lpfc: Defer LS_ACC to FLOGI on point to point logins The current discovery state machine the driver treated FLOGI oddly. When point to point, an FLOGI is to be exchanged by the two ports, with the port with the most significant WWN then proceeding with PLOGI. The implementation in the driver was keyed to closely with "what have I sent", not with what has happened between the two endpoints. Thus, it blatantly would ACC an FLOGI, but reject PLOGI's until it had its FLOGI ACC'd. The problem is - the sending of FLOGI may be delayed for some reason, or the response to FLOGI held off by the other side. In the failing situation the other side sent an FLOGI, which was ACC'd, then sent PLOGIs which were then rjt'd until the retry count for the PLOGIs were exceeded and the port gave up. The FLOGI may have been very late in transmit, or the response held off until the PLOGIs failed. Given the other port had the higher WWN, no PLOGIs would occur and communication stopped. Correct the situation by changing the FLOGI handling. Defer any response to an FLOGI until the driver has sent its FLOGI as well. Then, upon either completion of the sent FLOGI, or upon sending an ACC to a received FLOGI (which may be received before or just after FLOGI was sent). the driver will act on who has the higher WWN. if the other port does, the driver will noop any handling of an FLOGI response (if outstanding) and wait for PLOGI. If the local port does, the driver will transition to sending PLOGI and will noop any action on responding to an FLOGI (if not yet received). Fortunately, to implement this, it only took another state flag and deferring any FLOGI response if the FLOGI has yet to be transmit. All subsequent actions were already in place. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	287aba2592	scsi: lpfc: ls_rjt erroneus FLOGIs In some link initialization sequences, the fw generates an erroneous FLOGI payload to the driver without an intervening link bounce. The driver, when it sees a 2nd FLOGI without an intervening link bounce, automatically performs a link bounce. In this, the link bounce causes the situate to repeat and in a nasty loop of link bounces. Resolve the issue by validating the FLOGI payload. The erroneous FLOGI will contain VVL signatures that are not normal. When the driver sees these, it will simply reject the flogi rather than bouncing the link. The reject is consumed within the firmware. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	92ea83a878	scsi: lpfc: rport port swap discovery issue. Two initiator ports were cable swapped and after swap both went down. The driver internally swaps the nlp nodes based on matching node wwn's but not the same nport id as before. After detecting a change in the nodes RPI, the driver sends an UNREG_RPI command and clears the NLP_RPI_REGISTERED flag, then swaps the node information with the other node. But the other node's NLP_RPI_REGISTERED flag is also cleared, but it is done so without an UNREG_RPI being sent, which causes the later REG_RPI for that other node to fail as the hardware believes its still registered. Additionally, if the node swap occurred while the two nodes had PLOGI's in flight, the fc4_types weren't properly getting swapped such that when the PLOGIs commpleted and PRLI's were then sent, the PRLI's acted on bad protocol types so the PRLI was for the wrong protocol. NVME devices saw SCSI FCP PRLIs and vice versa. Clean up the node swap so that the NLP_RPI_REGISTERED flag is handled properly. Fix the handling of the fc4_types when the nodes are swapped as well Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	5a9eeff57f	scsi: lpfc: Fix kernel Oops due to null pring pointers Driver is hitting null pring pointers in lpfc_do_work(). Pointer assignment occurs based on SLI-revision. If recovering after an error, its possible the sli revision for the port was cleared, making the lpfc_phba_elsring() not return a ring pointer, thus the null pointer. Add SLI revision checking to lpfc_phba_elsring() and status checking to all callers. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	dea16bdae2	scsi: lpfc: Fix discovery failures during port failovers with lots of vports The driver is getting hit with 100s of RSCNs during remote port address changes. Each of those RSCN's ends up generating UNREG_RPI and REG_PRI mailbox commands. The discovery engine within the driver doesn't wait for the mailbox command completions. Instead it sets state flags and moves forward. At some point, there's a massive backlog of mailbox commands which take time for the adapter to process. Additionally, it appears there were duplicate events from the switch so the driver generated duplicate mailbox commands for the same remote port. During this window, failures on PLOGI and PRLI ELS's are see as the adapter is rejecting them as they are for remote ports that still have pending mailbox commands. Streamline the discovery engine so that PLOGI log checks for outstanding UNREG_RPIs and defer the processing until the commands complete. This better synchronizes the ELS transmission vs the RPI registrations. Filter out multiple UNREG_RPIs being queued up for the same remote port. Beef up log messages in this area. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	3e1f071892	scsi: lpfc: refactor mailbox structure context fields The driver data structure for managing a mailbox command contained two context fields. Unfortunately, the context were considered "generic" to be used at the whim of the command code. Of course, one section of code used fields this way, while another did it that way, and eventually there were mixups. Refactored the structure so that the generic contexts become a node context and a buffer context and all code standardizes on their use. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-12-07 22:35:32 -05:00
James Smart	1dc5ec2452	scsi: lpfc: add Trunking support Add trunking support to the driver. Trunking is found on more recent asics. In general, trunking appears as a single "port" to the driver and overall behavior doesn't differ. Link speed is reported as an aggregate value, while link speed control is done on a per-physical link basis with all links in the trunk symmetrical. Some commands returning port information are updated to additionally provide trunking information. And new ACQEs are generated to report physical link events relative to the trunk. This patch contains the following modifications: - Added link speed settings of 128GB and 256GB. - Added handling of trunk-related ACQEs, mainly logging and trapping of physical link statuses. - Added additional bsg interface to query trunk state by applications. - Augment link_state sysfs attribtute to display trunk link status Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	7ea92eb458	scsi: lpfc: Implement GID_PT on Nameserver query to support faster failover The switches seem to respond faster to GID_PT vs GID_FT NameServer queries. Add support for GID_PT to be used over GID_FT to enable faster storage failover detection. Includes addition of new module parameter to select between GID_PT and GID_FT (GID_FT is default). Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	d83ca3ea83	scsi: lpfc: Correct loss of fc4 type on remote port address change An address change for a remote port cause PRLI for the wrong protocol to be sent. The node copy done in the discovery code skipped copying the fc4 protocols supported as well. Fix the copy logic for the address change. Beefed up log messages in this area as well. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	d496b9a724	scsi: lpfc: Fix odd recovery in duplicate FLOGIs in point-to-point Testing a point-to-point topology and a case of re-FLOGI without intervening link bouncing, showed an odd interaction with firmware and a resulting scenario where the driver no longer probed after accepting the new FLOGI. Work around the firmware issue by issuing a link bounce if a FLOGI is received after the link is already up and FLOGI's accepted. While debugging the issue, realized that some debug traces should be clarified to help in the future. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	b114d9009d	scsi: lpfc: Correct LCB RJT handling When LCB's are rejected, if beaconing was already in progress, the Reason Code Explanation was not being set. Should have been set to command in progress. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	036cad1f1a	scsi: lpfc: fcoe: Fix link down issue after 1000+ link bounces On FCoE adapters, when running link bounce test in a loop, initiator failed to login with switch switch and required driver reload to recover. Switch reached a point where all subsequent FLOGIs would be LS_RJT'd. Further testing showed the condition to be related to not performing FCF discovery between FLOGI's. Fix by monitoring FLOGI failures and once a repeated error is seen repeat FCF discovery. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:51 -05:00
James Smart	5cca2ab1b3	scsi: lpfc: Reset link or adapter instead of doing infinite nameserver PLOGI retry Currently, PLOGI failures are infinitely delayed/retried. There have been some fabric situations where the PLOGI's were to the nameserver and it stopped responding. The retries would never clear up. A better resolution in this situation is to retry a couple of times, then drop the link and reinit. This brings back connectivity to the nameserver. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:50 -05:00
James Smart	30e196cace	scsi: lpfc: Fix LOGO/PLOGI handling when triggerd by ABTS Timeout event After a LOGO in response to an ABTS timeout, a PLOGI wasn't issued to re-establish the login. An nlp_type check in the LOGO completion handler failed to restart discovery for NVME targets. Revised the nlp_type check for NVME as well as SCSI. While reviewing the LOGO handling a few other issues were seen and were addressed: - Better lock synchronization around ndlp data types - When the ABTS times out, unregister the RPI before sending the LOGO so that all local exchange contexts are cleared and nothing received while awaiting LOGO/PLOGI handling will be accepted. - LOGO handling optimized to: Wait only R_A_TOV for a response. It doesn't need to be retried on timeout. If there wasn't a response, a PLOGI will be sent, thus an implicit logout applies as well when the other port sees it. If there is a response, any kind of response is considered "good" and the XRI quarantined for a exchange qualifier window. - PLOGI is issued as soon a LOGO state is resolved. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-11-06 20:42:50 -05:00
James Smart	523128e53b	scsi: lpfc: Correct irq handling via locks when taking adapter offline When taking the board offline while performing i/o, unsafe locking errors occurred and irq level isn't properly managed. In lpfc_sli_hba_down, spin_lock_irqsave(&phba->hbalock, flags) does not disable softirqs raised from timer expiry. It is possible that a softirq is raised from the lpfc_els_retry_delay routine and recursively requests the same phba->hbalock spinlock causing deadlock. Address the deadlocks by creating a new port_list lock. The softirq behavior can then be managed a level deeper into the calling sequences. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-09-11 20:37:33 -04:00
James Smart	24bc311942	scsi: lpfc: Correct LCB ACCept payload After memory allocation for the LCB response frame, the memory wasn't zero initialized, and not all fields are set. Thus garbage shows up in the payload. Fix by zeroing the memory at allocation. Also properly set the Capability field based on duration support. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-08-02 15:45:19 -04:00
James Smart	4ae2ebde31	scsi: lpfc: Revise copyright for new company language Change references from "Broadcom Limited" to "Broadcom Inc." in the copyright message. Update copyright duration if not yet updated for 2018. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:09 -04:00
James Smart	66e9e6bf07	scsi: lpfc: Support duration field in Link Cable Beacon V1 command Current implementation missed setting the duration field. Correct the code to set the field. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-07-10 22:15:09 -04:00
James Smart	b04744ce52	scsi: lpfc: Fix driver not recovering NVME rports during target link faults During target-side port faults, the driver would not recover all target port logins. This resulted in a loss of nvme device discovery. The driver is coded to wait for all GID_FT requests to complete before restarting discovery. A fault is seen where the outstanding GIT_FT counts are not properly decremented, thus discovery would never start. Another fault was found in the clearing of the gidft_inp counter that would be skipped in this condition. And a third fault found with lpfc_nvme_register_port that would remove a reverence on the ndlp which then allows a node swap on a port address change to prematurely remove the reference and release the ndlp. The following changes are made: - Correct the decrementing of the outstanding GID_FT counters. - In RSCN handling, no longer zero the counter before calling to issue another GID_FT. - No longer remove the reference on the dlp when the ndlp->nrport value is not yet null. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-04-18 19:34:05 -04:00
James Smart	a3da825b49	scsi: lpfc: Fix SCSI lun discovery when port configured for both SCSI and NVME When a port is configured for NVME and SCSI Initiator support and it probes a target supporting both SCSI and NVME, NVME devices are discovered, but SCSI devices are not. The nlp_fc4_type for all NPorts should be cleared on Link Up or just before GID_FTs get issued, as opposed to just during GID_FT cmpl. RSCN activity as well as Link Up can trigger GID_FT. One GID_FT may complete before the next one is issued. Fix by clearng nlp_fc4_type on link up and just before both GID_FTs are issued. During port swapping, copy nlp_fc4_type to the new ndlp Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-03-12 21:55:23 -04:00
James Smart	fbd8a6ba65	scsi: lpfc: Add 64G link speed support The G7 adapter supports 64G link speeds. Add support to the driver. In addition, a small cleanup to replace the odd bitmap logic with a switch case. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-02-22 20:39:29 -05:00
James Smart	128bddacc4	scsi: lpfc: Update 11.4.0.7 modified files for 2018 Copyright Updated Copyright in files updated 11.4.0.7 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-02-12 11:43:24 -05:00

1 2 3 4 5 ...

313 Commits