Commit Graph

14805 Commits

Author SHA1 Message Date
Linus Torvalds
ef918d3c80 SCSI fixes on 20170610
This is a set of user visible fixes (excepting one format string
 change).  Four of the qla2xxx fixes only affect the firmware dump
 path, but it's still important to the enterprise.  The rest are
 various NULL pointer crash conditions or outright driver hangs.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZPGMCAAoJEAVr7HOZEZN4jekQAJN1x7WIB4HuI3EoHSSMej8h
 bdssAbqQn/H++nIJ/e6ZRHKt0P6ngSzXuIb4lwbOJUQa7sxEWaWDeXywPEqjDYqP
 BBjloYOrAff492uYXL48xjG4Xl4qOxb8GfKT7iFptIzAdk/2Rxhj56XqlhY7IMSG
 ut4binbz+3v0NEKnI6od+uxvXAc6EumyF0zW9a4rbjK/wAukciRIGWkOrsQpa8cJ
 VdgUsMdbpjTlYbMnPfHa+oUqKkWir3PI9rQ01AvMUugrqAXiAPLgoHFB6H8eVVn7
 vzVnJd31RoUrv6JNnWcRsi0VWsciPw5XBpd6VRVjZUdOlUds3vW7n1G2ut5TfAAp
 sYkFSuhxcWgp3QJpqDbS/l976dXyfdzhQpahgYLbRAuhoi8HDmcpwzTdWC9a41tw
 k2sqAbgZd60ZHu8OSrD2HqJrkMqSXzklMkZMS33nfE1Ki7c+aWHImby4P+lEKIIw
 nJCiVc3yO+TcWvdH5w+6Fu/nA0HJ9OcFEk1P+4Xz38n5o/WcduoXG6NgpVT+mKXO
 zQZDEYbWQYixDEs1m8fJpTHu5p2tXYzdMS9L/Fa0B2MQ3kY9XIT41rHqnJPBOp2R
 wKXksIyzQagW6r0bQ2lFkth0elLHGxDlwfCDgrN6zQFrdBcpRfT+GdTDpDWiWggt
 qgIbBvEO4sd12V5miQsK
 =jZXV
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is a set of user visible fixes (excepting one format string
  change).

  Four of the qla2xxx fixes only affect the firmware dump path, but it's
  still important to the enterprise. The rest are various NULL pointer
  crash conditions or outright driver hangs"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: cxgb4i: libcxgbi: in error case RST tcp conn
  scsi: scsi_debug: Avoid PI being disabled when TPGS is enabled
  scsi: qla2xxx: Fix extraneous ref on sp's after adapter break
  scsi: lpfc: prevent potential null pointer dereference
  scsi: lpfc: Avoid NULL pointer dereference in lpfc_els_abort()
  scsi: lpfc: nvmet_fc: fix format string
  scsi: qla2xxx: Fix crash due to NULL pointer dereference of ctx
  scsi: qla2xxx: Fix mailbox pointer error in fwdump capture
  scsi: qla2xxx: Set bit 15 for DIAG_ECHO_TEST MBC
  scsi: qla2xxx: Modify T262 FW dump template to specify same start/end to debug customer issues
  scsi: qla2xxx: Fix crash due to mismatch mumber of Q-pair creation for Multi queue
  scsi: qla2xxx: Fix NULL pointer access due to redundant fc_host_port_name call
  scsi: qla2xxx: Fix recursive loop during target mode configuration for ISP25XX leaving system unresponsive
  scsi: bnx2fc: fix race condition in bnx2fc_get_host_stats()
  scsi: qla2xxx: don't disable a not previously enabled PCI device
2017-06-11 11:21:08 -07:00
Linus Torvalds
1f915b7fed SCSI fixes on 20170603
This is nine fixes, seven of which are for the qedi driver (new in
 4.10) the other two are a use after free in the cxgbi drivers and a
 potential NULL dereference in the rdac device handler.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZMyq/AAoJEAVr7HOZEZN4AtUP/1oTEhpXp2epNkEBC0fMywAu
 T6uUUmZJhcC/k5HkaI7AJUZxYSkVcu1A1QzolJ5GoBIEesfY2W/IOwKO77hhZ1Hp
 l+e22ioTRXSE2AWyreyVAGSPweG0PyK66kn2vuZXht7xVmcgTelnxp8GPK2MT6vM
 2Hxhf+velQxDLwLYOkLY0SAw6JAJpxkFkNcaBsuF2av3fSJtFCzTfgIqLgX1LpcA
 ckN8yLW0izIeOYKs0Opc65mCZu6AxKOTcvVFglFal7EB99RIU89brifBOqJIue0d
 KvdoeED2ReO76LZcvDObzEsfNNObHXgNZ100Dk8rF1ovY5JX1L+0fSuYZnJ2Imdg
 lHlgKZFVCmtLWnWUQYGizcdXHuLHt90ye2sCFK+SZx1EBBVO1maC2ME52nGVTNoU
 Jrk39dbg//n3ll1FNtlBIFtlFb+uD8DsrWq1KWhg98JcP2XgagCjc2VPF4Eo6Z0l
 v0JrxG7wATPgRNGhHXRrh1pGxvpuIXvDNVth3cVRa48RxkEJPSW+8Lg63olpCfV3
 sAlbThDU/ZUwZybRLa1nbjN7awiR4GaQL+HE9kj0BdR8JpIQHe7162C9KN5NKK7u
 TReIFSd+oo3AyQzvSR4m0lWeHRoRarDTHkB0/NqbwjMkRUP5125onGf45IaLDI0n
 SaAsYtnsGBEFK/xjAai1
 =MYEc
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is nine fixes, seven of which are for the qedi driver (new as of
  4.10) the other two are a use after free in the cxgbi drivers and a
  potential NULL dereference in the rdac device handler"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: libcxgbi: fix skb use after free
  scsi: qedi: Fix endpoint NULL panic during recovery.
  scsi: qedi: set max_fin_rt default value
  scsi: qedi: Set firmware tcp msl timer value.
  scsi: qedi: Fix endpoint NULL panic in qedi_set_path.
  scsi: qedi: Set dma_boundary to 0xfff.
  scsi: qedi: Correctly set firmware max supported BDs.
  scsi: qedi: Fix bad pte call trace when iscsiuio is stopped.
  scsi: scsi_dh_rdac: Use ctlr directly in rdac_failover_get()
2017-06-04 11:15:43 -07:00
Varun Prakash
e0f8e8cf3b scsi: cxgb4i: libcxgbi: in error case RST tcp conn
If logout response is not received and ->ep_disconnect() is called then
close tcp conn by RST instead of FIN to cleanup conn resources
immediately.

Also move ->csk_push_tx_frames() above 'done:' to avoid calling
->csk_push_tx_frames() in error cases.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-02 14:59:19 -04:00
Linus Torvalds
393bcfaeb8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
 "Here are the target-pending fixes for v4.12-rc4:

   - ibmviscsis ABORT_TASK handling fixes that missed the v4.12 merge
     window. (Bryant Ly and Michael Cyr)

   - Re-add a target-core check enforcing WRITE overflow reject that was
     relaxed in v4.3, to avoid unsupported iscsi-target immediate data
     overflow. (nab)

   - Fix a target-core-user OOPs during device removal. (MNC + Bryant
     Ly)

   - Fix a long standing iscsi-target potential issue where kthread exit
     did not wait for kthread_should_stop(). (Jiang Yi)

   - Fix a iscsi-target v3.12.y regression OOPs involving initial login
     PDU processing during asynchronous TCP connection close. (MNC +
     nab)

  This is a little larger than usual for an -rc4, primarily due to the
  iscsi-target v3.12.y regression OOPs bug-fix.

  However, it's an important patch as MNC + Hannes where both able to
  trigger it using a reduced iscsi initiator login timeout combined with
  a backend taking a long time to complete I/Os during iscsi login
  driven session reinstatement"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iscsi-target: Always wait for kthread_should_stop() before kthread exit
  iscsi-target: Fix initial login PDU asynchronous socket close OOPs
  tcmu: fix crash during device removal
  target: Re-add check to reject control WRITEs with overflow data
  ibmvscsis: Fix the incorrect req_lim_delta
  ibmvscsis: Clear left-over abort_cmd pointers
2017-06-01 10:40:41 -07:00
Martin K. Petersen
70bdf2026d scsi: scsi_debug: Avoid PI being disabled when TPGS is enabled
It was not possible to enable both T10 PI and TPGS because they share
the same byte in the INQUIRY response. Logically OR the TPGS value
instead of using assignment.

Reported-by: Ritika Srivastava <ritika.srivastava@oracle.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-31 22:58:25 -04:00
Bill Kuzeja
4cd3b6ebff scsi: qla2xxx: Fix extraneous ref on sp's after adapter break
Hung task timeouts can result if a qlogic board breaks unexpectedly
while running I/O. These tasks become hung because command srb reference
counts are not going to zero, hence the affected srbs and commands do
not get freed. This fix accounts for this extra reference in the srbs in
the case of a board failure.

Fixes: a465537ad1 ("qla2xxx: Disable the adapter and skip error recovery in case of register disconnect")
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-31 22:49:06 -04:00
Gustavo A. R. Silva
e6ef6a77f5 scsi: lpfc: prevent potential null pointer dereference
Null check at line 966: if (ndlp) {, implies that ndlp might be NULL.
Functions lpfc_nlp_set_state() and lpfc_issue_els_prli() dereference
pointer ndlp. Include these function calls inside the IF block that
tests pointer ndlp.

Addresses-Coverity-ID: 1401856
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-31 22:45:15 -04:00
Guilherme G. Piccoli
7c9fdfb700 scsi: lpfc: Avoid NULL pointer dereference in lpfc_els_abort()
We might have a NULL pring in lpfc_els_abort(), for example on error
recovery path, since queues are destroyed during error recovery
mechanism.

In this case, we should just drop the abort since the queues will be
recreated anyway. This patch just verifies for NULL pointer and stop the
abortion of the queue in case of a NULL pring.

Also, this patch converts return type of lpfc_els_abort() from int to
void, since it's not checked anywhere.

Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Reported-by: Naresh Bannoth <nbannoth@in.ibm.com>
Tested-by: Raphael Silva <raphasil@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: James Smart  <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-31 22:44:13 -04:00
Arnd Bergmann
9094367a91 scsi: lpfc: nvmet_fc: fix format string
The lpfc_nvmeio_data() tracing helper always takes a format string and
three additional arguments. The latest caller has a format string with
only two integer arguments, causing this harmless warning:

drivers/scsi/lpfc/lpfc_nvmet.c: In function 'lpfc_nvmet_xmt_fcp_release':
drivers/scsi/lpfc/lpfc_nvmet.c:802:25: error: too many arguments for format [-Werror=format-extra-args]
  lpfc_nvmeio_data(phba, "NVMET FCP FREE: xri x%x ste %d\n", ctxp->oxid,

We could add a dummy argument here, but it seems reasonable to print
the 'abort' flag as the third argument.

Fixes: 19b58d9473 ("nvmet_fc: add req_release to lldd api")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-31 22:44:06 -04:00
Linus Torvalds
be941bf2e6 SCSI fixes on 20170524
This is quite a big update because it includes a rework of the lpfc
 driver to separate the NVMe part from the FC part.  The reason for
 doing this is because two separate trees (the nvme and scsi trees
 respectively) want to update the individual components and this
 separation will prevent a really nasty cross tree entanglement by the
 time we reach the next merge window.  The rest of the fixes are the
 usual minor sort with no significant security implications.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZJhx1AAoJEAVr7HOZEZN4+lMQALqrWA4Kty2nHU1EfWXd8lOR
 VJt6TlthMQWn57MCuwi1Q6bQR8PXaDr9yDvSkHu1Kqu0ZnmZRRs5CsKgN5RFkO7s
 F8jZlqKtE36lfavqv+Li+ie110NfFDJVoQOACqhRybcT7En59nwu8dvPJZ1vXtCO
 qevukGFyDnHR3VJR/LJOGs7NUmVdGegUxALfOZHH22oOVU8v+iAARfgM0DI4bPS7
 BTlhJDEVL0/uiYb/D1l8xVQCCuChX7yVygPLC57Ag8eRMAiTVyTN6Y1L6AGeDye0
 hHty1Cv0yfEf51ZXNCizIvMlcEIB6lA40VUiZ62c2+Dp9TOceVgbVrVLF28c2e2o
 z73xcrnUBdPi1znGOrQuJlTBLBYUvsFrq4ZhzlS5vGsUNslYyFi5p8xtnbHxrIQq
 qRfTLeYWuOSyULvIiYkFyZkksr7up21wsaplN5OrNw0f0hTOf8ff2duM09MTARQO
 xxTTS1/TD2KCMm4qh638qNbrIdZgjvMFeNP+G/XagloZ5D8NCdn+pzm/vLm+7lAx
 D4AhwHcQ7I57YhDHLs56yhzL7cPyPsxeFPtYKFO7Vz1B0Xw+prgKRcCA+vOrs0ae
 vKMV1ctyo5E0BfUk7lYl3NP0IPqupc82GeO5IvUmh+swNYrg3TCct13Afr4sa0n+
 yNlLgoYLnJ3mVGMWvDgL
 =NtGp
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is quite a big update because it includes a rework of the lpfc
  driver to separate the NVMe part from the FC part.

  The reason for doing this is because two separate trees (the nvme and
  scsi trees respectively) want to update the individual components and
  this separation will prevent a really nasty cross tree entanglement by
  the time we reach the next merge window.

  The rest of the fixes are the usual minor sort with no significant
  security implications"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (25 commits)
  scsi: zero per-cmd private driver data for each MQ I/O
  scsi: csiostor: fix use after free in csio_hw_use_fwconfig()
  scsi: ufs: Clean up some rpm/spm level SysFS nodes upon remove
  scsi: lpfc: fix build issue if NVME_FC_TARGET is not defined
  scsi: lpfc: Fix NULL pointer dereference during PCI error recovery
  scsi: lpfc: update version to 11.2.0.14
  scsi: lpfc: Add MDS Diagnostic support.
  scsi: lpfc: Fix NVMEI's handling of NVMET's PRLI response attributes
  scsi: lpfc: Cleanup entry_repost settings on SLI4 queues
  scsi: lpfc: Fix debugfs root inode "lpfc" not getting deleted on driver unload.
  scsi: lpfc: Fix NVME I+T not registering NVME as a supported FC4 type
  scsi: lpfc: Added recovery logic for running out of NVMET IO context resources
  scsi: lpfc: Separate NVMET RQ buffer posting from IO resources SGL/iocbq/context
  scsi: lpfc: Separate NVMET data buffer pool fir ELS/CT.
  scsi: lpfc: Fix NMI watchdog assertions when running nvmet IOPS tests
  scsi: lpfc: Fix NVMEI driver not decrementing counter causing bad rport state.
  scsi: lpfc: Fix nvmet RQ resource needs for large block writes.
  scsi: lpfc: Adding additional stats counters for nvme.
  scsi: lpfc: Fix system crash when port is reset.
  scsi: lpfc: Fix used-RPI accounting problem.
  ...
2017-05-24 20:29:53 -07:00
Joe Carnuccio
d5ff0eed3a scsi: qla2xxx: Fix crash due to NULL pointer dereference of ctx
Fixes following signature in the stack trace:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000374
IP: [<ffffffffa06ec8eb>] qla2x00_sp_free_dma+0xeb/0x2a0 [qla2xxx]

Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-24 21:55:51 -04:00
Joe Carnuccio
74939a0bc7 scsi: qla2xxx: Fix mailbox pointer error in fwdump capture
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-24 21:55:51 -04:00
Joe Carnuccio
1d63496516 scsi: qla2xxx: Set bit 15 for DIAG_ECHO_TEST MBC
Set bit (BIT_15) to send right ECHO payload information for Diagnostic
Echo Test command.

Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-24 21:55:50 -04:00
Joe Carnuccio
ce6c668b14 scsi: qla2xxx: Modify T262 FW dump template to specify same start/end to debug customer issues
Firmware dump allows for debugging customer issues. This patch fixes
start/end pointer calculation to capture T262 template entry for dump
tool.

Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-24 21:55:50 -04:00
Sawan Chandak
b95b9452aa scsi: qla2xxx: Fix crash due to mismatch mumber of Q-pair creation for Multi queue
when driver is loaded with Multi Queue enabled, it was noticed that
there was one less queue pair created.

Following message would indicate this:

"No resources to create additional q pair."

The result of one less queue pair means that system can crash, if the
block mq layer thinks there is an extra hardware queue available, and
the driver will use a NULL ptr qpair in that instance.

Following stack trace is seen in one of the crash:

irq_create_affinity_masks+0x98/0x530
irq_create_affinity_masks+0x98/0x530
__pci_enable_msix+0x321/0x4e0
mutex_lock+0x12/0x40
pci_alloc_irq_vectors_affinity+0xb5/0x140
qla24xx_enable_msix+0x79/0x530 [qla2xxx]
qla2x00_request_irqs+0x61/0x2d0 [qla2xxx]
qla2x00_probe_one+0xc73/0x2390 [qla2xxx]
ida_simple_get+0x98/0x100
kernfs_next_descendant_post+0x40/0x50
local_pci_probe+0x45/0xa0
pci_device_probe+0xfc/0x140
driver_probe_device+0x2c5/0x470
__driver_attach+0xdd/0xe0
driver_probe_device+0x470/0x470
bus_for_each_dev+0x6c/0xc0
driver_attach+0x1e/0x20
bus_add_driver+0x45/0x270
driver_register+0x60/0xe0
__pci_register_driver+0x4c/0x50
qla2x00_module_init+0x1ce/0x21e [qla2xxx]

Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-24 21:55:50 -04:00
Quinn Tran
0ea88662b5 scsi: qla2xxx: Fix NULL pointer access due to redundant fc_host_port_name call
Remove redundant fc_host_port_name calls to prevent early access of
scsi_host->shost_data buffer. This prevent null pointer access.

Following stack trace is seen:

BUG: unable to handle kernel NULL pointer dereference at 00000000000008
IP: qla24xx_report_id_acquisition+0x22d/0x3a0 [qla2xxx]

Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-24 21:55:50 -04:00
himanshu.madhani@cavium.com
cb590700e0 scsi: qla2xxx: Fix recursive loop during target mode configuration for ISP25XX leaving system unresponsive
Following messages are seen into system logs

qla2xxx [0000:09:00.0]-00af:9: Performing ISP error recovery - ha=ffff98315ee30000.
qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware.
qla2xxx [0000:09:00.0]-d009:9: Firmware has been previously dumped (ffffba488c001000) -- ignoring request.
qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware.

See Bugzilla for details
https://bugzilla.kernel.org/show_bug.cgi?id=195285

Fixes: d74595278f ("scsi: qla2xxx: Add multiple queue pair functionality.")
Cc: <stable@vger.kernel.org> # v4.10+
Reported-by: Laurence Oberman <loberman@redhat.com>
Reported-by: Anthony Bloodoff <anthony.bloodoff@gmail.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Anthony Bloodoff <anthony.bloodoff@gmail.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-24 21:55:50 -04:00
Maurizio Lombardi
c2dd893a3b scsi: bnx2fc: fix race condition in bnx2fc_get_host_stats()
If multiple tasks attempt to read the stats, it may happen that the
start_req_done completion is re-initialized while still being used by
another task, causing a list corruption.

This patch fixes the bug by adding a mutex to serialize the calls to
bnx2fc_get_host_stats().

WARNING: at lib/list_debug.c:48 list_del+0x6e/0xa0() (Not tainted)
Hardware name: PowerEdge R820
list_del corruption. prev->next should be ffff882035627d90, but was ffff884069541588

Pid: 40267, comm: perl Not tainted 2.6.32-642.3.1.el6.x86_64 #1
Call Trace:
 [<ffffffff8107c691>] ? warn_slowpath_common+0x91/0xe0
 [<ffffffff8107c796>] ? warn_slowpath_fmt+0x46/0x60
 [<ffffffff812ad16e>] ? list_del+0x6e/0xa0
 [<ffffffff81547eed>] ? wait_for_common+0x14d/0x180
 [<ffffffff8106c4a0>] ? default_wake_function+0x0/0x20
 [<ffffffff81547fd3>] ? wait_for_completion_timeout+0x13/0x20
 [<ffffffffa05410b1>] ? bnx2fc_get_host_stats+0xa1/0x280 [bnx2fc]
 [<ffffffffa04cf630>] ? fc_stat_show+0x90/0xc0 [scsi_transport_fc]
 [<ffffffffa04cf8b6>] ? show_fcstat_tx_frames+0x16/0x20 [scsi_transport_fc]
 [<ffffffff8137c647>] ? dev_attr_show+0x27/0x50
 [<ffffffff8113b9be>] ? __get_free_pages+0xe/0x50
 [<ffffffff812170e1>] ? sysfs_read_file+0x111/0x200
 [<ffffffff8119a305>] ? vfs_read+0xb5/0x1a0
 [<ffffffff8119b0b6>] ? fget_light_pos+0x16/0x50
 [<ffffffff8119a651>] ? sys_read+0x51/0xb0
 [<ffffffff810ee1fe>] ? __audit_syscall_exit+0x25e/0x290
 [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Acked-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-24 15:14:42 -04:00
Johannes Thumshirn
ddff7ed45e scsi: qla2xxx: don't disable a not previously enabled PCI device
When pci_enable_device() or pci_enable_device_mem() fail in
qla2x00_probe_one() we bail out but do a call to
pci_disable_device(). This causes the dev_WARN_ON() in
pci_disable_device() to trigger, as the device wasn't enabled
previously.

So instead of taking the 'probe_out' error path we can directly return
*iff* one of the pci_enable_device() calls fails.

Additionally rename the 'probe_out' goto label's name to the more
descriptive 'disable_device'.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Fixes: e315cd28b9 ("[SCSI] qla2xxx: Code changes for qla data structure refactoring")
Cc: <stable@vger.kernel.org>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-24 15:09:54 -04:00
Varun Prakash
75b61250bf scsi: libcxgbi: fix skb use after free
skb->data is assigned to task->hdr in cxgbi_conn_alloc_pdu(),
skb gets freed after tx but task->hdr is still dereferenced in
iscsi_tcp_task_xmit() to avoid this call skb_get() after allocating skb
and free the skb in cxgbi_cleanup_task() or before allocating new skb in
cxgbi_conn_alloc_pdu().

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-23 22:39:14 -04:00
manish.rangankar@cavium.com
b19775e478 scsi: qedi: Fix endpoint NULL panic during recovery.
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-23 22:16:43 -04:00
Nilesh Javali
3d61a31322 scsi: qedi: set max_fin_rt default value
max_fin_rt is the maximum re-transmission of FIN packets
as part of the termination flow. After reaching this value
the FW will send a single RESET.

Signed-off-by: Nilesh Javali <nilesh.javali@cavium.com>
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-23 22:16:43 -04:00
manish.rangankar@cavium.com
962ea1c0df scsi: qedi: Set firmware tcp msl timer value.
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-23 22:16:43 -04:00
manish.rangankar@cavium.com
0ea9314f4e scsi: qedi: Fix endpoint NULL panic in qedi_set_path.
RIP: 0010:qedi_set_path+0x114/0x570 [qedi]
 Call Trace:
  [<ffffffffa0472923>] iscsi_if_recv_msg+0x623/0x14a0
  [<ffffffff81307de6>] ? rhashtable_lookup_compare+0x36/0x70
  [<ffffffffa047382e>] iscsi_if_rx+0x8e/0x1f0
  [<ffffffff8155983d>] netlink_unicast+0xed/0x1b0
  [<ffffffff81559c30>] netlink_sendmsg+0x330/0x770
  [<ffffffff81510d60>] sock_sendmsg+0xb0/0xf0
  [<ffffffff8101360b>] ? __switch_to+0x17b/0x4b0
  [<ffffffff8163a2c8>] ? __schedule+0x2d8/0x900
  [<ffffffff81511199>] ___sys_sendmsg+0x3a9/0x3c0
  [<ffffffff810e2298>] ? get_futex_key+0x1c8/0x2b0
  [<ffffffff810e25a0>] ? futex_wake+0x80/0x160

Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-23 22:16:43 -04:00
manish.rangankar@cavium.com
d0788a528d scsi: qedi: Set dma_boundary to 0xfff.
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-23 22:16:43 -04:00
manish.rangankar@cavium.com
fc2fbf0d42 scsi: qedi: Correctly set firmware max supported BDs.
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-23 22:16:43 -04:00
Arun Easi
5e901d0b15 scsi: qedi: Fix bad pte call trace when iscsiuio is stopped.
munmap done by iscsiuio during a stop of the service triggers a "bad
pte" warning sometimes. munmap kernel path goes through the mmapped
pages and has a validation check for mapcount (in struct page) to be
zero or above. kzalloc, which we had used to allocate udev->ctrl, uses
slab allocations, which re-uses mapcount (union) for other purposes that
can make the mapcount look negative. Avoid all these trouble by invoking
one of the __get_free_pages wrappers to be used instead of kzalloc for
udev->ctrl.

 BUG: Bad page map in process iscsiuio  pte:80000000aa624067 pmd:3e6777067
 page:ffffea0002a98900 count:2 mapcount:-2143289280
     mapping: (null) index:0xffff8800aa624e00
 page flags: 0x10075d00000090(dirty|slab)
 page dumped because: bad pte
 addr:00007fcba70a3000 vm_flags:0c0400fb anon_vma: (null)
     mapping:ffff8803edf66e90 index:0

 Call Trace:
     dump_stack+0x19/0x1b
     print_bad_pte+0x1af/0x250
     unmap_page_range+0x7a7/0x8a0
     unmap_single_vma+0x81/0xf0
     unmap_vmas+0x49/0x90
     unmap_region+0xbe/0x140
     ? vma_rb_erase+0x121/0x220
     do_munmap+0x245/0x420
     vm_munmap+0x41/0x60
     SyS_munmap+0x22/0x30
     tracesys+0xdd/0xe2

Signed-off-by: Arun Easi <arun.easi@cavium.com>
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-23 22:16:42 -04:00
Artem Savkov
0648a07c9b scsi: scsi_dh_rdac: Use ctlr directly in rdac_failover_get()
rdac_failover_get references struct rdac_controller as
ctlr->ms_sdev->handler_data->ctlr for no apparent reason. Besides being
inefficient this also introduces a null-pointer dereference as
send_mode_select() sets ctlr->ms_sdev to NULL before calling
rdac_failover_get():

[   18.432550] device-mapper: multipath service-time: version 0.3.0 loaded
[   18.436124] BUG: unable to handle kernel NULL pointer dereference at 0000000000000790
[   18.436129] IP: send_mode_select+0xca/0x560
[   18.436129] PGD 0
[   18.436130] P4D 0
[   18.436130]
[   18.436132] Oops: 0000 [#1] SMP
[   18.436133] Modules linked in: dm_service_time sd_mod dm_multipath amdkfd amd_iommu_v2 radeon(+) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm qla2xxx drm serio_raw scsi_transport_fc bnx2 i2c_core dm_mirror dm_region_hash dm_log dm_mod
[   18.436143] CPU: 4 PID: 443 Comm: kworker/u16:2 Not tainted 4.12.0-rc1.1.el7.test.x86_64 #1
[   18.436144] Hardware name: IBM BladeCenter LS22 -[79013SG]-/Server Blade, BIOS -[L8E164AUS-1.07]- 05/25/2011
[   18.436145] Workqueue: kmpath_rdacd send_mode_select
[   18.436146] task: ffff880225116a40 task.stack: ffffc90002bd8000
[   18.436148] RIP: 0010:send_mode_select+0xca/0x560
[   18.436148] RSP: 0018:ffffc90002bdbda8 EFLAGS: 00010246
[   18.436149] RAX: 0000000000000000 RBX: ffffc90002bdbe08 RCX: ffff88017ef04a80
[   18.436150] RDX: ffffc90002bdbe08 RSI: ffff88017ef04a80 RDI: ffff8802248e4388
[   18.436151] RBP: ffffc90002bdbe48 R08: 0000000000000000 R09: ffffffff81c104c0
[   18.436151] R10: 00000000000001ff R11: 000000000000035a R12: ffffc90002bdbdd8
[   18.436152] R13: ffff8802248e4390 R14: ffff880225152800 R15: ffff8802248e4400
[   18.436153] FS:  0000000000000000(0000) GS:ffff880227d00000(0000) knlGS:0000000000000000
[   18.436154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   18.436154] CR2: 0000000000000790 CR3: 000000042535b000 CR4: 00000000000006e0
[   18.436155] Call Trace:
[   18.436159]  ? rdac_activate+0x14e/0x150
[   18.436161]  ? refcount_dec_and_test+0x11/0x20
[   18.436162]  ? kobject_put+0x1c/0x50
[   18.436165]  ? scsi_dh_activate+0x6f/0xd0
[   18.436168]  process_one_work+0x149/0x360
[   18.436170]  worker_thread+0x4d/0x3c0
[   18.436172]  kthread+0x109/0x140
[   18.436173]  ? rescuer_thread+0x380/0x380
[   18.436174]  ? kthread_park+0x60/0x60
[   18.436176]  ret_from_fork+0x2c/0x40
[   18.436177] Code: 49 c7 46 20 00 00 00 00 4c 89 ef c6 07 00 0f 1f 40 00 45 31 ed c7 45 b0 05 00 00 00 44 89 6d b4 4d 89 f5 4c 8b 75 a8 49 8b 45 20 <48> 8b b0 90 07 00 00 48 8b 56 10 8b 42 10 48 8d 7a 28 85 c0 0f
[   18.436192] RIP: send_mode_select+0xca/0x560 RSP: ffffc90002bdbda8
[   18.436192] CR2: 0000000000000790
[   18.436198] ---[ end trace 40f3e4dca1ffabdd ]---
[   18.436199] Kernel panic - not syncing: Fatal exception
[   18.436222] Kernel Offset: disabled
[-- MARK -- Thu May 18 11:45:00 2017]

Fixes: 3278255741 scsi_dh_rdac: switch to scsi_execute_req_flags()
Cc: stable@vger.kernel.org
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-23 21:53:15 -04:00
Linus Torvalds
894e21642d Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A small collection of fixes that should go into this cycle.

   - a pull request from Christoph for NVMe, which ended up being
     manually applied to avoid pulling in newer bits in master. Mostly
     fibre channel fixes from James, but also a few fixes from Jon and
     Vijay

   - a pull request from Konrad, with just a single fix for xen-blkback
     from Gustavo.

   - a fuseblk bdi fix from Jan, fixing a regression in this series with
     the dynamic backing devices.

   - a blktrace fix from Shaohua, replacing sscanf() with kstrtoull().

   - a request leak fix for drbd from Lars, fixing a regression in the
     last series with the kref changes. This will go to stable as well"

* 'for-linus' of git://git.kernel.dk/linux-block:
  nvmet: release the sq ref on rdma read errors
  nvmet-fc: remove target cpu scheduling flag
  nvme-fc: stop queues on error detection
  nvme-fc: require target or discovery role for fc-nvme targets
  nvme-fc: correct port role bits
  nvme: unmap CMB and remove sysfs file in reset path
  blktrace: fix integer parse
  fuseblk: Fix warning in super_setup_bdi_name()
  block: xen-blkback: add null check to avoid null pointer dereference
  drbd: fix request leak introduced by locking/atomic, kref: Kill kref_sub()
2017-05-20 16:12:30 -07:00
James Smart
4b8ba5fa52 nvmet-fc: remove target cpu scheduling flag
Remove NVMET_FCTGTFEAT_NEEDS_CMD_CPUSCHED. It's unnecessary.

Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-20 10:11:34 -06:00
Linus Torvalds
6fe1de43c5 SCSI fixes on 20170519
This is the first sweep of mostly minor fixes.  There's one security
 one: the read past the end of a buffer in qedf, and a panic fix for
 lpfc SLI-3 adapters, but the rest are a set of include and build
 dependency tidy ups and assorted other small fixes and updates.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJZH24IAAoJEAVr7HOZEZN49ToP/1UHEJrhlj2AsOx24/JCMMSn
 MGw0Epha7QQ6d1uiXqB7ZTmpcRykzK4xFLrneP9BYSekTIWPWmKhAcy7Uza0EJiJ
 FYvuSDDEQd+T2anqlxw3N/EevkH9nzVp/uYxpU2IAVtvvnyUgnhZpPNrrRC+d6kM
 MJJjsid9SFmEQK20PYKw3LpLMqKYMQnaHVWdMPo8lXd1VqdqJB98fxjJ6mpo1yZP
 3VcCT4KJeQkX8PW8pOR+yto5oCw0pHK3oTiICLwLr8tTMdO5/XIhq004pV2mI6p4
 fWlD7chFZYjfuAT+qUmjQfglG8S8M5iLpygNUxkCtATWHeOJ+E4GtpIpUGVzn1Xv
 NTtXtOn93Glb7Em3XAemqxnh1/iHxk+mcWMcLa2YyTTiFUE5YJRm4oV/WBOssyAP
 9jXhaJwKn3AFdb5cXPSD083+jtxDFB/5PRfCKHVFKD86SxQR5nEpJj8XsjnaY5Bf
 uAh7EPiledKa6YaXlVk9Bx14G0mMyk3qAwqqOBRl3uakMYUfDVhhWM11GqG/DqVG
 H5CMcCcS1WleilhmuS3tidooUFejkwaImVIEBnjpyoDrjI5BGpRL/Cl2iLyeFQm8
 6ifDHhbfeHNAmgXCkGcXaSKeDKSbuxvRV7Q2xbX5lyTMSTXs3ek1KO5N7gaWYlAA
 RgkFBeuY8O1dk0qJrFtH
 =FJ21
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is the first sweep of mostly minor fixes. There's one security
  one: the read past the end of a buffer in qedf, and a panic fix for
  lpfc SLI-3 adapters, but the rest are a set of include and build
  dependency tidy ups and assorted other small fixes and updates"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: pmcraid: remove redundant check to see if request_size is less than zero
  scsi: lpfc: ensure els_wq is being checked before destroying it
  scsi: cxlflash: Select IRQ_POLL
  scsi: qedf: Avoid reading past end of buffer
  scsi: qedf: Cleanup the type of io_log->op
  scsi: lpfc: double lock typo in lpfc_ns_rsp()
  scsi: qedf: properly update arguments position in function call
  scsi: scsi_lib: Add #include <scsi/scsi_transport.h>
  scsi: MAINTAINERS: update OSD entries
  scsi: Skip deleted devices in __scsi_device_lookup
  scsi: lpfc: Fix panic on BFS configuration
  scsi: libfc: do not flood console with messages 'libfc: queue full ...'
2017-05-19 17:46:51 -07:00
Long Li
1bad6c4a57 scsi: zero per-cmd private driver data for each MQ I/O
In lower layer driver's (LLD) scsi_host_template, the driver may
optionally ask SCSI to allocate its private driver memory for each
command, by specifying cmd_size. This memory is allocated at the end of
scsi_cmnd by SCSI.  Later when SCSI queues a command, the LLD can use
scsi_cmd_priv to get to its private data.

Some LLD, e.g. hv_storvsc, doesn't clear its private data before use. In
this case, the LLD may get to stale or uninitialized data in its private
driver memory. This may result in unexpected driver and hardware
behavior.

Fix this problem by also zeroing the private driver memory before
passing them to LLD.

Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reviewed-by: KY Srinivasan <kys@microsoft.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
CC: <stable@vger.kernel.org> # 4.11+
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-18 21:44:49 -04:00
Varun Prakash
a351e40b6d scsi: csiostor: fix use after free in csio_hw_use_fwconfig()
mbp pointer is passed to csio_hw_validate_caps() so call mempool_free()
after calling csio_hw_validate_caps().

Signed-off-by: Varun Prakash <varun@chelsio.com>
Fixes: 541c571fa2 ("csiostor:Use firmware version from cxgb4/t4fw_version.h")
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-18 21:37:27 -04:00
Michał Potomski
463f620b12 scsi: ufs: Clean up some rpm/spm level SysFS nodes upon remove
When reloading module these two attributes aren't cleaned up properly
and they persist causing warnings when trying to load module
again. Additionally they are not recreated properly due to that.

Signed-off-by: Michał Potomski <michalx.potomski@intel.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-18 21:29:21 -04:00
James Smart
eeeb51d834 scsi: lpfc: fix build issue if NVME_FC_TARGET is not defined
fix build issue if NVME_FC_TARGET is not defined. noop the code.  The
code will never be invoked if target mode is not enabled.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-17 20:22:47 -04:00
Guilherme G. Piccoli
53cf29d3b1 scsi: lpfc: Fix NULL pointer dereference during PCI error recovery
Recent commit on patchset "lpfc updates for 11.2.0.14" fixed an issue
about dereferencing a NULL pointer on port reset. The specific commit,
named "lpfc: Fix system crash when port is reset.", is missing a check
against NULL pointer on lpfc_els_flush_cmd() though.

Since we destroy the queues on adapter resets, like in PCI error
recovery path, we need the validation present on this patch in order to
avoid a NULL pointer dereference when trying to flush commands of ELS
wq, after it has been destroyed (which would lead to a kernel oops).

Tested-by: Raphael Silva <raphasil@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-17 20:19:23 -04:00
James Smart
2848e1d503 scsi: lpfc: update version to 11.2.0.14
Change driver version to 11.2.0.14.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:25:09 -04:00
James Smart
ae9e28f36a scsi: lpfc: Add MDS Diagnostic support.
Added code to support Cisco MDS loopback diagnostic. The diagnostics run
various loopbacks including one which loops-back frame through the
driver.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:24:47 -04:00
James Smart
dc53a61852 scsi: lpfc: Fix NVMEI's handling of NVMET's PRLI response attributes
Code review of NVMEI's FC_PORT_ROLE_NVME_DISCOVERY looked wrong.

Discussions with storage architecture team clarified NVMEI's audit of
the PRLI response port roles.  Following up discussion with code review
showed a few minor corrections were required - especially in
anticipation of NVME auto discovery.

During PRLI, NVMEI should sent prli_init - which it it does.  NVMET
should send prli_tgt and prli_disc - which it does.  When NVMEI receives
a PRLI Response now, it audits the incoming target bits and stores the
attributes in the corresponding NDLP.  Later, when NVMEI registers the
NVME rport, it uses the stored ndlp attributes to set the rport
port_roles correctly.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:24:17 -04:00
James Smart
64eb4dcb14 scsi: lpfc: Cleanup entry_repost settings on SLI4 queues
Too many work items being processed in IRQ context take a lot of CPU
time and cause problems.

With a recent change, we get out of the ISR after hitting entry_repost
work items on a queue. However, the actual values for entry repost are
still high. EQ is 128 and CQ is 128, this could translate into
processing 128 * 128 (16384) work items under IRQ context.

Set entry_repost in the actual queue creation routine now.  Limit EQ
repost to 8 and CQ repost to 64 to further limit the amount of time
spent in the IRQ.

Fix fof IRQ routines as well.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:23:42 -04:00
James Smart
667a766252 scsi: lpfc: Fix debugfs root inode "lpfc" not getting deleted on driver unload.
When unloading and reloading the driver, the driver fails to recreate
the lpfc root inode in the debugfs tree.

The driver is incorrectly removing the lpfc root inode in
lpfc_debugfs_terminate in the first driver instance that unloads and
then sets the lpfc_debugfs_root global parameter to NULL.  When the
final driver instance unloads, the debugfs calls quietly ignore the
remove on a NULL pointer.  The bug is that the debugfs_remove call
returns void so the driver doesn't know to correctly set the global
parameter to NULL.

Base the debugfs_remove of the lpfc_debugfs_root parameter on
lpfc_debugfs_hba_count because this parameter tracks the fnX instance
tracked per driver instance.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:23:13 -04:00
James Smart
82820f0cf1 scsi: lpfc: Fix NVME I+T not registering NVME as a supported FC4 type
When the driver send the RPA command, it does not send supported FC4
Type NVME to the management server.

Encode NVME (type x28) in the AttribEntry in the RPA command.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:22:52 -04:00
James Smart
a8cf5dfeb4 scsi: lpfc: Added recovery logic for running out of NVMET IO context resources
Previous logic would just drop the IO.

Added logic to queue the IO to wait for an IO context resource from an
IO thats already in progress.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:22:22 -04:00
James Smart
6c621a2229 scsi: lpfc: Separate NVMET RQ buffer posting from IO resources SGL/iocbq/context
Currently IO resources are mapped 1 to 1 with RQ buffers posted

Added logic to separate RQE buffers from IO op resources
(sgl/iocbq/context). During initialization, the driver will determine
how many SGLs it will allocate for NVMET (based on what the firmware
reports) and associate a NVMET IOCBq and NVMET context structure with
each one.

Now that hdr/data buffers are immediately reposted back to the RQ, 512
RQEs for each MRQ is sufficient. Also, since NVMET data buffers are now
128 bytes, lpfc_nvmet_mrq_post is not necessary anymore as we will
always post the max (512) buffers per NVMET MRQ.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:21:47 -04:00
James Smart
3c603be979 scsi: lpfc: Separate NVMET data buffer pool fir ELS/CT.
Using 2048 byte buffer and onle 128 bytes is needed.

Create nee LFPC_NVMET_DATA_BUF_SIZE define to use for NVMET RQ/MRQs.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:21:17 -04:00
James Smart
7869da183a scsi: lpfc: Fix NMI watchdog assertions when running nvmet IOPS tests
After running IOPS test for 30 second we get kernel:NMI watchdog:
Watchdog detected hard LOCKUP on cpu 0

The driver is speend too much time in its ISR.

In ISR EQ and CQ processing routines, if we hit the entry_repost numbers
of EQE/CQEs just break out of the routine as opposed to hitting the
doorbell with NOARM and continue processing.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:20:41 -04:00
James Smart
3120046a97 scsi: lpfc: Fix NVMEI driver not decrementing counter causing bad rport state.
During driver boot, a latency in the NVMET driver side causes the
incoming NVMEI PRLI to get rejected by the NVMET driver.  When this
happens, the NVMEI driver runs out of PRLI retries.  Bouncing the link
does not fix the situation.

If the NVMEI driver decides, on PRLI completion failures, to retry the
PRLI, always decrement the fc4_prli_sent counter.  This allows the PRLI
completion to resolve to UNMAPPED when NVMET rejects the PRLI.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:19:10 -04:00
James Smart
61f3d4bf4f scsi: lpfc: Fix nvmet RQ resource needs for large block writes.
Large block writes to the nvme target were failing because the default
number of RQs posted was insufficient.

Expand the NVMET RQs to 2048 RQEs and ensure a minimum of 512 RQEs are
posted, no matter how many MRQs are configured.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:18:41 -04:00
James Smart
547077a44b scsi: lpfc: Adding additional stats counters for nvme.
More debug messages added for nvme statistics.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:18:20 -04:00
James Smart
0c9c6a7514 scsi: lpfc: Fix system crash when port is reset.
The driver panic when using the els_wq during port reset.

Check for NULL els_wq before dereferencing.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:17:54 -04:00