linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-22 12:54:09 +07:00

Author	SHA1	Message	Date
Raghava Aditya Renukunta	fe5237590b	scsi: aacraid: Skip schedule rescan in case of kdump There is a chance of the driver to be stuck in kdump if drives start acting up in kdump discovery process and the kernel decides to send eh resets, which would prompt rescan to be scheduled. Do not perform a rescan in kdump context, since we do not expect a hotplug event during kdump and all the devices are going to go away anyway. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	8a30e50b72	scsi: aacraid: Fix hang while scanning in eh recovery Add back the ability to scan for hotplug changes while eh was in progress. Schedule a rescan for a later time in the eh recovery code and wait for eh to complete in the rescan worker. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	a1367e4ade	scsi: aacraid: Reschedule host scan in case of failure If the driver fails to retrieve information from the fw (could happen when the fw is not fully in its senses), the driver does nothing and change is not processed correctly by the driver Schedule host rescan in case of failure. This is only for SAFW, since the information retrieval failure will happen on SAFW devices. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	8ebaa67fc2	scsi: aacraid: Use hotplug handling function in place of scsi_scan_host Driver uses scsi_scan_host to add new devices in the driver init path, which adds all the fw exposed devices. The drivers resorts to queue command checks to block out commands to _hidden_ devices. Use the hotplug handler code to add new devices during driver init and other areas, this is only for safw. For ARC scsi_scan_host will still apply. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	3395614e48	scsi: aacraid: Block concurrent hotplug event handling Currently driver will attempt to process hotplug events concurrently based on the FW interrupt. Protect safw update function with a scan mutex. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	6f44a22b2c	scsi: aacraid: Merge adapter setup with resolve luns The device hotplug events are processed only after retrieving the updated lun information from the fw. Does not make sense to keep them separate. Merge both the hotplug handling and safw adapter setup code into single function. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	3031c6565f	scsi: aacraid: Refactor resolve luns code and scsi functions Resolve luns checks the if a sdev is already present in the os to figure out if it needs to be removed. Internally the driver exposes HBA on bus 2 even though its bus 1 in the fw. Its mildly confusing. Refactor out the sdev lookup into its function to check if sdev has been added to the kernel or not. Add helper functions to add, remove and put devices based on their fw bus and target number. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	2290678fed	scsi: aacraid: Added macros to help loop through known buses and targets Added macros to loop through the MAX SUPPORTED Buses and Targets. This will make the code a bit easier to read. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:43 -05:00
Raghava Aditya Renukunta	f2d2cabadb	scsi: aacraid: Process hba and container hot plug events in single function The hotplug handler code is duplicated for hba handling and container handling. Merged function to handle hba and container hot plug events into the resolve luns functions. Added a bunch of helper functions to check the validity of a given target and to check if bus, target is container device. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	1d1fec53dc	scsi: aacraid: Merge func to get container information Merge aac_get_containers to setup target function, so that information about all the present devices can be retrieved in one shot. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	0bcb45fb20	scsi: aacraid: Add helper function to set queue depth Add helper function to set queue depth from information retrieved from the bmic phy structure. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	e2ee8c9480	scsi: aacraid: Save bmic phy information for each phy Save the bmic information for each phy, so that it can processed in target setup function. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	4b00022753	scsi: aacraid: Create helper functions to get lun info Created inline function to retrieve lun info for each device from the phy luns structure. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	a25b6ca1a9	scsi: aacraid: Move function around to match existing code Move the function to get phy luns information to the top of function to set target information Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	3edfb8b2e2	scsi: aacraid: Untangle targets setup from report phy luns Remove function call to process targets from the report phy luns function and make it a function in its own right. This will help understand the flow of the code. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	fc0fdd9abc	scsi: aacraid: Add target setup helper function Add helper function to setup targets devices and create the base for the upcoming patches Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	b5a475e944	scsi: aacraid: Refactor and rename to make mirror existing changes Rename variables and functions to make bmic identify, report phy luns to make them consistent across code internal existing code bases Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	5480aa1837	scsi: aacraid: Change phy luns function to use common bmic function Edit function that retrieves phy lun information to use common bmic function Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	8fb391827f	scsi: aacraid: Create bmic submission function from bmic identify safw command submission is duplicated across many functions. Move the safw submission code from bmic identify into its own function for common use Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	216ced02fa	scsi: aacraid: Move code to wait for IO completion to shutdown func Ideally driver needs to wait for IO to be submitted or responded to before shutdown. Move code to wait for IO completion into shutdown path Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:42 -05:00
Raghava Aditya Renukunta	97a4e8ac3f	scsi: aacraid: Refactor reset_host store function Refactored the reset_host store function to make consistent across code bases Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:41 -05:00
Raghava Aditya Renukunta	d1471eb0fa	scsi: aacraid: Allow reset_host sysfs var to recover Panicked Fw It is possible to restart the controller via the use of the reset_host sysfs variable. This does work for controllers that can no longer respond, since driver will attempt to send down a shutdown in this path. Check if the controller is able to receive commands before sending down a shutdown Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:41 -05:00
Raghava Aditya Renukunta	f3a2327725	scsi: aacraid: Fix ioctl reset hang Driver would hang when attempting to send reset from the ioctl interface, since it would wait to retrieve the ioctl mutex at send shutdown. Set adapter shutdown and unlock mutex before sending down reset request. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:41 -05:00
Raghava Aditya Renukunta	95900629fa	scsi: aacraid: Do not remove offlined devices As part of the recovery process, the drivers removes offline devices ( done by the kernel) and then tries to add them back in the rescan code. Removing the device is like taking a sledgehammer to a nail. Set the device as running if it is marked offline. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:41 -05:00
Raghava Aditya Renukunta	c5313ae8e4	scsi: aacraid: Fix hang in kdump Driver attempts to perform a device scan and device add after coming out of reset. At times when the kdump kernel loads and it tries to perform eh recovery, the device scan hangs since its commands are blocked because of the eh recovery. This should have shown up in normal eh recovery path (Should have been obvious) Remove the code that performs scanning.I can live without the rescanning support in the stable kernels but a hanging kdump/eh recovery needs to be fixed. Fixes: `a2d0321dd5` (scsi: aacraid: Reload offlined drives after controller reset) Cc: <stable@vger.kernel.org> Reported-by: Douglas Miller <dougmill@linux.vnet.ibm.com> Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Fixes: `a2d0321dd5` (scsi: aacraid: Reload offlined drives after controller reset) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:41 -05:00
Raghava Aditya Renukunta	dfb92a1f93	scsi: aacraid: Do not attempt abort when Fw panicked Check if the adapter can receive abort requests, before sending aborts Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:41 -05:00
Raghava Aditya Renukunta	f4e8708d31	scsi: aacraid: Fix udev inquiry race condition When udev requests for a devices inquiry string, it might create multiple threads causing a race condition on the shared inquiry resource string. Created a buffer with the string for each thread. Cc: <stable@vger.kernel.org> Fixes: `3bc8070fb7` ([SCSI] aacraid: SMC vendor identification) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-01-03 23:26:41 -05:00
Prasad B Munirathnam	5771cfffdf	scsi: aacraid: Fix I/O drop during reset "FIB_CONTEXT_FLAG_TIMEDOUT" flag is set in aac_eh_abort to indicate command timeout. Using the same flag in reset handler causes the command to time out and the I/Os were dropped. Define a new flag "FIB_CONTEXT_FLAG_EH_RESET" to make sure I/O is properly handled in eh_reset handler. [mkp: tweaked commit message] Signed-off-by: Prasad B Munirathnam <prasad.munirathnam@microsemi.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-12-14 22:34:28 -05:00
Colin Ian King	efbbbb1023	scsi: aacraid: remove unused variable managed_request_id Variable managed_request_id is being assigned but it is never read, hence it is redundant and can be removed. Cleans up clang warning: drivers/scsi/aacraid/linit.c:706:5: warning: Value stored to 'managed_request_id' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-12-04 20:32:53 -05:00
Arnd Bergmann	d18539754d	scsi: aacraid: address UBSAN warning regression As reported by Meelis Roos, my previous patch causes an incorrect calculation of the timeout, through an undefined signed integer overflow: [ 12.228155] UBSAN: Undefined behaviour in drivers/scsi/aacraid/commsup.c:2514:49 [ 12.228229] signed integer overflow: [ 12.228283] 964297611 * 250 cannot be represented in type 'long int' The problem is that doing a multiplication with HZ first and then dividing by USEC_PER_SEC worked correctly for 32-bit microseconds, but not for 32-bit nanoseconds, which would require up to 41 bits. This reworks the calculation to first convert the nanoseconds into jiffies, which should give us the same result as before and not overflow. Unfortunately I did not understand the exact intention of the algorithm, in particular the part where we add half a second, so it's possible that there is still a preexisting problem in this function. I added a comment that this would be handled more nicely using usleep_range(), which generally works better for waking up at a particular time than the current schedule_timeout() based implementation. I did not feel comfortable trying to implement that without being sure what the intent is here though. Fixes: `820f188659` ("scsi: aacraid: use timespec64 instead of timeval") Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-11-29 00:07:20 -05:00
Guilherme G. Piccoli	e4717292dd	scsi: aacraid: Prevent crash in case of free interrupt during scsi EH path As part of the scsi EH path, aacraid performs a reinitialization of the adapter, which encompass freeing resources and IRQs, NULLifying lots of pointers, and then initialize it all over again. We've identified a problem during the free IRQ portion of this path if CONFIG_DEBUG_SHIRQ is enabled on kernel config file. Happens that, in case this flag was set, right after free_irq() effectively clears the interrupt, it checks if it was requested as IRQF_SHARED. In positive case, it performs another call to the IRQ handler on driver. Problem is: since aacraid currently free some resources before freeing the IRQ, once free_irq() path calls the handler again (due to CONFIG_DEBUG_SHIRQ), aacraid crashes due to NULL pointer dereference with the following trace: aac_src_intr_message+0xf8/0x740 [aacraid] __free_irq+0x33c/0x4a0 free_irq+0x78/0xb0 aac_free_irq+0x13c/0x150 [aacraid] aac_reset_adapter+0x2e8/0x970 [aacraid] aac_eh_reset+0x3a8/0x5d0 [aacraid] scsi_try_host_reset+0x74/0x180 scsi_eh_ready_devs+0xc70/0x1510 scsi_error_handler+0x624/0xa20 This patch prevents the crash by changing the order of the deinitialization in this path of aacraid: first we clear the IRQ, then we free other resources. No functional change intended. Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-11-20 22:33:09 -05:00
Guilherme G. Piccoli	d9b6d85a38	scsi: aacraid: Perform initialization reset only once Currently the driver accepts two ways of requesting an initialization reset on the adapter: by passing aac_reset_devices module parameter, or the generic kernel parameter reset_devices. It's working as intended...but if we end up reaching a scsi hang and the scsi EH mechanism takes place, aacraid performs resets as part of the scsi error recovery procedure. These EH routines might reinitialize the device, and if we have provided some of the reset parameters in the kernel command-line, we again perform an "initialization" reset. So, to avoid this duplication of resets in case of scsi EH path, this patch adds a field to aac_dev struct to keep per-adapter track of the init reset request - once it's done, we set it to false and don't proactively reset anymore in case of reinitializations. Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-11-20 22:32:00 -05:00
Guilherme G. Piccoli	bd257b2f3b	scsi: aacraid: Check for PCI state of device in a generic way Commit `16ae9dd35d` ("scsi: aacraid: Fix for excessive prints on EEH") introduced checks about the state of device before any PCI operations in the driver. Basically, this prevents it to perform PCI accesses when device is in the process of recover from a PCI error. In PowerPC, such mechanism is called EEH, and the aforementioned commit introduced checks that are based on EEH-specific primitives for that. The potential problems with this approach are three: first, these checks are "locked" to powerpc only - another archs could have error recovery methods too, like AER in Intel. Also, the powerpc primitives perform expensive FW accesses to validate the precise PCI state of a device. Finally, code becomes more complicated and needs ifdef validation based on arch config being set. So, this patch makes use of generic PCI state checks, which are lightweight and non-dependent of arch configs - also, it makes the code cleaner. Fixes: `16ae9dd35d` ("scsi: aacraid: Fix for excessive prints on EEH") Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-11-20 22:29:10 -05:00
Linus Torvalds	670ffccb2f	SCSI misc on 20171114 This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas, megaraid_sas, pm80xx, mpt3sas, be2iscsi, hpsa. and a host of minor updates. There's no major behaviour change or additions to the core in all of this, so the potential for regressions should be small (biggest potential being in the scsi error handler changes). Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJaCxtCAAoJEAVr7HOZEZN4d9EQAI+OHP6ss6zjKKC21c9jNPcH NhLrNv37gHg/LA2VXeUEL9RGUjCGLIUrI4HsrxzkFAMLKP4TkshMs8/2RvczY+Sa VpayPqVybEKLIS6ipQyM1SLIQff2nvtDVcN/T+8z1lkk45TrbA6ZGuwUwd2aJyEA 2V2wtg51ObnL0Nr9QPPll0JrtL1AnCZyRlu9XrwTZuuSBZwk93opIuuvbZm/3dVg Ir4GSS4Y+PuHIfu4cxqdsPMdzRdY9I2me1YiE4jeFSn1/VTAjL4HBz7fO9eITT42 VhXSpDz1XvFsa9dJ0ubkqoALpJzCfOcBw+EuGvSydLEvOBoEVwMccdfaD9lT1zc5 L9e1Z5qqJoq7hTA6xTXCYfWG73I9HYvljtmc8yudKHhADOdnSTUXhaO6uBF0RNqD OxPSA1RZwRx3c6lDOcK6BTtvLAkTEuYKdrWSKJi0w+QXJAyQ6etqbmsKpmPdRim7 Z4ZSpJFro2gyo9gcdJO0ykTG+z3U7Z/ay1sNgnuprsv+eU/QjUdlAPl18o79EkRf H54zZggZ4wC6q/cFVVt4Vx+V+oqIeu38s7NDXS9UltLoTZPm2EzDW6pXd/38Z4Tf a1oBAUET8kYLC90P8sVZxUIHZjITlpgDbyE2Lq00PMYXhk8S4IxF0aMN5RvVqzUv +7N2HrHkSSgG1nhw1t+E =3O85 -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas, megaraid_sas, pm80xx, mpt3sas, be2iscsi, hpsa. and a host of minor updates. There's no major behaviour change or additions to the core in all of this, so the potential for regressions should be small (biggest potential being in the scsi error handler changes)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits) scsi: lpfc: Fix hard lock up NMI in els timeout handling. scsi: mpt3sas: remove a stray KERN_INFO scsi: mpt3sas: cleanup _scsih_pcie_enumeration_event() scsi: aacraid: use timespec64 instead of timeval scsi: scsi_transport_fc: add 64GBIT and 128GBIT port speed definitions scsi: qla2xxx: Suppress a kernel complaint in qla_init_base_qpair() scsi: mpt3sas: fix dma_addr_t casts scsi: be2iscsi: Use kasprintf scsi: storvsc: Avoid excessive host scan on controller change scsi: lpfc: fix kzalloc-simple.cocci warnings scsi: mpt3sas: Update mpt3sas driver version. scsi: mpt3sas: Fix sparse warnings scsi: mpt3sas: Fix nvme drives checking for tlr. scsi: mpt3sas: NVMe drive support for BTDHMAPPING ioctl command and log info scsi: mpt3sas: Add-Task-management-debug-info-for-NVMe-drives. scsi: mpt3sas: scan and add nvme device after controller reset scsi: mpt3sas: Set NVMe device queue depth as 128 scsi: mpt3sas: Handle NVMe PCIe device related events generated from firmware. scsi: mpt3sas: API's to remove nvme drive from sml scsi: mpt3sas: API 's to support NVMe drive addition to SML ...	2017-11-14 16:23:44 -08:00
Arnd Bergmann	820f188659	scsi: aacraid: use timespec64 instead of timeval aacraid passes the current time to the firmware in one of two ways, either as year/month/day/... or as 32-bit unsigned seconds. The first one is broken on 32-bit architectures as it cannot go past year 2038. Using timespec64 here makes it behave properly on both 32-bit and 64-bit architectures, and avoids relying on signed integer overflow to pass times into the second interface. The interface used in aac_send_hosttime() however is still problematic in year 2106 when 32-bit seconds overflow. Hopefully we don't have to worry about aacraid by that time. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-11-08 18:08:01 -05:00
Raghava Aditya Renukunta	45348de2c8	scsi: aacraid: Fix controller initialization failure This is a fix to an issue where the driver sends its periodic WELLNESS command to the controller after the driver shut it down.This causes the controller to crash. The window where this can happen is small, but it can be hit at around 4 hours of constant resets. Cc: <stable@vger.kernel.org> Fixes: `fbd185986e` (aacraid: Fix AIF triggered IOP_RESET) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-10-16 23:17:52 -04:00
Guilherme G. Piccoli	d1b490939d	scsi: aacraid: Add a small delay after IOP reset Commit `0e9973ed33` ("scsi: aacraid: Add periodic checks to see IOP reset status") changed the way driver checks if a reset succeeded. Now, after an IOP reset, aacraid immediately start polling a register to verify the reset is complete. This behavior cause regressions on the reset path in PowerPC (at least). Since the delay after the IOP reset was removed by the aforementioned patch, the fact driver just starts to read a register instantly after the reset was issued (by writing in another register) "corrupts" the reset procedure, which ends up failing all the time. The issue highly impacted kdump on PowerPC, since on kdump path we proactively issue a reset in adapter (through the reset_devices kernel parameter). This patch (re-)adds a delay right after IOP reset is issued. Empirically we measured that 3 seconds is enough, but for safety reasons we delay for 5s (and since it was 30s before, 5s is still a small amount). For reference, without this patch we observe the following messages on kdump kernel boot process: [ 76.294] aacraid 0003:01:00.0: IOP reset failed [ 76.294] aacraid 0003:01:00.0: ARC Reset attempt failed [ 86.524] aacraid 0003:01:00.0: adapter kernel panic'd ff. [ 86.524] aacraid 0003:01:00.0: Controller reset type is 3 [ 86.524] aacraid 0003:01:00.0: Issuing IOP reset [146.534] aacraid 0003:01:00.0: IOP reset failed [146.534] aacraid 0003:01:00.0: ARC Reset attempt failed Fixes: `0e9973ed33` ("scsi: aacraid: Add periodic checks to see IOP reset status") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Acked-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-09-27 21:42:14 -04:00
Nikola Pajkovsky	4cb433e856	scsi: aacraid: error: testing array offset 'bus' after use Fix possible indexing array of bound for &aac->hba_map[bus][cid], where bus and cid boundary check happens later. Fixes: `0d643ff3c3` ("scsi: aacraid: use aac_tmf_callback for reset fib") Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-09-15 21:46:07 -04:00
Dave Carroll	6c92f7dbf2	scsi: aacraid: Fix 2T+ drives on SmartIOC-2000 The logic for supporting large drives was previously tied to 4Kn support for SmartIOC-2000. As SmartIOC-2000 does not support volumes using 4Kn drives, use the intended option flag AAC_OPT_NEW_COMM_64 to determine support for volumes greater than 2T. Cc: <stable@vger.kernel.org> Signed-off-by: Dave Carroll <david.carroll@microsemi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-09-15 15:49:43 -04:00
James Bottomley	2441500a41	Merge branch 'fixes' into misc	2017-09-07 12:12:43 -07:00
Nikola Pajkovsky	9667624698	scsi: aacraid: report -ENOMEM to upper layer from aac_convert_sgraw2() aac_convert_sgraw2() kmalloc memory and return -1 on error, which should be -ENOMEM. However, nobody is checking return value, so with this change, -ENOMEM is propagated to upper layer. Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-30 22:02:23 -04:00
Nikola Pajkovsky	a226032398	scsi: aacraid: get rid of one level of indentation unsigned long byte_count = 0; nseg = scsi_dma_map(scsicmd); if (nseg < 0) return nseg; if (nseg) { ... } return byte_count; is equal to unsigned long byte_count = 0; nseg = scsi_dma_map(scsicmd); if (nseg <= 0) return nseg; ... return byte_count; No other code has changed. [mkp: fix checkpatch complaints] Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-30 22:01:29 -04:00
Nikola Pajkovsky	913e00a5a0	scsi: aacraid: fix indentation errors fix stupid indent error, no rocket science here. Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-30 21:56:00 -04:00
Brian King	1ae948fa4f	scsi: aacraid: Fix command send race condition This fixes a potential race condition observed on Power systems. Several places throughout the aacraid driver call aac_fib_send or similar to send a command to the aacraid adapter, then check the return code to determine if the command was actually sent to the adapter, then update the phase field in the scsi command scratch pad area to track that the firmware now owns this command. However, there is nothing that ensures that by the time the aac_fib_send function returns and we go to write to the scsi command, that the command hasn't already completed and the scsi command has been freed. This was causing random crashes in the TCP stack which was tracked down to be caused by memory that had been a struct request + scsi_cmnd being now used for an skbuff. Memory poisoning was enabled in the kernel to debug this which showed that the last owner of the memory that had been freed was aacraid and that it was a struct request. The memory that was corrupted was the exact data pattern of AAC_OWNER_FIRMWARE and it was at the same offset that aacraid writes, which is scsicmd->SCp.phase. The patch below resolves this issue. Cc: <stable@vger.kernel.org> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Tested-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-29 21:34:04 -04:00
Raghava Aditya Renukunta	c802673249	scsi: aacraid: Fix out of bounds in aac_get_name_resp We terminate the aac_get_name_resp on a byte that is outside the bounds of the structure. Extend the return response by one byte to remove the out of bounds reference. Fixes: `b836439faf` ("aacraid: 4KB sector support") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-16 20:01:31 -04:00
Hannes Reinecke	e7e99d60ce	scsi: aacraid: complete all commands during bus reset When issuing a bus reset we should complete all commands, not just the command triggering the reset. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-07 14:04:00 -04:00
Hannes Reinecke	c323eab7a6	scsi: aacraid: add fib flag to mark scsi command callback To correctly identify which fib has a scsi command callback this patch implements a flag FIB_CONTEXT_FLAG_SCSI_CMD. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-07 14:04:00 -04:00
Hannes Reinecke	b60710ec7d	scsi: aacraid: enable sending of TMFs from aac_hba_send() aac_hba_send() will return FAILED for any non-SCSI command requests, failing any TMFs. This patch updates the check to allow TMFs. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-07 14:04:00 -04:00
Hannes Reinecke	0d643ff3c3	scsi: aacraid: use aac_tmf_callback for reset fib When sending a reset fib we shouldn't rely on the scsi command, but rather set the TMF status in the map_info->reset_state variable. That allows us to send a TMF independent on a scsi command. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-07 14:04:00 -04:00
Hannes Reinecke	f799319e07	scsi: aacraid: split off device, target, and bus reset Split off device, target, and bus reset functionality into individual functions. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-07 14:04:00 -04:00
Hannes Reinecke	25188423d4	scsi: aacraid: split off host reset Split off the host reset parts of aac_eh_reset() into a separate host reset function. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-07 14:04:00 -04:00
Hannes Reinecke	5115c8c01d	scsi: aacraid: split off functions to generate reset FIB Split off reset FIB generation into separate functions. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-08-07 14:04:00 -04:00
Dan Carpenter	e6fd916a62	scsi: aacraid: reading out of bounds "qd.id" comes directly from the copy_from_user() on the line before so we should verify that it's within bounds. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-07-26 22:09:21 -04:00
Seth Forshee	342ffc2669	scsi: aacraid: Don't copy uninitialized stack memory to userspace Both aac_send_raw_srb() and aac_get_hba_info() may copy stack allocated structs to userspace without initializing all members of these structs. Clear out this memory to prevent information leaks. Fixes: `423400e64d` ("scsi: aacraid: Include HBA direct interface") Fixes: `c799d519bf` ("scsi: aacraid: Retrieve HBA host information ioctl") Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-26 15:01:03 -04:00
Colin Ian King	5cc973f09e	scsi: aacraid: fix leak of data from stack back to userspace The fields sense_data_size and sense_data are unitialized garbage from the stack and are being copied back to userspace. Fix this leak of stack information by ensuring they are zero'd. Detected by CoverityScan, CID#1435473 ("Uninitialized scalar variable") Fixes: `423400e64d` ("scsi: aacraid: Include HBA direct interface") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-26 12:32:15 -04:00
Raghava Aditya Renukunta	216e80ff78	scsi: aacraid: Update driver version to 50834 Update the driver version to 50834 Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:48:00 -04:00
Raghava Aditya Renukunta	395e5df79a	scsi: aacraid: Remove reference to Series-9 Remove reference to Series-9 HBA and created arc ctrl check function. Signed-off-by: Prasad B Munirathnam <prasad.munirathnam@microsemi.com> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:48:00 -04:00
Raghava Aditya Renukunta	4a76be0dc5	scsi: aacraid: Add reset debugging statements Added info and error messages in controller reset function to log information about the status of the IOP/SOFT reset. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:48:00 -04:00
Raghava Aditya Renukunta	786e898c86	scsi: aacraid: Enable ctrl reset for both hba and arc Make sure that IOP and SOFT reset are enabled for both for both arc and hba1000 controllers. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:48:00 -04:00
Raghava Aditya Renukunta	8c41b9b798	scsi: aacraid: Make sure ioctl returns on controller reset Made sure that ioctl commands return in case of a controller reset. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:48:00 -04:00
Raghava Aditya Renukunta	9473ddb2b0	scsi: aacraid: Use correct function to get ctrl health The command thread checks the ctrl health periodically before sending updates to the controller. The function that it uses is aac_check_health which does more than get the health status. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:48:00 -04:00
Raghava Aditya Renukunta	5aa6073252	scsi: aacraid: Rework aac_src_restart Removed switch case and replaced with if mask checks. Moved KERNEL_PANIC check to when bled is less than 0. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:48:00 -04:00
Raghava Aditya Renukunta	77cb6d5ea6	scsi: aacraid: Rework SOFT reset code Now the driver issues a soft reset and waits for the controller to be up and running by periodically checking on the status of the controller health registers. Also prevents ARC adapters from issuing soft reset if IOP resets failed. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:48:00 -04:00
Raghava Aditya Renukunta	0e9973ed33	scsi: aacraid: Add periodic checks to see IOP reset status Added function that waits with a timeout for the ctrl to be up and running after triggering an IOP reset. Also removed 30 sec sleep as it is not needed. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Raghava Aditya Renukunta	80c7d8a5cf	scsi: aacraid: Rework IOP reset Reworked IOP reset to remove unneeded variable and created a helper function to notify fw of an imminent IOP reset. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Raghava Aditya Renukunta	6b24d42588	scsi: aacraid: Using single reset mask for IOP reset The driver can now trigger IOP reset with a single reset mask. Removed code that retrieves a reset_mask from the firmware. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Raghava Aditya Renukunta	144ecd41f0	scsi: aacraid: Print ctrl status before eh reset Log the status of the controller before issuing a reset. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Raghava Aditya Renukunta	895dc759cf	scsi: aacraid: Log count info of scsi cmds before reset Log the location of the scsi cmds before triggering a reset. This information is useful for debugging. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Raghava Aditya Renukunta	2a4a62c03f	scsi: aacraid: Change wait time for fib completion Change the completion wait time for the fibs in the reset and abort callback from 2 minutes to 15 seconds. 2 minutes is too long for waiting for completion. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Raghava Aditya Renukunta	fed820073f	scsi: aacraid: Remove reset support from check_health Check health does not need to reset the ctrl but just return the controller health status. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Raghava Aditya Renukunta	58eaffe54b	scsi: aacraid: Set correct Queue Depth for HBA1000 RAW disks The default queue depth for non NATIVE RAW disks is calculated from the number of fibs and number of disks or a max of 256. This causes poor disk IO performance. The fix is to set default qd based on the type of disks (SATA -32 and SAS -64) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Raghava Aditya Renukunta	d58129c96b	scsi: aacraid: Added 32 and 64 queue depth for arc natives The qd for ARC Native disks is calculated by dividing the max IO 1024 by the number of disks or 256 which ever is lower. This causes poor disk IO performance. The fix is set the qd based on the type of disk (SAS - 64 and SATA - 32). Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Raghava Aditya Renukunta	8105d39d0e	scsi: aacraid: Fix DMAR issues with iommu=pt The driver changed the DMA consistent map after consistent memory was allocated, this invalidated the IOMMU identity mapping. The fix was to make sure that we set the DMA consistent mask setting once depending on the controller card. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Raghava Aditya Renukunta	c831a4a086	scsi: aacraid: Remove __GFP_DMA for raw srb memory The raw srb commands do not requires memory that in the ZONE_DMA memory space. For 32bit srb commands use GFP_DMA32 to limit the memory to 32bit memory range (4GB). Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-06-12 20:47:59 -04:00
Linus Torvalds	8d5e72dfdf	SCSI misc on 20170503 This update includes the usual round of major driver updates (hisi_sas, ufs, fnic, cxlflash, be2iscsi, ipr, stex). There's also the usual amount of cosmetic and spelling stuff. Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJZClQkAAoJEAVr7HOZEZN4OmkP/j/JJx2ImGzTgil5S8yeSWPY 5Gqb8IK9rCQ+OJgCZYCz3JsLZZnwY4ODZ9tC1lO/3he6VfjIhcEs2/eXbTnEfsZx D3EwWEVR3wYBNZN0d4hQoudVbdCf6UuvsUvM1hDFO7by10qFEs0DqsufccpDlpG/ us96BWf7PgiNzHYSvZIlmsfEDzNDRRg7Dm1NuLOQvXw56zFGsrysCO6Tqg7/ScJm Unz/VlEe1DE7zE9QotsKNCht7xHkmn1vfuva1wqG2wMp7EHf0rKnavRYrWUrxiEy 2ig6GpR7mIHmVHS8PAMNhyS6iNxGQ3e50sAvZdqDlq42P73AEwbrOo5YhgsTJxWT vCpRAzSuHwPOPY3W2Aa1yJ10iOpoPKxXs2xSZuzpcz8XJ3RjHy+l90Y0VT4Jrvzv +dSY1cynshFccZmw2HQanlt1Ly9G3U8xmx8KIbnsIPCdSIQaQQD27H+Ip0YZ0fKt aLmMcQzffma3UP/LPmRAQ45bwx8rLi9M3DWbWOGmSkIRY3etPCXqNuDcC6h5p9TF 4W74oVcELTql/u8ATZNSbdHBsWAg3GATIkAgdqwLTk/CU/0OgGY8epILr3EM2bc6 vVbglwP9DiyVOikTLhVNJdZA97qHjZ1WXNo03eefFTBfPDcUlkZw4j2gufGuNFh2 5vA4C/aSl9uxaLInr3aC =kj7u -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This update includes the usual round of major driver updates (hisi_sas, ufs, fnic, cxlflash, be2iscsi, ipr, stex). There's also the usual amount of cosmetic and spelling stuff" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (155 commits) scsi: qla4xxx: fix spelling mistake: "Tempalate" -> "Template" scsi: stex: make S6flag static scsi: mac_esp: fix to pass correct device identity to free_irq() scsi: aacraid: pci_alloc_consistent() failures on ARM64 scsi: ufs: make ufshcd_get_lists_status() register operation obvious scsi: ufs: use MASK_EE_STATUS scsi: mac_esp: Replace bogus memory barrier with spinlock scsi: fcoe: make fcoe_e_d_tov and fcoe_r_a_tov static scsi: sd_zbc: Do not write lock zones for reset scsi: sd_zbc: Remove superfluous assignments scsi: sd: sd_zbc: Rename sd_zbc_setup_write_cmnd scsi: Improve scsi_get_sense_info_fld scsi: sd: Cleanup sd_done sense data handling scsi: sd: Improve sd_completed_bytes scsi: sd: Fix function descriptions scsi: mpt3sas: remove redundant wmb scsi: mpt: Move scsi_remove_host() out of mptscsih_remove_host() scsi: sg: reset 'res_in_use' after unlinking reserved array scsi: mvumi: remove code handling zero scsi_sg_count(scmd) case scsi: fusion: fix spelling mistake: "Persistancy" -> "Persistency" ...	2017-05-04 12:19:44 -07:00
Mahesh Rajashekhara	f481973d5e	scsi: aacraid: pci_alloc_consistent() failures on ARM64 There were pci_alloc_consistent() failures on ARM64 platform. Use dma_alloc_coherent() with GFP_KERNEL flag DMA memory allocations. Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com> [hch: tweaked indentation, removed memsets] Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-04-26 18:28:06 -04:00
James Bottomley	0e1bfea999	Merge remote-tracking branch 'mkp-scsi/4.11/scsi-fixes' into fixes	2017-04-12 07:29:17 -07:00
Guilherme G. Piccoli	911e572e98	scsi: aacraid: fix PCI error recovery path During a PCI error recovery, if aac_check_health() is not aware that a PCI error happened and we have an offline PCI channel, it might trigger some errors (like NULL pointer dereference) and inhibit the error recovery process to complete. This patch makes the health check procedure aware of PCI channel issues, and in case of error recovery process, the function aac_adapter_check_health() returns -1 and let the recovery process to complete successfully. This patch was tested on upstream kernel v4.11-rc5 in PowerPC ppc64le architecture with adapter 9005:028d (VID:DID) - the error recovery procedure was able to recover fine. Fixes: `5c63f7f710` ("aacraid: Added EEH support") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-04-11 20:45:59 -04:00
James Bottomley	0917ac4f53	Merge remote-tracking branch 'mkp-scsi/4.11/scsi-fixes' into fixes	2017-03-29 10:10:30 -04:00
Raghava Aditya Renukunta	e498520ede	scsi: aacraid: Fix potential null access Currently, command threads fails to return ioctls commands for older controller versions, since it returns when all the fibs have been allocated. Another issue is even all the fibs have not been allocated, the correct allocated fibs is not updated nor freed. Fixes: `113156bcea` (scsi: aacraid: Reworked aac_command_thread) Reported-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-03-15 19:00:36 -04:00
James Bottomley	e2a3a67302	Merge remote-tracking branch 'mkp-scsi/fixes' into fixes	2017-03-07 15:13:02 -08:00
Raghava Aditya Renukunta	934767c56b	scsi: aacraid: Fix typo in blink status The return status of the adapter check on KERNEL_PANIC is supposed to be the upper 16 bits of the OMR status register. Fixes: `c421530bf8` (scsi: aacraid: Reorder Adpater status check) Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-03-06 22:23:27 -05:00
Linus Torvalds	a3b4924b02	SCSI misc on 20170303 This is the set of stuff that didn't quite make the initial pull and a set of fixes for stuff which did. The new stuff is basically lpfc (nvme), qedi and aacraid. The fixes cover a lot of previously submitted stuff, the most important of which probably covers some of the failing irq vectors allocation and other fallout from having the SCSI command allocated as part of the block allocation functions. Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJYug/3AAoJEAVr7HOZEZN4bRwP/RvAW/NWlvs812gUeHNahUHx HF3EtY5YHneZev1L4fa1Myd8cNcB3gic53e1WvQhN5f+LczeI7iXHH8xIQK/B4GT OI5W7ZCGlhIamgsU8tUFGrtIOM6N/LIMZPMsaE62dcev0jC0Db4p/5yv5bUg1yuL JOLWqIaTYTx7b5FwR2cXKv7KGYpXonvcoVUDV6s8aMYToJhJiQ6JtWmjaAA+/OJ7 HTcqNX/KSWQBG3ERjE3A/rGBFxPR9KPq8kpqVJlw39KEN9tMjXb0QAGvAeauNye2 xaIjV41tXH1Ys3aoh6ZU+TpzN3HLlH/gegRe8671fUwqu/RnSzFC8Eidoddzgqne z5BF0Dy36FA5ol2HTFEwS5WM1gRM+z6YGnbhpE/XfyTL2sLHRynSXKxb02IdjRNi gCdV89AL8xdfPPFJjsx4hGWUJhalW/RwuPayAupePKGQHePabHa19McI2ssg0WBg OC6uuEk7vT95xuTZVJlrwXTJPnc9dxWJwzh21oEnvMfWM0VMW06EKT4AcX8FOsjL RswiYp9wl8JDmfNxyJnQJj0+QXeIE/gh35vBQCGsLVKY4+xzbi3yZfb1sZB0XOum yLK2LFQ+Ltk1TaPPSZsgXQZfodot0dFJPTnWr+whOfY1kepNlFRwQHltXnCWzgRs QG//WHsk0u1Mj2gehwmQ =e+6v -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull more SCSI updates from James Bottomley: "This is the set of stuff that didn't quite make the initial pull and a set of fixes for stuff which did. The new stuff is basically lpfc (nvme), qedi and aacraid. The fixes cover a lot of previously submitted stuff, the most important of which probably covers some of the failing irq vectors allocation and other fallout from having the SCSI command allocated as part of the block allocation functions" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (59 commits) scsi: qedi: Fix memory leak in tmf response processing. scsi: aacraid: remove redundant zero check on ret scsi: lpfc: use proper format string for dma_addr_t scsi: lpfc: use div_u64 for 64-bit division scsi: mac_scsi: Fix MAC_SCSI=m option when SCSI=m scsi: cciss: correct check map error. scsi: qla2xxx: fix spelling mistake: "seperator" -> "separator" scsi: aacraid: Fixed expander hotplug for SMART family scsi: mpt3sas: switch to pci_alloc_irq_vectors scsi: qedf: fixup compilation warning about atomic_t usage scsi: remove scsi_execute_req_flags scsi: merge __scsi_execute into scsi_execute scsi: simplify scsi_execute_req_flags scsi: make the sense header argument to scsi_test_unit_ready mandatory scsi: sd: improve TUR handling in sd_check_events scsi: always zero sshdr in scsi_normalize_sense scsi: scsi_dh_emc: return success in clariion_std_inquiry() scsi: fix memory leak of sdpk on when gd fails to allocate scsi: sd: make sd_devt_release() static scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework. ...	2017-03-03 21:36:56 -08:00
Colin Ian King	fbdab3e7fd	scsi: aacraid: remove redundant zero check on ret The check for ret being zero is redundant as a few statements earlier we break out of the while loop if ret is non-zero. Thus we can remove the zero check and also the dead-code non-zero case too. Detected by CoverityScan, CID#1411632 ("Logically Dead Code") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-27 22:19:43 -05:00
Masahiro Yamada	608595ed9b	scripts/spelling.txt: add "therfore" pattern and fix typo instances Fix typos and add the following to the scripts/spelling.txt: therfore\|\|therefore Besides, tidy up comment blocks for 80-col wrapping. Link: http://lkml.kernel.org/r/1481573103-11329-31-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-27 18:43:47 -08:00
Raghava Aditya Renukunta	a56e574067	scsi: aacraid: Fixed expander hotplug for SMART family Current driver Hotplug processing code skips over Enclosure channel, therefore any addition/removal of expander enclosure is not processed. Additionally device addition code relies on older device type, which prevents the hotplug of adapter expanders. Fixed by removing code that skips over Enclosure channels and using the latest device type for addition or removal or enclosure expanders. Fixes: `6223a39fe6` (scsi: aacraid: Added support for hotplug) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-23 17:02:15 -05:00
Raghava Aditya Renukunta	0662cc968a	scsi: aacraid: Update driver version Updated driver version to 50792 Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	d844752e18	scsi: aacraid: Fix a potential spinlock double unlock bug The driver does not unlock the reply queue spin lock after handling SMART adapter events. Instead it might attempt to unlock an already unlocked spin lock. Fixed by making sure the driver locks the spin lock before freeing it. Thank you dan for finding this issue out. Fixes: `6223a39fe6` (scsi: aacraid: Added support for hotplug) Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	09624645e1	scsi: aacraid: Save adapter fib log before an IOP reset Currently the adapter firmware does not save outstanding I/O's log information when an IOP reset is triggered. This is problematic when trying to root cause and debug issues. Fixed by adding sync command to trigger I/O log file save in the adapter firmware before issuing an IOP reset. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	c421530bf8	scsi: aacraid: Reorder Adapter status check The driver currently checks the SELF_TEST_FAILED first and then KERNEL_PANIC next. Under error conditions(boot code failure) both SELF_TEST_FAILED and KERNEL_PANIC can be set at the same time. The driver has the capability to reset the controller on an KERNEL_PANIC, but not on SELF_TEST_FAILED. Fixed by first checking KERNEL_PANIC and then the others. Cc: stable@vger.kernel.org Fixes: `e8b12f0fb8` ([SCSI] aacraid: Add new code for PMC-Sierra's SRC base controller family) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	146aa1786d	scsi: aacraid: Skip IOP reset on controller panic(SMART Family) When the SMART family of controller panic (KERNEL_PANIC) , they do not honor IOP resets. So better to skip it and directly perform a IWBR reset. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	11da1b7c48	scsi: aacraid: Decrease adapter health check interval Currently driver checks the health status of the adapter once every 24 hours. When that happens the driver becomes dependent on the kernel to figure out if the adapter is misbehaving. This might take some time (when the adapter is idle). The driver currently has support to restart/recover the controller when it fails, and decreasing the time interval will help. Fixed by decreasing check interval from 24 hours to 1 minute Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	a2d0321dd5	scsi: aacraid: Reload offlined drives after controller reset During the IOP reset stress testing, it was found that the drives can be marked offline when the adapter controller crashes and IO's are running in parallel. When the controller does come back from the reset, the drive that is marked offline is not exposed. Fixed by removing and adding drives that are marked offline. In addition invoke a scsi host bus rescan to capture any additional configuration changes. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	849ac6a591	scsi: aacraid: Skip wellness sync on controller failure aac_command_thread checks on the health of controller periodically, using aac_check_health. If the status is an error state KERNEL_PANIC or anything else. The driver will attempt to restart the adapter, but the response is not checked in aac_command_thread. This allows the periodic sync to go thru and lead the driver to a hung state. Fixed by terminating the periodic loop(intended per original design), if the controller is not restored to a healthy state. Cc: stable@vger.kernel.org Fixes: `3d77d84044` (scsi: aacraid: Added support for periodic wellness sync) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	a7e2c64284	scsi: aacraid: Fix sync fibs time out on controller reset After controller shutdown, all sync fibs time out due to not knowing about the switch to INT-x mode Fixed by replacing aac_src_access_devreg() to aac_set_intx_mode() call. Cc: stable@vger.kernel.org Fixes: `495c021767` (aacraid: MSI-x support) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	30202e067a	scsi: aacraid: Added sysfs for driver version Added support to retrieve driver version from a new sysfs variable called driver_version. It makes it easier for the user to figure out the driver version that is currently running. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:42 -05:00
Raghava Aditya Renukunta	1bff5abca6	scsi: aacraid: Fix memory leak in fib init path aac_fib_map_free frees misaligned fib dma memory, additionally it does not free up the whole memory. Fixed by changing the code to free up the correct and full memory allocation. Cc: stable@vger.kernel.org Fixes: `e8b12f0fb8` ([SCSI] aacraid: Add new code for PMC-Sierra's SRC based controller family) Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:41 -05:00
Raghava Aditya Renukunta	a0c6143e95	scsi: aacraid: Prevent E3 lockup when deleting units Arrconf management utility at times sends fibs with AdapterProcessed set in its fibs. This causes the controller to panic and lockup. Fixed by failing the commands that have AdapterProcessed set in its flag. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:41 -05:00
Raghava Aditya Renukunta	16ae9dd35d	scsi: aacraid: Fix for excessive prints on EEH This issue showed up on a kdump debug(single CPU on powerkvm), when EEH errors rendered the adapter unusable. The driver correctly detected the issue and attempted to restart the controller, in doing so the driver attempted to read the status registers of the controller. This triggered additional eeh errors which continued for a good 6 minutes. Fixed by returning without waiting when EEH error is reported. Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:41 -05:00
Raghava Aditya Renukunta	f3ef4a74dc	scsi: aacraid: Use correct channel number for raw srb The channel being used for raw srb commands is retrieved from the utility sent fibs and is converted into physical channel id. The driver does not need to to do this since the management utility sends the correct channel id in the first place and in addition the driver sets inaccurate information in the cmd sent to the firmware and gets an invalid response. Fixed by using channel id from srb command. Cc: stable@vger.kernel.org Fixes: `423400e64d` ("scsi: aacraid: Include HBA direct interface") Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com> Reviewed-by: David Carroll <David.Carroll@microsemi.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-02-22 18:41:41 -05:00

1 2 3 4 5 ...

515 Commits