The __get_free_pages can fail, so the return value should be checked.
Spotted thanks to Stanislaw.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
If ioc->pci_error_recovery is set, goto out in mpt2sas_base_hard_reset_handler()
leads to unlock unheld ioc->reset_in_progress_mutex.
The patch fixes the issue by jumping afer mutex_unlock() call.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Removed redundant calling of _scsih_probe_devices() from _scsih_probe as
it is getting called from _scsih_scan_finished.
Also moved the function scsi_scan_host(shost) to get called after the
volumes on warp drive are reported to the OS. Otherwise by the time
the (ioc->hide_drives) flags is set, the volumes on warp drive
are reported to the OS already.
Also modified the initialization of reply queues only in case of driver load
time in the function _base_make_ioc_operational().
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Commit 921cd8024b ("[SCSI] mpt2sas: New feature - Fast Load
Support") moved handling of the diag_buffer_enable module parameter
from mpt2sas_base.c to mpt2sas_scsih.c, but it left an old copy of the
parameter in mpt2sas_base.c. Remove the unused stub.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When computing reply_queue_count (the number of MSI-X vectors to use),
the driver does
ioc->reply_queue_count = min_t(u8, ioc->cpu_count,
ioc->msix_vector_count);
However, on a big machine, ioc->cpu_count could be outside the range
that fits in a u8; eg a system with 256 CPUs will end up
reply_queue_count set to 0.
Fix this by calculating the minimum as ints and then letting the
assignment to reply_queue_count handle integer demotion.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Commit 911ae9434f ("[SCSI] mpt2sas: Added NUNA IO support in driver
which uses multi-reply queue support of the HBA") added new
allocations to the beginning of mpt2sas_base_attach(), which means
directly returning an error on failure of mpt2sas_base_map_resources()
will leak those allocations.
Fix this by doing "goto out_free_resources" in this place too, as the
rest of the function does.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The amount of memory required for tracking chain buffers is rather
large, and when the host credit count is big, memory allocation
failure occurs inside __get_free_pages.
The fix is to limit the number of chains to 100,000. In addition,
the number of host credits is limited to 30,000 IOs. However this
limitation can be overridden this using the command line option
max_queue_depth. The algorithm for calculating the
reply_post_queue_depth is changed so that it is equal to
(reply_free_queue_depth + 16), previously it was (reply_free_queue_depth * 2).
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When an I/O request to a WarpDrive is timed out by SML and if the
I/O request to the WarpDrive is sent as direct I/O then the aborted
direct I/O will be retried as normal Volume I/O and which results
in failure of Target Reset and results in host reset.
The fix is to not retry a failed IO to volume when the original
IO was sent as direct IO with an ioc status
MPI2_IOCSTATUS_SCSI_TASK_TERMINATED.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Added code to release the spinlock that is used to protect the
raid device list before calling a function that can block. The
blocking was causing a reschedule, and subsequently it is tried
to acquire the same lock, resulting in a panic (NMI Watchdog
detecting a CPU lockup).
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
1)Removed Power Management Control option for PCIe link.
2)Added RAID Action for performing a compatibility check. Added
product-specific range to RAID Action values.
3)Added PhysicalPort field to SAS Device Status Change Event data.
4)Added SpinupFlags field containing a Disable Spin-up bit to the
SpinupGroupParameters fields of SAS IO Unit Page 4.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Increase max transfer support from 4MB to 16MB.
This is done by changing the shost->max_sector from 8192 to 32767
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The driver is modified to allow access to the greater than 2TB WarpDrive
and properly handle direct-io mapping for WarpDrive volumes greater than 2TB.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
If the sas_device->starget to NULL from slave_destroy callback for LUN=1
even though LUN=0 exist, results in entire target getting deleted.
To resolve the issue, the driver should only set sas_device->starget to
NULL when all the LUNS have been deleted from the slave_destroy.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
1) Added product specific range of ImageType macros for the Extended
Image Header.
2) Added Flags field and related defines to
MPI2_TOOLBOX_ISTWI_READ_WRITE_REQUEST to support automatic
reserve/release and page addressing.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Detection of Dead IOC has been done in fault_reset_work thread.
If IOC Doorbell is 0xFFFFFFFF, it will be detected as non-operation/DEAD IOC.
When a DEAD IOC is detected, the code is modified to remove that IOC and
all its attached devices from OS.
The PCI layer API pci_remove_bus_device() is called to remove the dead IOC.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
_scsih_smart_predicted_fault is called in an interrupt and therefore
must allocate memory using GFP_ATOMIC.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
There was supposed to be a kzalloc() here and the compiler complained
about it.
mpt2sas_scsih.c: In function ‘mpt2sas_scsih_reset_handler’:
mpt2sas_scsih.c:2807:21: warning: ‘fw_event’ may be used uninitialized in this function [-Wuninitialized]
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The driver was setting the action to MPI2_CONFIG_ACTION_PAGE_READ_CURRENT,
which only returns active volumes. In order to get info on inactive volumes,
the driver needs to change the action to
MPI2_RAID_PGAD_FORM_GET_NEXT_CONFIGNUM, and traverse each config till the
iocstatus is MPI2_IOCSTATUS_CONFIG_INVALID_PAGE returned.
Added a change in the driver to remove the instance of
sas_device object when the driver returns "1" from the slave_configure callback.
Also fixed code to report the hot spares to the operating system with a /dev/sg
assigned.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This is due to the slave_configuration routine is getting called when
host reset is active, and config page reads are failing, and driver
attempts to added device with stale config data.
To fix the issue, added error checking in slave_configure to check
for configuration pages failing, and return "1" so the device is
not configured. The config pages are failing if raid volume is
configured while issuing a host reset, thus driver is reading stale
data and proceeding to attempt to add. The fix is to return error
so the volume is not configured.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This is due to driver reporting a device missing to the OS then the OS sending
a SYNC_CACHE request to driver while the IO queues are locked due to host reset.
To fix the issue, the driver will be waking up the port enable context
immediately when the driver receives the reply message, instead of waiting
on the hot plug worker threads.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Fix for dead lock occurring between host_lock and sas_device_lock.
The deadlock is between two spin locks, between the shost->host_lock
and driver ioc->sas_device_lock.
The fix is to rearrange the code in the FW/Driver device removal
handshake so the ioc->sas_device_lock is not occurring when the
shost->host_lock is taken.
[jejb: zero initialise sas_address to fix spurious compiler warning]
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The fix is in the driver-firmware handshake device removal code. We
need to read the controller ioc_state to see if controller is OPERATIONAL
prior to sending target reset and OP_REMOVE. Previously it was checking
the flag ioc->shost_recovery flag, which is always set when host reset is
active, thus preventing drives from getting properly deleted.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The fix is to inhibit the warning message in _scsih_get_sas_address
when the MPI2_IOCSTATUS_CONFIG_INVALID_PAGE ioc status is returned.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Fix for issue : While discovery is in progress, hot unplug and hot plug of
enclosure connected to the controller card is causing system to hang.
When a device is in the process of being detected at driver load time then
if it is removed, the device that is no longer present will not be added
to the list. So the code in _scsih_probe_sas() is rearranged as such so
the devices that failed to be detected are not added to the list.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
New feature Fast Load Support.
(1)Asynchronous SCSI scanning: This will allow the drivers to scan
for devices in parallel while other device drivers are loading at
the same time. This will improve the amount of time it takes for the
OS to load.
(2) Reporting Devices while port enable is active: This feature will
allow devices to be reported to OS immediately while port enable is
active. The previous implementation waits for port enable to complete,
and then report devices. This feature is only enabled on IT firmware
configurations when there are no boot device configured in BIOS Configuration
Utility, else the driver will wait till port enable completes reporting
devices. For IR firmware, this feature is turned off. This feature is to
address large SAS topologies (>100 drives) when the boot OS is using onboard
SATA device, in other words, the boot devices is not
connected to our controller.
(3) Scanning for devices after diagnostic reset completes: A new routine
_scsih_scan_start is added. This will scan the expander pages, IR pages,
and sas device pages, then reporting new devices to SCSI Mid layer. It
seems the driver is not supporting adding devices while diagnostic reset
is active. Apparently this is due to the sanity checks on
ioc->shost_recovery flag throughout the context of kernel work thread FIFO,
and the mpt2sas_fw_work.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
1)Added ProxyVF_ID field to Configuration Request message.
2)Added IO Unit Page 8, IO Unit Page 9,and IO Unit Page 10.
3)Added SASNotifyPrimitiveMasks field to IOC Page 7.
4)Added SAS NOTIFY Primitive event.
5)Added Temperature Threshold Event.
6)Added Host Message Event.
7)Added Send Host Message request and reply.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Sizeof a pointer-typed expression returns the size of the pointer, not that
of the pointed data.
The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression *e;
type T;
identifier f;
@@
f(...,(T)e,...,
-sizeof(e)
+sizeof(*e)
,...)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Support added for controllers capable of multi reply queues.
The following are the modifications to the driver to support NUMA.
1) Create the new structure adapter_reply_queue to contain the reply queue
info for every msix vector. This object will contain a
reply_post_host_index, reply_post_free for each instance, msix_index, among
other parameters. We will track all the reply queues on a link list called
ioc->reply_queue_list. Each reply queue is aligned with each IRQ, and is
passed to the interrupt via the bus_id parameter.
(2) The driver will figure out the msix_vector_count from the PCIe MSIX
capabilities register instead of the IOC Facts->MaxMSIxVectors. This is
because the firmware is not filling in this field until the driver has
already registered MSIX support.
(3) If the ioc_facts reports that the controller is MSIX compatible in the
capabilities, then the driver will request for multiple irqs. This count
is calculated based on the minimum between the online cpus available and
the ioc->msix_vector_count. This count is reported to firmware in the
ioc_init request.
(4) New routines were added _base_free_irq and _base_request_irq, so
registering and freeing msix vectors were done thru simple function API.
(5) The new routine _base_assign_reply_queues was added to align the msix
indexes across cpus. This will initialize the array called
ioc->cpu_msix_table. This array is looked up on every MPI request so the
MSIxIndex is set appropriately.
(6) A new shost sysfs attribute was added to report the reply_queue_count.
(7) User needs to set the affinity cpu mask, so the interrupts occur on the
same cpu that sent the original request.
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
It was pointed out by 'make versioncheck' that some includes of
linux/version.h are not needed in drivers/scsi/.
This patch removes them.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
mpt2sas_base_detach() call was removed from _scsih_remove() while
doing some code shuffling. Mainly when we work on adding code for
scsih_shutdown(). I have added back mpt2sas_base_detach() which will
get callled from _scsih_remove().
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Issue:
This issue is seen on LSI H/W WarpDrive SSS6200 When filed direct I/O
is tried as volume I/O the scmd field in internal lookup table get
cleared and because of that the retried volume I/O never gets reported
as completed to SML.
Result:
I/O timeout and Error handling thread will kicking off
Fix:
Setting back the scmd in the lookup table before retrying the failed
direct i/o
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Driver should not call shutdown call from _scsih_remove otherwise,
The scsi midlayer can be deadlocked when devices are removed from the driver
pci_driver->shutdown handler.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Properly handling of target reset in multi-initiator environment
Clean up in broadcast change handling:
(1) Need to look at the status of each task management request, and retry
the TM when there are failures.
(2) Need quiescence IO so the driver doesn't take on more IO request while
it's in the middle of sending TM request to firmware
(3) Add support to keep track of how many pending broadcast AEN events
are received while the broadcast handling is active, then loop back at
the end of this routine if there were any events received.
Clean up in mpt2sas_scsih_issue_tm routine:
(1) Make sure proper status is returned when host reset fails
(2) Clean up sanity checks near end of routine, insuring all outstanding
IOs were completed.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This feature is to override the default
max_sectors setting at load time, taking max_sectors as an
command line option when loading the driver. The setting is
currently hard-coded in the driver to 8192 sectors (4MB transfers).
If max_sectors is specified at load time, minimum specified
setting will be 64, and the maximum is 8192. The driver will
modify the setting to be on even boundary. If max_sectors is not
specified, the driver will default to 8192.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
mpt2sas driver revision q header update:
(1) Modified the descriptions of the LocalAddress bit in the
Flags field of the MPI SGE Format description and the MPI
Simple Element.
(2) Modified Data Location Address Space bits in the Flags field
of the IEEE Chain Element.
(3) Added more detail to the description of the DataLength field
for the SCSI IO Request and Target Assist Request. Removed
restriction on using chained SGLs when using multicast or
bidirectional support.
(4) In Manufacturing Page 7, added ReceptacleID field to
ConnectorInfo, and reworked how the Pinout field is used.
(5) In IO Unit Page 7, added BoardTemperature and
BoardTemperatureUnits fields.
(6) In IOC Page 1, changed CoalescingTimeout to units of
half-microsecond and updated descriptions.
(7) Modified descriptions of SATASlumberTimeout and
SASSlumberTimeout fields in SAS IO Unit Page 5 to indicate
the timers start after partial mode is entered.
(8) Added Extended Manufacturing configuration pages.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch addresses many endian issues solved by runing sparse with the
option __CHECK_ENDIAN__ turned on.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Ensure that the initial reference tag is passed on to the HBA firmware
for DIF Type 2 devices.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Kashyap Desai <Kashyap.Desai@lsi.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>