A return value of 1 is interpreted as an error. See pci_driver.
in local_pci_probe(). If you're wondering how this ever could
have worked, it's because it used to be the case that only return
values less than zero were interpreted as failure. But even in
the current kernel if the driver registers its various entry
points with the kernel, and then returns a value which is
interpreted as failure, those registrations aren't undone, so
the driver still mostly works. However, the driver's remove
function wouldn't be called on rmmod, and pci power management
functions wouldn't work. In the case of Smart Array, since it
has a battery backed cache (or else no cache) even if the driver
is not shut down properly as long as there is no outstanding
i/o, nothing too bad happens, which is why it took so long to
notice.
Requesting backport to stable because the change to pci-driver.c
which requires driver probe functions to return 0 occurred between
2.6.35 and 2.6.36 (the pci power management breakage) and again
between 3.7 and 3.8 (pci_dev->driver getting set to NULL in
local_pci_probe() preventing driver remove function from being
called on rmmod.)
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
We inadvertantly discarded the scsi status for aborted commands.
For some commands (e.g. reads from tape drives) these can't be retried,
and if we discarded the scsi status, the scsi mid layer couldn't notice
anything was wrong and the error was not reported.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Some host adapters do not pass commands through to the target disk
directly. Instead they provide an emulated target which may or may not
accurately report its capabilities. In some cases the physical device
characteristics are reported even when the host adapter is processing
commands on the device's behalf. This can lead to adapter firmware hangs
or excessive I/O errors.
This patch disables WRITE SAME for devices connected to host adapters
that provide an emulated target. Driver writers can disable WRITE SAME
by setting the no_write_same flag in the host adapter template.
[jejb: fix up rejections due to eh_deadline patch]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Pull trivial tree updates from Jiri Kosina:
"Usual earth-shaking, news-breaking, rocket science pile from
trivial.git"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
doc: usb: Fix typo in Documentation/usb/gadget_configs.txt
doc: add missing files to timers/00-INDEX
timekeeping: Fix some trivial typos in comments
mm: Fix some trivial typos in comments
irq: Fix some trivial typos in comments
NUMA: fix typos in Kconfig help text
mm: update 00-INDEX
doc: Documentation/DMA-attributes.txt fix typo
DRM: comment: `halve' -> `half'
Docs: Kconfig: `devlopers' -> `developers'
doc: typo on word accounting in kprobes.c in mutliple architectures
treewide: fix "usefull" typo
treewide: fix "distingush" typo
mm/Kconfig: Grammar s/an/a/
kexec: Typo s/the/then/
Documentation/kvm: Update cpuid documentation for steal time and pv eoi
treewide: Fix common typo in "identify"
__page_to_pfn: Fix typo in comment
Correct some typos for word frequency
clk: fixed-factor: Fix a trivial typo
...
Since commit 0998d06310
(device-core: Ensure drvdata = NULL when no driver is bound),
the driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: James Bottomley <JBottomley@parallels.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This patch set is a set of driver updates (megaraid_sas, fnic, lpfc, ufs,
hpsa) we also have a couple of bug fixes (sd out of bounds and ibmvfc error
handling) and the first round of esas2r checker fixes and finally the much
anticipated big endian additions for megaraid_sas.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJSNheiAAoJEDeqqVYsXL0MueMIAKD1kaB0oooRawE1+0vpKmyV
eE2M6trA8ofTeq0z1eNfRsVMkRsUuG9exW0CKS2z6mHiWwQ/zGbqT7ukveW+dMi3
mjKD0yO5ODk6bohWX/LiwZ6NGZSwC0dbIacXNy5ZsXKEizqwo1Jcc7qC/0AWn+o7
WpIL48XLPH0HqjQZ3dvgC6TWeFZOn9cKOWvQQq0S3ENALOx/eLZ+C7VrJLx5Magv
myNOUkTLzdlYglQfjaNO6et98k2oHTrzKwH7U2X6U75q7L8Pkj4RbNzce/Ge301V
u+R1w+BlbeTPdHopTBoTJupsvqDYBZxVwS7rr8nhSvfKduQppHnN6jX8yR4XNeM=
=RG3j
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull misc SCSI driver updates from James Bottomley:
"This patch set is a set of driver updates (megaraid_sas, fnic, lpfc,
ufs, hpsa) we also have a couple of bug fixes (sd out of bounds and
ibmvfc error handling) and the first round of esas2r checker fixes and
finally the much anticipated big endian additions for megaraid_sas"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (47 commits)
[SCSI] fnic: fnic Driver Tuneables Exposed through CLI
[SCSI] fnic: Kernel panic while running sh/nosh with max lun cfg
[SCSI] fnic: Hitting BUG_ON(io_req->abts_done) in fnic_rport_exch_reset
[SCSI] fnic: Remove QUEUE_FULL handling code
[SCSI] fnic: On system with >1.1TB RAM, VIC fails multipath after boot up
[SCSI] fnic: FC stat param seconds_since_last_reset not getting updated
[SCSI] sd: Fix potential out-of-bounds access
[SCSI] lpfc 8.3.42: Update lpfc version to driver version 8.3.42
[SCSI] lpfc 8.3.42: Fixed issue of task management commands having a fixed timeout
[SCSI] lpfc 8.3.42: Fixed inconsistent spin lock usage.
[SCSI] lpfc 8.3.42: Fix driver's abort loop functionality to skip IOs already getting aborted
[SCSI] lpfc 8.3.42: Fixed failure to allocate SCSI buffer on PPC64 platform for SLI4 devices
[SCSI] lpfc 8.3.42: Fix WARN_ON when driver unloads
[SCSI] lpfc 8.3.42: Avoided making pci bar ioremap call during dual-chute WQ/RQ pci bar selection
[SCSI] lpfc 8.3.42: Fixed driver iocbq structure's iocb_flag field running out of space
[SCSI] lpfc 8.3.42: Fix crash on driver load due to cpu affinity logic
[SCSI] lpfc 8.3.42: Fixed logging format of setting driver sysfs attributes hard to interpret
[SCSI] lpfc 8.3.42: Fixed back to back RSCNs discovery failure.
[SCSI] lpfc 8.3.42: Fixed race condition between BSG I/O dispatch and timeout handling
[SCSI] lpfc 8.3.42: Fixed function mode field defined too small for not recognizing dual-chute mode
...
Changes the version of hpsa so we know something has changed. Please consider
this for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch does a bit of housekeeping for hpsa. Change lowercase alpha hex
digits to uppercase for consistency within the driver. Also moves the P822se
in the tables to keep controllers of each family grouped together.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Add the marketing names for HP Smart Array Gen8 controllers. Also removes an
unused ID. Please consider this for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch adds the PCI ID's for HP Smart Array Gen9 controllers. Please
consider this patch for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Pull trivial tree from Jiri Kosina:
"The usual trivial updates all over the tree -- mostly typo fixes and
documentation updates"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (52 commits)
doc: Documentation/cputopology.txt fix typo
treewide: Convert retrun typos to return
Fix comment typo for init_cma_reserved_pageblock
Documentation/trace: Correcting and extending tracepoint documentation
mm/hotplug: fix a typo in Documentation/memory-hotplug.txt
power: Documentation: Update s2ram link
doc: fix a typo in Documentation/00-INDEX
Documentation/printk-formats.txt: No casts needed for u64/s64
doc: Fix typo "is is" in Documentations
treewide: Fix printks with 0x%#
zram: doc fixes
Documentation/kmemcheck: update kmemcheck documentation
doc: documentation/hwspinlock.txt fix typo
PM / Hibernate: add section for resume options
doc: filesystems : Fix typo in Documentations/filesystems
scsi/megaraid fixed several typos in comments
ppc: init_32: Fix error typo "CONFIG_START_KERNEL"
treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacks
page_isolation: Fix a comment typo in test_pages_isolated()
doc: fix a typo about irq affinity
...
section Signed-off-by: John Kacur <jkacur@redhat.com>
On a 3.6-rt (real-time patch) kernel we are seeing the following BUG
However, it appears to be relevant for non-realtime (mainline) as well.
[ 49.688847] hpsa 0000:03:00.0: hpsa0: <0x323a> at IRQ 67 using DAC
[ 49.749928] scsi0 : hpsa
[ 49.784437] BUG: using smp_processor_id() in preemptible [00000000
00000000] code: kworker/u:0/6
[ 49.784465] caller is enqueue_cmd_and_start_io+0x5a/0x100 [hpsa]
[ 49.784468] Pid: 6, comm: kworker/u:0 Not tainted
3.6.11.5-rt37.52.el6rt.x86_64.debug #1
[ 49.784471] Call Trace:
[ 49.784512] [<ffffffff812abe83>] debug_smp_processor_id+0x123/0x150
[ 49.784520] [<ffffffffa009043a>] enqueue_cmd_and_start_io+0x5a/0x100
[hpsa]
[ 49.784529] [<ffffffffa00905cb>]
hpsa_scsi_do_simple_cmd_core+0xeb/0x110 [hpsa]
[ 49.784537] [<ffffffff812b09c8>] ? swiotlb_dma_mapping_error+0x18/0x30
[ 49.784544] [<ffffffff812b09c8>] ? swiotlb_dma_mapping_error+0x18/0x30
[ 49.784553] [<ffffffffa0090701>]
hpsa_scsi_do_simple_cmd_with_retry+0x91/0x280 [hpsa]
[ 49.784562] [<ffffffffa0093558>]
hpsa_scsi_do_report_luns.clone.2+0xd8/0x130 [hpsa]
[ 49.784571] [<ffffffffa00935ea>]
hpsa_gather_lun_info.clone.3+0x3a/0x1a0 [hpsa]
[ 49.784580] [<ffffffffa00963df>] hpsa_update_scsi_devices+0x11f/0x4f0
[hpsa]
[ 49.784592] [<ffffffff81592019>] ? sub_preempt_count+0xa9/0xe0
[ 49.784601] [<ffffffffa00968ad>] hpsa_scan_start+0xfd/0x150 [hpsa]
[ 49.784613] [<ffffffff8158cba8>] ? rt_spin_lock_slowunlock+0x78/0x90
[ 49.784626] [<ffffffff813b04d7>] do_scsi_scan_host+0x37/0xa0
[ 49.784632] [<ffffffff813b05da>] do_scan_async+0x1a/0x30
[ 49.784643] [<ffffffff8107c4ab>] async_run_entry_fn+0x9b/0x1d0
[ 49.784655] [<ffffffff8106ae92>] process_one_work+0x1f2/0x620
[ 49.784661] [<ffffffff8106ae20>] ? process_one_work+0x180/0x620
[ 49.784668] [<ffffffff8106d4fe>] ? worker_thread+0x5e/0x3a0
[ 49.784674] [<ffffffff8107c410>] ? async_schedule+0x20/0x20
[ 49.784681] [<ffffffff8106d5d3>] worker_thread+0x133/0x3a0
[ 49.784688] [<ffffffff8106d4a0>] ? manage_workers+0x190/0x190
[ 49.784696] [<ffffffff81073236>] kthread+0xa6/0xb0
[ 49.784707] [<ffffffff815970a4>] kernel_thread_helper+0x4/0x10
[ 49.784715] [<ffffffff81082a7c>] ? finish_task_switch+0x8c/0x110
[ 49.784721] [<ffffffff8158e44b>] ? _raw_spin_unlock_irq+0x3b/0x70
[ 49.784727] [<ffffffff8158e85d>] ? retint_restore_args+0xe/0xe
[ 49.784734] [<ffffffff81073190>] ? kthreadd+0x1e0/0x1e0
[ 49.784739] [<ffffffff815970a0>] ? gs_change+0xb/0xb
-------------------
This is caused by
enqueue_cmd_and_start_io()->
set_performant_mode()->
smp_processor_id()
Which if you have debugging enabled calls debug_processor_id() and triggers the warning.
The code here is
c->Header.ReplyQueue = smp_processor_id() % h->nreply_queues;
Since it is not critical that the code complete on the same processor,
but the cpu is a hint used in generating the ReplyQueue and will still work if
the cpu migrates or is preempted, it is safe to use the raw_smp_processor_id()
to surpress the false positve warning.
Signed-off-by: John Kacur <jkacur@redhat.com>
Acked-by: Stephen Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When the driver calls scsi_done and after that frees it's internal
preallocated memory it can happen that a new job is enqueud before
the memory is freed. The allocation fails and the message
"cmd_alloc returned NULL" is shown.
Patch below fixes it by moving cmd->scsi_done after cmd_free.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Shuah Khan <shuah.khan@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
CONFIG_HOTPLUG is going away as an option. As a result, the __dev*
markings need to be removed.
This change removes the use of __devinit, __devexit_p, __devinitdata,
__devinitconst, and __devexit from these drivers.
Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.
Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Adam Radford <linuxraid@lsi.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is a large set of updates, mostly for drivers (qla2xxx [including support
for new 83xx based card], qla4xxx, mpt2sas, bfa, zfcp, hpsa, be2iscsi, isci,
lpfc, ipr, ibmvfc, ibmvscsi, megaraid_sas). There's also a rework for tape
adding virtually unlimited numbers of tape drives plus a set of dif fixes for
sd and a fix for a live lock on hot remove of SCSI devices.
This round includes a signed tag pull of isci-for-3.6
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJQaqCFAAoJEDeqqVYsXL0MKJ4IALg/Obnk0/fNvBUNIrh5zRmj
r9UlXFJnlEDT03qRGdn8okgWMChbgaD1ZrwDTQnjNsabVQoTXI6oO6/uL2c8crpY
BFBwJvkNJS99nbcZv10CpJ3K7ykmRnKlkYon12iknhGwdtU+XJ14Z4PUcZkI9jmg
sBQQ6uNVWyosaONNE+k6o+dw6OTttJkzRX8e9in3thstxNTcG+h9iB1zZ/ETkSEj
tD4MyOgDiPf8kPV2awQThQGpni9Tu3SQr5dEn/iUUktUjiYsDNQuyaAk+QzyhUU7
D35iIJnIHlXTSTMQkrG4qpJHBvqPkWlYJzaOmheQryQ3vzp2C5Ly/hS9il45uIQ=
=49u9
-----END PGP SIGNATURE-----
Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull first round of SCSI updates from James Bottomley:
"This is a large set of updates, mostly for drivers (qla2xxx [including
support for new 83xx based card], qla4xxx, mpt2sas, bfa, zfcp, hpsa,
be2iscsi, isci, lpfc, ipr, ibmvfc, ibmvscsi, megaraid_sas).
There's also a rework for tape adding virtually unlimited numbers of
tape drives plus a set of dif fixes for sd and a fix for a live lock
on hot remove of SCSI devices.
This round includes a signed tag pull of isci-for-3.6
Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
Fix up trivial conflict in drivers/scsi/qla2xxx/qla_nx.c due to new PCI
helper function use in a function that was removed by this pull.
* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (198 commits)
[SCSI] st: remove st_mutex
[SCSI] sd: Ensure we correctly disable devices with unknown protection type
[SCSI] hpsa: gen8plus Smart Array IDs
[SCSI] qla4xxx: Update driver version to 5.03.00-k1
[SCSI] qla4xxx: Disable generating pause frames for ISP83XX
[SCSI] qla4xxx: Fix double clearing of risc_intr for ISP83XX
[SCSI] qla4xxx: IDC implementation for Loopback
[SCSI] qla4xxx: update copyrights in LICENSE.qla4xxx
[SCSI] qla4xxx: Fix panic while rmmod
[SCSI] qla4xxx: Fail probe_adapter if IRQ allocation fails
[SCSI] qla4xxx: Prevent MSI/MSI-X falling back to INTx for ISP82XX
[SCSI] qla4xxx: Update idc reg in case of PCI AER
[SCSI] qla4xxx: Fix double IDC locking in qla4_8xxx_error_recovery
[SCSI] qla4xxx: Clear interrupt while unloading driver for ISP83XX
[SCSI] qla4xxx: Print correct IDC version
[SCSI] qla4xxx: Added new mbox cmd to pass driver version to FW
[SCSI] scsi_dh_alua: Enable STPG for unavailable ports
[SCSI] scsi_remove_target: fix softlockup regression on hot remove
[SCSI] ibmvscsi: Fix host config length field overflow
[SCSI] ibmvscsi: Remove backend abstraction
...
If a command status of CMD_PROTOCOL_ERR is received, this
information should be conveyed to the SCSI mid layer, not
dropped on the floor. CMD_PROTOCOL_ERR may be received
from the Smart Array for any commands destined for an external
RAID controller such as a P2000, or commands destined for tape
drives or CD/DVD-ROM drives, if for instance a cable is
disconnected. This mostly affects multipath configurations, as
disconnecting a cable on a non-multipath configuration is not
going to do anything good regardless of whether CMD_PROTOCOL_ERR
is handled correctly or not. Not handling CMD_PROTOCOL_ERR
correctly in a multipath configaration involving external RAID
controllers may cause data corruption, so this is quite a serious
bug. This bug should not normally cause a problem for direct
attached disk storage.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
I think ioremap() ends up being equivalent to ioremap_nocache
by default, but we should signal our intent that these mappings
should be non-cacheable.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
In the abort handler, when asked to abort a command which
is not known to the driver, SUCCESS is returned, but the
diagnostic message incorrectly indicates the abort failed.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
It turns out Smart Array logical drives do not support target
reset and when the target reset fails, the logical drive will
be taken off line. Symptoms look like this:
hpsa 0000:03:00.0: Abort request on C1:B0:T0:L0
hpsa 0000:03:00.0: resetting device 1:0:0:0
hpsa 0000:03:00.0: cp ffff880037c56000 is reported invalid (probably means target device no longer present)
hpsa 0000:03:00.0: resetting device failed.
sd 1:0:0:0: Device offlined - not ready after error recovery
sd 1:0:0:0: rejecting I/O to offline device
EXT3-fs error (device sdb1): read_block_bitmap:
LUN reset is supported though, and is what we should be using.
Target reset is also disruptive in shared SAS situations,
for example, an external MSA1210m which does support target
reset attached to Smart Arrays in multiple hosts -- a target
reset from one host is disruptive to other hosts as all LUNs
on the target will be reset and will abort all outstanding i/os
back to all the attached hosts. So we should use LUN reset,
not target reset.
Tested this with Smart Array logical drives and with tape drives.
Not sure how this bug survived since 2009, except it must be very
rare for a Smart Array to require more than 30s to complete a request.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Dial back the aggressiveness of the controller lockup detection thread.
Currently it will declare the controller to be locked up if it goes
for 10 seconds with no interrupts and no change in the heartbeat
register. Dial back this to 30 seconds with no heartbeat change, and
also snoop the ioctl path and if a firmware flash command is detected,
dial it back further to 4 minutes until the firmware flash command
completes. The reason for this is that during the firmware flash
operation, the controller apparently doesn't update the heartbeat
register as frequently as it is supposed to, and we can get a false
positive.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Mike Miller <mikem@beardog.cce.hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Use spinlocks with finer granularity in the submission and
completion paths to allow concurrent execution for multiple
reply queues. In particular, do not hold a spin lock while
submitting a request to the device, nor during most of the
interrupt handler.
Signed-off-by: Matt Gates <matthew.gates@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Smart Arrays can support multiple reply queues onto which command
completions may be deposited. It can help performance quite a bit
to arrange for command completions to be processed on the same CPU
from which they were submitted to increase the likelihood of cache
hits.
Signed-off-by: Matt Gates <matthew.gates@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This is in order to smooth the way for upcoming changes to allow use of
multiple reply queues for command completions.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Matt Gates <matthew.gates@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When aborting a command, the tag is supposed to be
specified as 64-bit little endian. However, some smart
arrays expect the tag of the command to be aborted to be
specified in a strange byte order. How to tell which sort
of Smart Array firmware we're dealing with is not obvious.
However, because of the way we construct our tags, the values
of any outstanding tag when specified with the "strange" byte
order will not collide with the value specified in the correct
order. That means we can safely attempt the abort both ways.
Signed-off-by: Stephen M. Cameron <stephenmcameron@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Instead of giving up after 3 immediate retries of driver initiated
commands, back off the rate of retries and retry a bunch more times.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reviewed-by: Andi Shyti <andi.shyti@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
In shared SAS configurations we might get a busy status
during driver initiated commands (e.g. during rescan for
devices). We should retry the command in such cases rather
than giving up.
Signed-off-by: Matt Bondurant <Matthew.dav.bondurant@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Default behavior for any CHECK CONDITION excepting a few special cases is to
print out certain parts of the sense buffer and the CDB. Default behavior
should be to print nothing and let the upper layers or applications decide what
to do about these. The same information is already available by setting the
appropriate bits of the scsi_logging_level kernel parameter or via
/proc/sys/dev/scsi/logging_level.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
pci_disable_device() disables the bus master bit and pci_enable_device does
not re-enable it. It needs to be enabled.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
There was code to skip "disabled" devices which was intended to
skip devices disabled in the BIOS, but it really just checks to
see if the device can write to host memory, which this is disabled
by pci_disable_device on driver unload, so this check has the effect
of preventing subsequent load of the driver. And devices disabled in
the BIOS don't show up at all anyway, so this check never made any
sense to begin with, and should be removed.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
As Jenx Axboe explained to me: "In earlier times (2.6.18 and pre, iirc), Linux
disabled IO and mem bars on pci_disable_device(). Now in newer kernel it does
not. And in the newer kernels you run into problems if you DON'T disable the
device on exit, since when it later loads the device is already in the enabled
state - and pci_enable_device() then does nothing. This typically screws
MSI/MSI-X." This is what the big scary comment that says pci_disable_device
does "something nasty" to smart arrays was evidently referring to.
If pci_disable_device is not called on driver rmmod, subsequently insmod'ing
the driver may in result in some cases fail to be able to receive interrupts,
esp. if other drivers are loaded between unloading and loading hpsa.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Use check_signature to find a signature in the mmio address.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Some other older controllers also do have problems to perform a kdump.
Adding controllers to this list means that the driver will signal
this non-ability via a resettable flag correctly.
The unsupported list was created after a consultation with HP.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Use find_first_zero_bit to find the first cleared bit in a memory region.
This also includes the following minor changes.
- Use bitmap_zero
- Reduce unnecessary atomic bitops usage
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Certain types of changes to devices should not be interpreted as a device
change that would cause the device to be removed and re-added. These include
RAID level and Firmware revision changes. However, these attribute changes DO
need to be reflected in the controller info structure's dev structure list, so
that sysfs and /proc info files for the devices will reflect the new values.
Signed-off-by: Scott Teel <scott.stacy.teel@hp.com>
Acked-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Reduce confusion and inaccuracy caused by dated naming of vars and functions
referring to external target devices.
CURRENT NAMING: PROPOSED NAMING:
"MSA2xxx devices" "external target devices"
msa2xxx_model ext_target_model
is_msa2xxx is_ext_target
add_msa2xxx_enclosure add_ext_target_dev
nmsa2xxx_enclosures n_ext_target_devs
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Driver limits SAS external target IDs to range 1-8.
Need to increase limit and clean up overlapping concepts of targets and paths
in the code.
There are several defined constants that control this:
HPSA_MAX_TARGETS_PER_CTLR 16
MAX_MSA2XXX_ENCLOSURES 32
HPSA_MAX_PATHS 8
We can condense this to one constant:
MAX_EXT_TARGETS 32
SAS switches allow for 8 connections, and there is capacity for 4 switches per
enclosure in largest blade enclosure type.
Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
It should call hpsa_set_bus_target_lun rather
than individually setting bus, target and lun.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>