SAS nexus loss support for systems that support failover.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix a panic that occurs when mptctl is loading at the same time
as one of the fusion LLDs (mptsas/mptfc/mptspi).
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Adding support for sas enclosures with smart drives.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Using the port_id for the channel is completely unnecessary since the
host_id/target_id are constructed to be globally unique. Also move
the mptsas driver on to virtual channel 1 for its raid devices.
Acked-by: "Moore, Eric" <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This allows us to be rid of the machinery in mptsas for creating and
tracking port numbers. Since mptsas is merely inventing the numbers,
the SAS transport class may as well do it instead.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Conflicts:
drivers/scsi/nsp32.c
drivers/scsi/pcmcia/nsp_cs.c
Removal of randomness flag conflicts with SA_ -> IRQF_ global
replacement.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
One of the current problems the mptsas driver has is that of "ghost"
devices (these are devices the firmware reports as existing, but what
they actually represent are the parents of a lower device), so for
example in my dual expander configuration, three expanders actually show
up, two for the real expanders but a third is created because the
firmware reports that the lower expander also has another expander
connected (which is simply the port going back to the upper expander).
The attached patch eliminates all these ghosts by not allocating any
devices for them if the SAS address is the SAS address of the parent.
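The filtering rule described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the driver code: struct phy_report and count_real_children() are hypothetical names, and treating an address of 0 as "nothing attached" is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one phy as reported by the firmware. */
struct phy_report {
    uint64_t attached_sas_address; /* device seen on the far end of this phy */
};

/*
 * Count the attached devices that deserve an allocation. A phy whose
 * attached SAS address equals the parent's address is only the link
 * back up to the parent (a "ghost"), so it is skipped.
 */
size_t count_real_children(const struct phy_report *phys, size_t nphys,
                           uint64_t parent_sas_address)
{
    size_t n = 0;

    for (size_t i = 0; i < nphys; i++) {
        if (phys[i].attached_sas_address == 0)
            continue; /* assumption: 0 means nothing is attached */
        if (phys[i].attached_sas_address == parent_sas_address)
            continue; /* ghost: this is the parent, not a child */
        n++;
    }
    return n;
}
```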
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The way mpt_interrupt() was coded, it was impossible for the unhandled
interrupt detection logic to ever trigger. All interrupt handlers should
return IRQ_NONE when they have nothing to do.
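As a rough user-space model of that rule (not the actual mpt_interrupt() code), the handler below reports IRQ_NONE when its first FIFO read finds nothing to do. The irqreturn_t enum is a local stand-in for the kernel type, and struct fake_fifo, FIFO_EMPTY and sketch_interrupt() are hypothetical names chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel's irqreturn_t; not the kernel definition. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

#define FIFO_EMPTY 0xFFFFFFFFu /* assumption: sentinel for "no reply pending" */

/* Hypothetical reply FIFO backed by a plain array. */
struct fake_fifo {
    const uint32_t *entries;
    size_t count;
    size_t next;
};

static uint32_t fifo_read(struct fake_fifo *f)
{
    return f->next < f->count ? f->entries[f->next++] : FIFO_EMPTY;
}

/* Drain the FIFO; say IRQ_NONE if there was nothing to drain. */
irqreturn_t sketch_interrupt(struct fake_fifo *f, unsigned *handled)
{
    uint32_t pa = fifo_read(f);

    if (pa == FIFO_EMPTY)
        return IRQ_NONE; /* not our interrupt: let the core notice */

    do {
        (*handled)++;    /* stand-in for processing one reply */
        pa = fifo_read(f);
    } while (pa != FIFO_EMPTY);

    return IRQ_HANDLED;
}
```

Returning IRQ_NONE on an empty FIFO is what lets the unhandled-interrupt accounting see that this handler did no work.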
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.com>
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Make two needlessly global functions static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.com>
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Conflicts:
drivers/scsi/aacraid/comminit.c
Fixed up by removing the now renamed CONFIG_IOMMU option from
aacraid
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* Adding 1078 ROC (Raid On Chip) Support - New host adapter
* Moving all PCI Vendor/Device ids to using internal defines; a request
from Christoph/James B. some time ago for when the next chip was added.
* Removing SAS 1066/1066E Vendor/Device IDs, as there are no plans to
manufacture that controller.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* Wide port support added - using James Bottomley's new SAS wide port API.
(There is a known problem in sas transport layer reported yesterday to
James. The Kobject dev.bus_ids for end devices are not unique across
expanders. I have added a workaround in this patch, where I am assigning
a unique port identifier for every port within the host - this solves
the problem, but I expect a fix from James in the sas transport).
* Adding target_alloc and target_destroy entry points, and moving code over
from the slave entry points.
* The renaming of some mptscsih_xxx functions declared in mptsas.c,
to mptsas_xxx.
* Target Reset moved from slave_destroy to hotplug work thread
handling (with regard to device removal). Also inhibit IO to end device
while device is being broken down. Talked to James Smart about this
at Linux Expo (with questions of how the fc transport handles this).
* Cleaning up the kzalloc's, and kfree's
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This ugly hack was long overdue to die.
It was a way to print out Sparc interrupts in a more friendly format,
since IRQ numbers were arbitrary opaque 32-bit integers which vectored
into PIL levels. These 32-bit integers were not necessarily in the
0-->NR_IRQS range, but the PILs they vectored to were.
The idea now is that we will increase NR_IRQS a little bit and use a
virtual<-->real IRQ number mapping scheme similar to PowerPC.
That makes this IRQ printing hack irrelevant, and furthermore only a
handful of drivers actually used __irq_itoa() making it even less
useful.
Signed-off-by: David S. Miller <davem@davemloft.net>
Bump driver version number to reflect addition of various
fibre channel patches.
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The driver uses msleep_interruptible() in the code path responsible
for resetting the card's ports via the lsiutil command. If a
<ctrl-c> is received during the reset it can leave a port in such
a state that the only way to regain its use is to reboot the system.
Changing from msleep_interruptible() to msleep() corrects the problem.
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
While doing board reset testing I was able to put the system in
an infinite request/response loop between the scsi layer and
mptscsih_qcmd() by aborting the reset. This patch installs
a "SETUP RESET" handler which calls fc_remote_port_delete()
for all registered rports. This blocks the target which
prevents the loop. Additionally, should the reset fail to
complete, the transport will now terminate i/o to the target.
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The fibre channel firmware provides a timer which is similar in purpose
to the fibre channel transport's device loss timer. The effect of this
timer is to extend the total time that a target will be missing beyond
the value associated with the transport's timer. This patch changes
the firmware timer to a default of one second which significantly reduces
the lag between when a target goes missing and the notification of the
fibre channel transport.
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Move fibre channel event and reset handling to mptfc. This will
result in fewer changes over time that need to be applied to
either mptbase.c or mptscsih.c.
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
MPT fusion driver initialization fails while second kernel is booting,
after a system crash (if kdump kernel is configured). Oops message is
pasted below.
*****************************************************************************
Fusion MPT base driver 3.03.08
Copyright (c) 1999-2005 LSI Logic Corporation
Fusion MPT SAS Host driver 3.03.08
ACPI: PCI Interrupt 0000:01:00.0[A] -> Link [LNKA] -> GSI 5 (level, low) -> IRQ 5
mptbase: Initiating ioc0 bringup
BUG: unable to handle kernel paging request at virtual address 00002608
printing eip:
c11782fd
*pde = 00000000
Oops: 0000 [#1]
Modules linked in:
CPU: 0
EIP: 0060:[<c11782fd>] Not tainted VLI
EFLAGS: 00010046 (2.6.17-rc1-16M #2)
EIP is at mptscsih_io_done+0x27/0x3a3
eax: c4fed000 ebx: c4fed000 ecx: 00002600 edx: 00000298
esi: c11782d6 edi: 00002600 ebp: 00000000 esp: c1332f74
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c1332000 task=c128f9c0)
Stack: <0>0000006c 00000020 00000298 00002600 c4fed000 c4fed000 c11782d6 00002600
00000000 c1172c49 c4fed000 c1305b40 00000005 00000000 c1172d75 c48877e0
c1029687 00000000 c1307fb8 00000000 c1305a00 00000001 00000000 c1307fb8
Call Trace:
<c11782d6> mptscsih_io_done+0x0/0x3a3 <c1172c49> mpt_turbo_reply+0xbb/0xd3
<c1172d75> mpt_interrupt+0x22/0x2b <c1029687> misrouted_irq+0x63/0xcb
<c10297b3> note_interrupt+0x43/0x98 <c10292f9> __do_IRQ+0x68/0x8f
<c1003fac> do_IRQ+0x36/0x4e
=======================
<c1002aa6> common_interrupt+0x1a/0x20 <c1001150> mwait_idle+0x1a/0x2a
<c10010bf> cpu_idle+0x40/0x5c <c1308610> start_kernel+0x17a/0x17c Code: 5e 5f 5d c3 55 89 cd 57 56 53 83 ec 14 89 54 24 0c 89 44 24 10 8b 90 cc 00 00 00 8b 4c 24 0c 81 c2 98 02 00 00 85 ed 89 54 24 08 <0f> b7 79 08 89 fe 74 04 0f b7 75 08 66 39 f7 75 0d 8b 44 24 0c
*******************************************************************************
o Kdump capture kernel boot fails during initialization of MPT fusion driver.
(LSI Logic / Symbios Logic SAS1064E PCI-Express Fusion-MPT SAS (rev 01))
o Problem is easily reproducible, if system crashed while some disk activity
like cp operation was going on.
o After a system crash, devices are not shutdown and capture kernel starts
booting while skipping BIOS. Hence underlying device is left in operational
state. In this case the SCSI controller was left with its interrupt line
asserted and its reply FIFO not empty. When the driver starts initializing in
the second kernel, it receives the interrupt the moment request_irq() is called.
The interrupt handler reads the message from the reply FIFO, tries to access
the associated message frame, and panics, as in the new kernel's context
that message frame is not valid at all.
o In this scenario, we should probably delay the request_irq() call: first
bring up the IOC, reset it if needed, and only then register for the IRQ.
o I have tested the patch with SAS1064E and 53c1030 controllers.
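The ordering problem described in the points above can be modelled in a few lines of user-space C. This is a toy model under loud assumptions, not the driver: struct sketch_ioc and every sketch_* function are hypothetical, the "interrupt" fires synchronously inside sketch_request_irq() to mimic an already asserted line, and bring-up is assumed to discard stale replies.

```c
/* Hypothetical controller state inherited from the crashed kernel. */
struct sketch_ioc {
    int stale_replies;  /* replies left in the FIFO by the old kernel */
    int irq_registered;
    int bad_frames;     /* stale replies the handler tried to process */
};

/* Handler: every stale reply refers to a frame the new kernel never set up. */
static void sketch_irq(struct sketch_ioc *ioc)
{
    while (ioc->stale_replies > 0) {
        ioc->stale_replies--;
        ioc->bad_frames++;
    }
}

/* With the line already asserted, the handler runs as soon as it is hooked. */
static void sketch_request_irq(struct sketch_ioc *ioc)
{
    ioc->irq_registered = 1;
    if (ioc->stale_replies > 0)
        sketch_irq(ioc);
}

/* Assumption: bring-up (with reset if needed) discards stale replies. */
static void sketch_bringup(struct sketch_ioc *ioc)
{
    ioc->stale_replies = 0;
}

/* Old order: register first, then bring up. Returns bad frame accesses. */
int sketch_probe_old(struct sketch_ioc *ioc)
{
    sketch_request_irq(ioc);
    sketch_bringup(ioc);
    return ioc->bad_frames;
}

/* Proposed order: bring up first, then register. */
int sketch_probe_fixed(struct sketch_ioc *ioc)
{
    sketch_bringup(ioc);
    sketch_request_irq(ioc);
    return ioc->bad_frames;
}
```

In this model only the old order ever touches a stale frame, which corresponds to the oops shown above.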
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Acked-by: "Moore, Eric Dean" <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
All registered reset callback handlers are called during reset processing.
The mptspi module has its own reset callback handler, recently
added for issuing domain validation after host reset. If either the mptsas or
mptfc driver is loaded, this callback could be called, resulting
in domain validation being issued for SAS or fibre channel end devices.
Fix this by having mptbase.c check the bus type against the driver
type and only call the reset handler if they match (or if it's a
non-bus specific reset handler).
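A minimal sketch of that dispatch rule, written as stand-alone C rather than as mptbase.c code: the enum, struct and function names are all hypothetical, and a counter stands in for actually invoking a callback.

```c
#include <stddef.h>

/* Hypothetical bus identifiers; SKETCH_BUS_ANY marks a non-bus-specific handler. */
enum sketch_bus { SKETCH_BUS_ANY, SKETCH_BUS_SPI, SKETCH_BUS_FC, SKETCH_BUS_SAS };

struct sketch_reset_handler {
    enum sketch_bus driver_bus; /* bus type of the driver that registered it */
    int calls;                  /* stand-in for invoking the callback */
};

/*
 * Call only the handlers whose driver bus type matches the adapter's
 * bus type, plus handlers that are not tied to any bus.
 * Returns the number of handlers called.
 */
size_t sketch_run_reset_handlers(struct sketch_reset_handler *h, size_t n,
                                 enum sketch_bus ioc_bus)
{
    size_t called = 0;

    for (size_t i = 0; i < n; i++) {
        if (h[i].driver_bus != SKETCH_BUS_ANY && h[i].driver_bus != ioc_bus)
            continue; /* e.g. skip the SPI handler on a SAS adapter */
        h[i].calls++;
        called++;
    }
    return called;
}
```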
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
A race condition exists in mptfc between the thread registering a device
with the fc transport and the scan work generated by the transport.
This race existed prior to the application of the mptfc bug fix patch.
mptfc_register_dev() calls fc_remote_port_add() with the FC_RPORT_ROLE_TARGET
bit set in the rport ids passed to the function. Having this bit set causes
fc_remote_port_add() to schedule a scan of the device.
This scan can execute before mptfc_register_dev() can fill in the dd_data
in the rport structure. When this happens, mptfc_target_alloc() will fail
because dd_data is null.
Attached is a patch which fixes the problem. The patch changes the rport ids
passed to fc_remote_port_add() to not have the TARGET bit set. This prevents
the scan from being scheduled. After mptfc_register_dev() fills in the rport
dd_data field, fc_remote_port_rolechg() is called, changing the role of the
rport to TARGET. Thus, the scan is scheduled after dd_data is filled
in which prevents the failure in mptfc_target_alloc().
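The two orderings can be compared with a small single-threaded model. It is only a sketch under stated assumptions: the sketch_* names and SKETCH_ROLE_TARGET are hypothetical stand-ins for the fc transport calls named above, and the scan is run synchronously at the point where it would be scheduled, which is the worst-case interleaving of the race.

```c
#include <stddef.h>

#define SKETCH_ROLE_TARGET 0x1u /* hypothetical stand-in for the TARGET role bit */

struct sketch_rport {
    void *dd_data;
    unsigned roles;
    int scan_result; /* 0 = no scan yet, 1 = target alloc ok, -1 = failed */
};

/* Stand-in for the scan reaching target_alloc: fails if dd_data is unset. */
static void sketch_scan(struct sketch_rport *rp)
{
    rp->scan_result = rp->dd_data ? 1 : -1;
}

/* Stand-in for adding the remote port; a target role triggers a scan. */
static void sketch_port_add(struct sketch_rport *rp, unsigned roles)
{
    rp->dd_data = NULL;
    rp->roles = roles;
    rp->scan_result = 0;
    if (roles & SKETCH_ROLE_TARGET)
        sketch_scan(rp);
}

/* Stand-in for the role change; a target role triggers a scan. */
static void sketch_rolechg(struct sketch_rport *rp, unsigned roles)
{
    rp->roles = roles;
    if (roles & SKETCH_ROLE_TARGET)
        sketch_scan(rp);
}

/* Old order: the scan can run before dd_data is filled in. */
int sketch_register_old(struct sketch_rport *rp, void *priv)
{
    sketch_port_add(rp, SKETCH_ROLE_TARGET);
    rp->dd_data = priv;
    return rp->scan_result;
}

/* Fixed order: add without the role, fill dd_data, then change role. */
int sketch_register_fixed(struct sketch_rport *rp, void *priv)
{
    sketch_port_add(rp, 0);
    rp->dd_data = priv;
    sketch_rolechg(rp, SKETCH_ROLE_TARGET);
    return rp->scan_result;
}
```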
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This is a bug fix for mptspi driver, where after a host reset or
resume, we revalidate the negotiation parameters for all devices.
This bug was introduced when the driver was ported to use the spi
transport layer.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Bug fix for stack overflow in EventDescriptionStr, (a function
for debugging firmware events). We allocated 50 bytes on the local stack
for buff[], however there are places in the code where we've attempted
copying in greater than 50 bytes into buff[].
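One general way to guard against this class of bug is a length-bounded copy. The sketch below is not necessarily how this patch fixes EventDescriptionStr; sketch_describe_event() is a hypothetical helper that relies only on the standard snprintf() guarantee of never writing more than the given size.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Copy a description into a fixed-size buffer without overrunning it.
 * Returns 1 if the text had to be truncated (or on a formatting error),
 * 0 if it fit completely.
 */
int sketch_describe_event(char *out, size_t outlen, const char *desc)
{
    int n = snprintf(out, outlen, "%s", desc);

    return n < 0 || (size_t)n >= outlen;
}
```

With an 8-byte buffer, a 16-character description is cut to 7 characters plus the terminating NUL instead of spilling past the end of the buffer.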
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
mptbase.h
bump version number to 3.03.09
remove unneeded flags
define workq and remove old fc specific locks
mptbase.c
initialize new lock and don't initialize two removed locks
mptscsih.c
when firmware reports target is no longer there, return
DID_REQUEUE for fc hosts so that i/o doesn't get killed until
the transport has an opportunity to manage the loss via its
dev loss timer
when the "eh_abort" routine is called, check to see if the
driver has the command or not before looking to see if a reset
is pending. James Smart and I talked about this and believe
that the API for this routine is: if driver doesn't have
command, return SUCCESS. This change helps prevent a target
from being taken offline. SUCCESS is returned because it's
likely that the command completed after error recovery timed
it out but before it could be aborted.
provide a routine to queue work to newly created workq, and
use it.
remove "ioc" from mptscsih_abort() it was only used one time.
the other references were via hd->ioc, so I just moved it....
net change in references to ioc via hd->ioc is zero
move hd->resetPending test and hd->timeouts increment to after
the test for whether the command to be aborted remains known
to the driver
Make certain that the workq exists before queuing work to it.
mptfc.c
no longer need to lock rport data structures as I was able to
single thread the code! I fixed up the debug code to
eliminate compilation messages due to type mismatch in the
printk. Got rid of some no longer needed rport flags.
Initialize and destroy the workq used for the rescan work.
simplify the logic regarding the increment of
fc_rescan_work_count. use post increment and test for zero
vs. pre increment and test for one; eliminate work_count
variable: queue_work can be called with the work_lock held as
it doesn't sleep
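The post-increment test mentioned for fc_rescan_work_count can be illustrated with a small stand-alone sketch. The struct and function names are hypothetical, a counter stands in for the queue_work() call, and the caller is assumed to hold the work lock; the worker loop is likewise only an assumption about how outstanding requests might be drained.

```c
/* Hypothetical rescan bookkeeping; the caller is assumed to hold the lock. */
struct sketch_rescan {
    int work_count;   /* outstanding rescan requests */
    int times_queued; /* stand-in for calls to queue_work() */
};

/*
 * Queue the work item only when the count goes from zero to one.
 * "count++ == 0" is equivalent to "++count == 1" but needs no
 * separate temporary variable.
 */
void sketch_request_rescan(struct sketch_rescan *s)
{
    if (s->work_count++ == 0)
        s->times_queued++;
}

/* Hypothetical worker: one pass per outstanding request. */
int sketch_rescan_worker(struct sketch_rescan *s)
{
    int passes = 0;

    while (s->work_count > 0) {
        passes++;
        s->work_count--;
    }
    return passes;
}
```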
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch handles the case where hidden RAID components
are not removed when power is turned off to a device
attached to an expander, as well as the case of
exposing RAID components when power is turned back on
to devices attached to an expander. (This is a repost
of this patch, with mptsas_is_end_device declared
further up in the code.)
This patch contains some other miscellaneous bug fixes.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The driver panicked when a RAID logical volume was present when the driver
loaded, or when a RAID logical volume was created on the fly.
This issue was introduced by a recent scsi_transport_sas change,
when sas_read_port_mode_page was added into the mptsas driver's
slave_config entry point.
This new API expects all sdevs to be associated with an rphy, however
that is not the case for logical volumes, as they are created using
scsi_add_device, instead of sas_rphy_add().
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The conversion of mptsas should allow the elimination of the contained
flag in the sas transport class.
Acked-by: "Moore, Eric" <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This adds support for hot adding and removing
expanders and their attached devices.
When there is a change in topology,
the fusion firmware sends the
MPI_EVENT_SAS_DISCOVERY event to the driver.
The driver will read firmware config pages
to determine what changes took place, and refresh
the driver's view of the world stored in ioc->sas_topology.
Here are the details of the actions the driver takes:
(1) Expander Added : The mptsas_discovery_work
workqueue is called. Config pages read, and
ioc->sas_topology is refreshed. The sas_phy_add()
is called for each phy of the expander. The
expander's attached devices are added via
sas_rphy_add(). Added end devices are handled within
the MPT_ADD_DEVICE logic in mptsas_hotplug_work
workqueue.
(2) Expander Delete : The sas_rphy_delete() will be
called for the topmost component of the parent that the
expander is attached to. The sas_rphy_delete call
will delete all the children phys, rphys, and end devices.
This is handled from mptsas_discovery_work workqueue.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Support for exposing hidden raid components
for sg interface. The sdev->no_uld_attach flag
will be set accordingly.
The sas module supports adding/removing raid
volumes using online storage management application
interface.
This patch relies on patches provided to me
by Christoph Hellwig, that export device_reprobe.
I will post those patches on behalf of Christoph.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Changelog:
(1) fix memory leak: p->phy_info
(2) initialize device_info and port_info data fields
(3) initialize the hba firmware handle
(4) initialize phy_id for attached phy_info data fields
(5) initialize attached phy_info data fields
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Cleanup of mptsas firmware event handlers.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
It makes no sense in keeping the target_id and bus_id
in the VirtDevice structure, when it can be obtained
from the VirtTarget structure.
In addition, this patch fixes a couple of compilation bugs
in mptfc.c when MPT_DEBUG_FC is enabled. These fixes were
provided by Michael Reed.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Patch previously provided from Adrian Bunk <bunk@stusta.de>,
moving some functions to static. This is already in
the -mm tree.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Created a debug level MPT_DEBUG_VERBOSE_EVENTS.
Moving some of the more verbose debug messages
for firmware events into the new debug level. Also
added some more firmware event descriptions.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This header is provided to better understand
loginfo codes returned by the mpt fusion firmware.
Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
It was actually rendered unused by the move to the spi transport
class, but never taken out.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This is the first half of a patch to add the generic domain validation
to mptspi. It also creates a secondary "virtual" channel for raid
component devices since these are now exported with no_uld_attach.
What Eric and I would have really liked is to export all physical
components on channel 0 and all raid components on channel 1.
Unfortunately, this would result in device renumbering on platforms with
mixed RAID/Physical devices which was considered unacceptable for
userland stability reasons.
Still to be done is to plug back the extra parameter setting and DV
pieces on reset and hotplug.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Adds support to retrieve the enclosure and bay identifiers. This patch
is from Eric with minor modifications from me, rewritten from a buggy
patch of mine, based on the earlier CSMI implementation from Eric.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch makes two needlessly global functions static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>