The ATAPI specific and STP general RNC suspension code had been
incorrectly removed from the remote device code.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In the case of TMF execution, or device resets, wait for the RNC to fully
resume before returning to the caller. This ensures that the remote
device will not fail I/O requests while waiting for the RNC resumption to
complete.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Completion of I/Os during the one of the abort path interface calls
from libsas can drive remote device state changes and the resumption
of the device RNC. This is a problem when the abort path is
attempting to cleanup outstanding I/O at the same time - the resumption
can prevent the termination from occuring correctly.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In order to prevent a device from receiving an I/O request while still
in an RNC suspending or resuming state (and therefore failing that
I/O back to libsas with a reset required status) wait for the RNC state
change before proceding in the abort path.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The LLHANG timer should be enabled once per device. This patch corrects
both the timer enable and the timer disable for the remote device.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This commit changes the means by which outstanding I/Os are handled
for cleanup.
The likelihood is that this commit will be broken into smaller pieces,
however that will be a later revision. Among the changes:
- All completion structures have been removed from the tmf and
abort paths.
- Now using one completed I/O list, with the I/O completed in host bit being
used to select error or normal callback paths.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
TCs must be terminated only while the RNC is suspended. This commit
adds remote device suspensions and resumptions in the abort, reset and
termination paths.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
TCs must only be terminated when RNCs are suspended.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add comprehensive decode for all TC completions that generate RNC
suspensions.
Note that this commit also removes unconditional resumptions of ATAPI
devices when in the SCI_STP_DEV_ATAPI_ERROR state, and STP devices
when in the SCI_STP_DEV_IDLE state. This is because the SCI_STP_DEV_IDLE
and SCI_STP_DEV_ATAPI state entry functions manage the RNC resumption.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
For STP devices under certain protocol conditions, an RNC will not
suspend until the current transfer state is broken with a SYNC/ESC
sequence from the SCU. The SYNC/ESC driven by expiration of the
SCU link layer hang detect timer, which has too small a dynamic
range to support slow SATA devices, so normally it is disabled.
This change enables the timer with the minimum period at the point
when the suspension is requested.
Note that there is potential collateral damage to other open
connections to slow SATA devices on the same port, since there
is no alternative but to enable the LLHANG timer on every phy in
the port for the current suspension request - there is no way to
tell on which phy the RNC in question is currently active.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
domain_device ->parent conveys the same information.
Occurrences of ->is_direct_attached appear next to incomplete open-coded
versions of dev_is_sata(), clean those up as well.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Debugging the driver requires tracing the state transtions and tracing
state names is less work than decoding numbers.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Prior to commit 61aaff49 "isci: filter broadcast change notifications
during SMP phy resets" we borrowed the MVS_DEV_EH approach from the
mvsas driver for preventing ->lldd_I_T_nexus_reset() events during ata
discovery. This hack was protecting against the old ->phy_reset() in
ata_bus_probe(), but since the conversion to the new error handling this
hack is preventing resets from reaching ata devices.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The lldd does not need to look at or manage the pending device
reset bit in pending sas_tasks.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
libsas uses the LLDD abort task interface to handle I/O timeouts
in the SATA/STP and SMP discovery paths, so this change will terminate
STP/SMP requests. Also, if the device is gone, the lldd will prevent
libsas from further escalations in the error handler.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Based on original implementation from Jiangbi Liu and Maciej Trela.
ATAPI transfers happen in two-to-three stages. The two stage atapi
commands are those that include a dma data transfer. The data transfer
portion of these operations is handled by the hardware packet-dma
acceleration. The three-stage commands do not have a data transfer and
are handled without hardware assistance in raw frame mode.
stage1: transmit host-to-device fis to notify the device of an incoming
atapi cdb. Upon reception of the pio-setup-fis repost the task_context
to perform the dma transfer of the cdb+data (go to stage3), or repost
the task_context to transmit the cdb as a raw frame (go to stage 2).
stage2: wait for hardware notification of the cdb transmission and then
go to stage 3.
stage3: wait for the arrival of the terminating device-to-host fis and
terminate the command.
To keep the implementation simple we only support ATAPI packet-dma
protocol (for commands with data) to avoid needing to handle the data
transfer manually (like we do for SATA-PIO). This may affect
compatibility for a small number of devices (see
ATA_HORKAGE_ATAPI_MOD16_DMA).
If the data-transfer underruns, or encounters an error the
device-to-host fis is expected to arrive in the unsolicited frame queue
to pass to libata for disposition. However, in the DONE_UNEXP_FIS (data
underrun) case it appears we need to craft a response. In the
DONE_REG_ERR case we do receive the UF and propagate it to libsas.
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Most of these simple dereference macros are longer than their open coded
equivalent. Deleting enum sci_controller_mode is thrown in for good
measure.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The distinction between scic_sds_ scic_ and sci_ are no longer relevant
so just unify the prefixes on sci_. The distinction between isci_ and
sci_ is historically significant, and useful for comparing the old
'core' to the current Linux driver. 'sci_' represents the former core as
well as the routines that are closer to the hardware and protocol than
their 'isci_' brethren. sci == sas controller interface.
Also unwind the 'sds1' out of the parameter structs.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Remove the distinction between these two implementations and unify on
isci_host (local instances named ihost). Hmmm, we had two
'oem_parameters' instances, one was unused... nice.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Remove the distinction between these two implementations and unify on
isci_remote_device (local instances named idev).
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Remove the distinction between these two implementations and unify on
isci_port (local instances named iport). The duplicate '->owning_port' and
'->isci_port' in both isci_phy and isci_remote_device will be fixed in a later
patch... this is just the straightforward rename/unification.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
They are one in the same object so remove the distinction. The near
duplicate fields (owning_controller, and isci_host) will be cleaned up
after the scic_sds_contoller isci_host unification.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When the remote device transitions to a not-ready state because of
an NCQ error condition, all outstanding requests to that device
are terminated and completed to libsas on the normal path. The
device then waits for a READ LOG EXT command to issue on the task
management path.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Now that we have upleveled device reassignment protection to the
isci_remote_device reference count we no longer need this level of
self-defense.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Now that "stopping/stopped" are one in the same and signalled by a NULL device
pointer the rest of the device status infrastructure can be removed (->status
and ->state_lock). The "not ready for i/o state" is replaced with a state
flag, and is evaluated under scic_lock so that we don't see transients from
taking the device reference to submitting the i/o.
This also fixes a potential leakage of can_queue slots in the rare case that
SAS_TASK_ABORTED is set at submission.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We have unsafe references to remote devices that are notified to
disappear at lldd_dev_gone. In order to clean this up we need a single
canonical source for device lookups and stable references once a lookup
succeeds. Towards that end guarantee that domain_device.lldd_dev is
NULL as soon as we start the process of stopping a device. Any code
path that wants to safely lookup a remote device must do so through
task->dev->lldd_dev (isci_lookup_device()).
For in-flight references outside of scic_lock we need reference counting
to ensure that the device is not recycled before we are done with it.
Simplify device back references to just scic_sds_request.target_device
which is now the only permissible internal reference that is maintained
relative to the reference count.
There were two occasions where we wanted new i/o's to be treated as
SAS_TASK_UNDELIVERED but where the domain_dev->lldd_dev link is still
intact. Introduce a 'gone' flag to prevent i/o while waiting for libsas
to take action on the port down event.
One 'core' leftover is that we currently call
scic_remote_device_destruct() from isci_remote_device_deconstruct()
which is called when the 'core' says the device is stopped. It would be
more natural for the final put to trigger
isci_remote_device_deconstruct() but this implementation is deferred as
it requires other changes.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This cleans up several areas of the state machine mechanism:
o Rename sci_base_state_machine_change_state to sci_change_state
o Remove sci_base_state_machine_get_state function
o Rename 'state_machine' struct member to 'sm' in client structs
o Shorten the name of request states
o Shorten state machine state names as follows:
SCI_BASE_CONTROLLER_STATE_xxx to SCIC_xxx
SCI_BASE_PHY_STATE_xxx to SCI_PHY_xxx
SCIC_SDS_PHY_STARTING_SUBSTATE_xxx to SCI_PHY_SUB_xxx
SCI_BASE_PORT_STATE_xxx to SCI_PORT_xxx and
SCIC_SDS_PORT_READY_SUBSTATE_xxx to SCI_PORT_SUB_xxx
SCI_BASE_REMOTE_DEVICE_STATE_xxx to SCI_DEV_xxx
SCIC_SDS_STP_REMOTE_DEVICE_READY_SUBSTATE_xxx to SCI_STP_DEV_xxx
SCIC_SDS_SMP_REMOTE_DEVICE_READY_SUBSTATE_xxx to SCI_SMP_DEV_xxx
SCIC_SDS_REMOTE_NODE_CONTEXT_xxx_STATE to SCI_RNC_xxx
Signed-off-by: Edmund Nadolski <edmund.nadolski@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
cross driver constants are spread out over multiple header files, consolidate
them into isci.h, and push some includes out to the source files that need
them.
TODO: remove SCI_MODE_SIZE infrastructure.
TODO: task.h is full of inlines that are too large
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This is a requirement for 2.6.39's new libata eh.
Still some questions about lldd_dev_gone racing against dev->lldd_dev
lookups, but we are at least no more broken than mvsas in this regard.
We also short-circuit I_T_nexus_reset invocations from the device
discovery path (IDEV_EH similar to MVS_DEV_EH) to filter out the
resulting domain rediscoveries triggered by the reset.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Remove the now unused state_handler infrastructure for remote_devices.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Implement all states in scic_sds_remote_device_frame() and delete
the state handler.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Implement all states in scic_sds_remote_device_event() and delete
the state handler.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Implement all states in scic_sds_remote_device_suspend() and delete
the state handler.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Implement all states in scic_sds_remote_device_start_task() and delete
the state handler.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Implement all states in scic_sds_remote_device_complete_io() and delete
the state handler.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Implement all states in scic_sds_remote_device_start_io() and delete the
state handler.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Implement all states in scic_remote_device_reset_complete() and delete the
state handler.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Implement all states in scic_remote_device_reset() and delete the state
handler.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Implement all states in scic_remote_device_destruct() and delete the state
handler.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Implement all states in scic_remote_device_stop() and delete the state
handlers.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Implement all states in scic_remote_device_start() and delete the state
handler.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
A substate is just a state, so uplevel the smp and stp device substates.
Three tricks at work here:
1/ scic_sds_remote_device_ready_state_enter: needs to know the the device type
so it can immediately transition to a stp or smp ready substate.
2/ scic_sds_remote_device_ready_state_exit: needs to know the device type. In
the ssp case the device is no longer ready, in the stp, and smp case we have
simply exited to a ready "substate".
3/ scic_sds_remote_device_resume_complete_handler: The one location
where we directly check the current state against
SCI_BASE_REMOTE_DEVICE_STATE_READY needed to comprehend the possible ready
substates.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The 'struct sci_base_object' was removed from the struct
scic_sds_remote_device.
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
[cleaned up sci_dev_to_idev]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>