Commit Graph

175 Commits

Author SHA1 Message Date
Linus Torvalds
d3dc366bba Merge branch 'for-3.18/core' of git://git.kernel.dk/linux-block
Pull core block layer changes from Jens Axboe:
 "This is the core block IO pull request for 3.18.  Apart from the new
  and improved flush machinery for blk-mq, this is all mostly bug fixes
  and cleanups.

   - blk-mq timeout updates and fixes from Christoph.

   - Removal of REQ_END, also from Christoph.  We pass it through the
     ->queue_rq() hook for blk-mq instead, freeing up one of the request
     bits.  The space was overly tight on 32-bit, so Martin also killed
     REQ_KERNEL since it's no longer used.

   - blk integrity updates and fixes from Martin and Gu Zheng.

   - Update to the flush machinery for blk-mq from Ming Lei.  Now we
     have a per hardware context flush request, which both cleans up the
     code should scale better for flush intensive workloads on blk-mq.

   - Improve the error printing, from Rob Elliott.

   - Backing device improvements and cleanups from Tejun.

   - Fixup of a misplaced rq_complete() tracepoint from Hannes.

   - Make blk_get_request() return error pointers, fixing up issues
     where we NULL deref when a device goes bad or missing.  From Joe
     Lawrence.

   - Prep work for drastically reducing the memory consumption of dm
     devices from Junichi Nomura.  This allows creating clone bio sets
     without preallocating a lot of memory.

   - Fix a blk-mq hang on certain combinations of queue depths and
     hardware queues from me.

   - Limit memory consumption for blk-mq devices for crash dump
     scenarios and drivers that use crazy high depths (certain SCSI
     shared tag setups).  We now just use a single queue and limited
     depth for that"

* 'for-3.18/core' of git://git.kernel.dk/linux-block: (58 commits)
  block: Remove REQ_KERNEL
  blk-mq: allocate cpumask on the home node
  bio-integrity: remove the needless fail handle of bip_slab creating
  block: include func name in __get_request prints
  block: make blk_update_request print prefix match ratelimited prefix
  blk-merge: don't compute bi_phys_segments from bi_vcnt for cloned bio
  block: fix alignment_offset math that assumes io_min is a power-of-2
  blk-mq: Make bt_clear_tag() easier to read
  blk-mq: fix potential hang if rolling wakeup depth is too high
  block: add bioset_create_nobvec()
  block: use bio_clone_fast() in blk_rq_prep_clone()
  block: misplaced rq_complete tracepoint
  sd: Honor block layer integrity handling flags
  block: Replace strnicmp with strncasecmp
  block: Add T10 Protection Information functions
  block: Don't merge requests if integrity flags differ
  block: Integrity checksum flag
  block: Relocate bio integrity flags
  block: Add a disk flag to block integrity profile
  block: Add prefix to block integrity profile flags
  ...
2014-10-18 11:53:51 -07:00
Randy Dunlap
74cf298fed scsi: fix various kernel-doc problems in scsi_error.c
Convert spaces to tabs in kernel-doc notation.
Correct duplicated (copy-paste) kernel-doc comments that are incorrect.
Fix kernel-doc warning:

Warning(..//drivers/scsi/scsi_error.c:1647): No description found for parameter 'shost'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-15 16:01:57 -07:00
Joe Lawrence
a492f07545 block,scsi: fixup blk_get_request dead queue scenarios
The blk_get_request function may fail in low-memory conditions or during
device removal (even if __GFP_WAIT is set). To distinguish between these
errors, modify the blk_get_request call stack to return the appropriate
ERR_PTR. Verify that all callers check the return status and consider
IS_ERR instead of a simple NULL pointer check.

For consistency, make a similar change to the blk_mq_alloc_request leg
of blk_get_request.  It may fail if the queue is dead, or the caller was
unwilling to wait.

Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Jiri Kosina <jkosina@suse.cz> [for pktdvd]
Acked-by: Boaz Harrosh <bharrosh@panasas.com> [for osd]
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-08-28 10:03:46 -06:00
Joe Lawrence
eb571eeade block,scsi: verify return pointer from blk_get_request
The blk-core dead queue checks introduce an error scenario to
blk_get_request that returns NULL if the request queue has been
shutdown. This affects the behavior for __GFP_WAIT callers, who should
verify the return value before dereferencing.

Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Jiri Kosina <jkosina@suse.cz> [for pktdvd]
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-08-26 15:20:23 -06:00
Christoph Hellwig
7466501608 scsi: convert host_busy to atomic_t
Avoid taking the host-wide host_lock to check the per-host queue limit.
Instead we do an atomic_inc_return early on to grab our slot in the queue,
and if necessary decrement it after finishing all checks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Webb Scales <webbnh@hp.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Robert Elliott <elliott@hp.com>
2014-07-25 07:43:43 -04:00
Hannes Reinecke
91921e016a scsi: use dev_printk variants where possible
Using dev_printk variants prefixes the logging message with
the originating device, which makes debugging easier.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-07-17 22:07:42 +02:00
Bart Van Assche
fcc95a7634 scsi: remove two cancel_delayed_work() calls from the mid-layer
scsi_put_command() is either invoked before blk_start_request() or
after block layer processing has completed.  scsi_cmnd.abort_work
is scheduled from inside the SCSI timeout handler.  The block layer
guarantees that either the regular completion handler
(softirq_done_fn()) or the timeout handler (rq_timed_out_fn()) is
invoked but not both. This means that scsi_put_command() is never
invoked while abort_work is scheduled.  Hence remove the
cancel_delayed_work() call from scsi_put_command().

Similarly, scsi_abort_command() is only invoked from the SCSI
timeout handler. If scsi_abort_command() is invoked for a SCSI
command with the SCSI_EH_ABORT_SCHEDULED flag set this means that
scmd_eh_abort_handler() has already invoked scsi_queue_insert() and
hence that scsi_cmnd.abort_work is no longer pending. Hence also
remove the cancel_delayed_work() call from scsi_abort_command().

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-07-17 22:07:28 +02:00
Hannes Reinecke
a33c070bce scsi_error: set DID_TIME_OUT correctly
Any callbacks in scsi_timeout_out() might return BLK_EH_RESET_TIMER,
in which case we should leave the result alone and not set
DID_TIME_OUT, as the command didn't actually timeout.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
2014-06-24 17:00:12 +02:00
Ulrich Obergfell
8922a90890 scsi_error: fix invalid setting of host byte
After scsi_try_to_abort_cmd returns, the eh_abort_handler may have
already found that the command has completed in the device, causing
the host_byte to be nonzero (e.g. it could be DID_ABORT).  When
this happens, ORing DID_TIME_OUT into the host byte will corrupt
the result field and initiate an unwanted command retry.

Fix this by using set_host_byte instead, following the model of
commit 2082ebc45a.

Cc: stable@vger.kernel.org
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
[Fix all instances according to review comments. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-06-24 17:00:12 +02:00
Linus Torvalds
23d4ed53b7 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
 "Final small batch of fixes to be included before -rc1.  Some general
  cleanups in here as well, but some of the blk-mq fixes we need for the
  NVMe conversion and/or scsi-mq.  The pull request contains:

   - Support for not merging across a specified "chunk size", if set by
     the driver.  Some NVMe devices perform poorly for IO that crosses
     such a chunk, so we need to support it generically as part of
     request merging avoid having to do complicated split logic.  From
     me.

   - Bump max tag depth to 10Ki tags.  Some scsi devices have a huge
     shared tag space.  Before we failed with EINVAL if a too large tag
     depth was specified, now we truncate it and pass back the actual
     value.  From me.

   - Various blk-mq rq init fixes from me and others.

   - A fix for enter on a dying queue for blk-mq from Keith.  This is
     needed to prevent oopsing on hot device removal.

   - Fixup for blk-mq timer addition from Ming Lei.

   - Small round of performance fixes for mtip32xx from Sam Bradshaw.

   - Minor stack leak fix from Rickard Strandqvist.

   - Two __init annotations from Fabian Frederick"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: add __init to blkcg_policy_register
  block: add __init to elv_register
  block: ensure that bio_add_page() always accepts a page for an empty bio
  blk-mq: add timer in blk_mq_start_request
  blk-mq: always initialize request->start_time
  block: blk-exec.c: Cleaning up local variable address returnd
  mtip32xx: minor performance enhancements
  blk-mq: ->timeout should be cleared in blk_mq_rq_ctx_init()
  blk-mq: don't allow queue entering for a dying queue
  blk-mq: bump max tag depth to 10K tags
  block: add blk_rq_set_block_pc()
  block: add notion of a chunk size for request merging
2014-06-11 08:41:17 -07:00
Jens Axboe
f27b087b81 block: add blk_rq_set_block_pc()
With the optimizations around not clearing the full request at alloc
time, we are leaving some of the needed init for REQ_TYPE_BLOCK_PC
up to the user allocating the request.

Add a blk_rq_set_block_pc() that sets the command type to
REQ_TYPE_BLOCK_PC, and properly initializes the members associated
with this type of request. Update callers to use this function instead
of manipulating rq->cmd_type directly.

Includes fixes from Christoph Hellwig <hch@lst.de> for my half-assed
attempt.

Signed-off-by: Jens Axboe <axboe@fb.com>
2014-06-06 07:57:37 -06:00
Hannes Reinecke
ac61d19559 scsi: set correct completion code in scsi_send_eh_cmnd()
->queuecommand returns '0' for successful command submission,
so we need to set the correct SCSI midlayer return value
when calling scsi_log_completion().

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reported-by: Robert Elliott <elliott@hp.com>
Cc: Stephen Cameron <scameron@beardog.cce.hp.com>
Tested-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-05-19 12:35:11 +02:00
Christoph Hellwig
95eeb5f588 scsi: handle command allocation failure in scsi_reset_provider
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-05-19 12:35:10 +02:00
James Bottomley
c69e6f812b [SCSI] More USB deadlock fixes
This patch fixes a corner case in the previous USB Deadlock fix patch (12023e7
[SCSI] Fix USB deadlock caused by SCSI error handling).

The scenario is abort command, set flag, abort completes, send TUR, TUR
doesn't return, so we now try to abort the TUR, but scsi_abort_eh_cmnd()
will skip the abort because the flag is set and move straight to reset.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-04-21 14:28:40 -07:00
Hannes Reinecke
7daf480483 [SCSI] Fix USB deadlock caused by SCSI error handling
USB requires that every command be aborted first before we escalate to reset.
In particular, USB will deadlock if we try to reset first before aborting the
command.

Unfortunately, the flag we use to tell if a command has already been aborted:
SCSI_EH_ABORT_SCHEDULED is not cleared properly leading to cases where we can
requeue a command with the flag set and proceed immediately to reset if it
fails (thus causing USB to deadlock).

Fix by clearing the SCSI_EH_ABORT_SCHEDULED flag if it has been set.  Which
means this will be the second time scsi_abort_command() has been called for
the same command.  IE the first abort went out, did its thing, but now the
same command has timed out again.

So this flag gets cleared, and scsi_abort_command() returns FAILED, and _no_
asynchronous abort is being scheduled.  scsi_times_out() will then proceed to
call scsi_eh_scmd_add().  But as we've cleared the SCSI_EH_ABORT_SCHEDULED
flag the SCSI_EH_CANCEL_CMD flag will continue to be set, and the command will
be aborted with the main SCSI EH routine.

Reported-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Andreas Reis <andreas.reis@gmail.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-04-21 14:28:26 -07:00
Alan Stern
644373a421 [SCSI] Fix command result state propagation
We're seeing a case where the contents of scmd->result isn't being reset after
a SCSI command encounters an error, is resubmitted, times out and then gets
handled.  The error handler acts on the stale result of the previous error
instead of the timeout.  Fix this by properly zeroing the scmd->status before
the command is resubmitted.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-04-21 14:27:26 -07:00
James Bottomley
d555a2abf3 [SCSI] Fix spurious request sense in error handling
We unconditionally execute scsi_eh_get_sense() to make sure all failed
commands that should have sense attached, do.  However, the routine forgets
that some commands, because of the way they fail, will not have any sense code
... we should not bother them with a REQUEST_SENSE command.  Fix this by
testing to see if we actually got a CHECK_CONDITION return and skip asking for
sense if we don't.

Tested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-04-21 14:27:05 -07:00
Christoph Hellwig
0479633686 [SCSI] do not manipulate device reference counts in scsi_get/put_command
Many callers won't need this and we can optimize them away.  In addition
the handling in the __-prefixed variants was inconsistant to start with.

Based on an earlier patch from Bart Van Assche.

[jejb: fix kerneldoc probelm picked up by Fengguang Wu]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-03-15 10:19:24 -07:00
Ren Mingxin
bb3b621a33 [SCSI] Set the minimum valid value of 'eh_deadline' as 0
The former minimum valid value of 'eh_deadline' is 1s, which means
the earliest occasion to shorten EH is 1 second later since a
command is failed or timed out. But if we want to skip EH steps
ASAP, we have to wait until the first EH step is finished. If the
duration of the first EH step is long, this waiting time is
excruciating. So, it is necessary to accept 0 as the minimum valid
value for 'eh_deadline'.

According to my test, with Hannes' patchset 'New EH command timeout
handler' as well, the minimum IO time is improved from 73s
(eh_deadline = 1) to 43s(eh_deadline = 0) when commands are timed
out by disabling RSCN and target port.

Signed-off-by: Ren Mingxin <renmx@cn.fujitsu.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-12-19 07:39:02 -08:00
Hannes Reinecke
76ad3e5956 [SCSI] Unlock accesses to eh_deadline
32bit accesses are guaranteed to be atomic, so we can remove
the spinlock when checking for eh_deadline. We only need to
make sure to catch any updates which might happened during
the call to time_before(); if so we just recheck with the
correct value.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-12-19 07:39:02 -08:00
Hannes Reinecke
e494f6a728 [SCSI] improved eh timeout handler
When a command runs into a timeout we need to send an 'ABORT TASK'
TMF. This is typically done by the 'eh_abort_handler' LLDD callback.

Conceptually, however, this function is a normal SCSI command, so
there is no need to enter the error handler.

This patch implements a new scsi_abort_command() function which
invokes an asynchronous function scsi_eh_abort_handler() to
abort the commands via the usual 'eh_abort_handler'.

If abort succeeds the command is either retried or terminated,
depending on the number of allowed retries. However, 'eh_eflags'
records the abort, so if the retry would fail again the
command is pushed onto the error handler without trying to
abort it (again); it'll be cleared up from SCSI EH.

[hare: smatch detected stray switch fixed]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-12-19 07:39:02 -08:00
James Bottomley
2451079bc2 [SCSI] Fix erratic device offline during EH
Commit 18a4d0a22e
(Handle disk devices which can not process medium access commands)
was introduced to offline any device which cannot process medium
access commands.
However, commit 3eef6257de
(Reduce error recovery time by reducing use of TURs) reduced
the number of TURs by sending it only on the first failing
command, which might or might not be a medium access command.
So in combination this results in an erratic device offlining
during EH; if the command where the TUR was sent upon happens
to be a medium access command the device will be set offline,
if not everything proceeds as normal.

This patch moves the check to the final test, eliminating
this problem.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-12-19 07:39:02 -08:00
Hannes Reinecke
6fd046f960 [SCSI] scsi_error: Escalate to LUN reset if abort fails
If a command abort fails there is a fair chance that all other
aborts will be failing, too.
So we should be calling LUN reset directly after the first failed
abort and skip aborting the remaining commands.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-10-25 12:18:30 +01:00
Hannes Reinecke
b45620229d [SCSI] Add 'eh_deadline' to limit SCSI EH runtime
This patchs adds an 'eh_deadline' sysfs attribute to the scsi
host which limits the overall runtime of the SCSI EH.
The 'eh_deadline' value is stored in the now obsolete field
'resetting'.
When a command is failed the start time of the EH is stored
in 'last_reset'. If the overall runtime of the SCSI EH is longer
than last_reset + eh_deadline, the EH is short-circuited and
falls through to issue a host reset only.

[jejb: add comments in Scsi_Host about new fields]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-10-25 12:17:59 +01:00
Ewan D. Milne
279afdfe78 [SCSI] Generate uevents on certain unit attention codes
Generate a uevent when the following Unit Attention ASC/ASCQ
codes are received:

    2A/01  MODE PARAMETERS CHANGED
    2A/09  CAPACITY DATA HAS CHANGED
    38/07  THIN PROVISIONING SOFT THRESHOLD REACHED
    3F/03  INQUIRY DATA HAS CHANGED
    3F/0E  REPORTED LUNS DATA HAS CHANGED

Log kernel messages when the following Unit Attention ASC/ASCQ
codes are received that are not as specific as those above:

    2A/xx  PARAMETERS CHANGED
    3F/xx  TARGET OPERATING CONDITIONS HAVE CHANGED

Added logic to set expecting_lun_change for other LUNs on the target
after REPORTED LUNS DATA HAS CHANGED is received, so that duplicate
uevents are not generated, and clear expecting_lun_change when a
REPORT LUNS command completes, in accordance with the SPC-3
specification regarding reporting of the 3F 0E ASC/ASCQ UA.

[jejb: remove SPC3 test in scsi_report_lun_change and some docbook fixes and
       unused variable fix, both reported by Fengguang Wu]
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-08-26 18:52:27 +04:00
Hannes Reinecke
7e782af576 [SCSI] Return ENODATA on medium error
When a medium error is detected the SCSI stack should return
ENODATA to the upper layers.

[jejb: fix whitespace error]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-08-23 12:54:53 -04:00
Hannes Reinecke
a9d6ceb838 [SCSI] return ENOSPC on thin provisioning failure
When the thin provisioning hard threshold is reached we
should return ENOSPC to inform upper layers about this fact.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-08-23 12:43:54 -04:00
Hannes Reinecke
87f14e658f [SCSI] Set hostbyte status in scsi_check_sense()
We should be modifying the host_byte status in scsi_check_sense()
directly; this saves us to introduce a special return code for
each and every condition.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-08-23 12:34:56 -04:00
Linus Torvalds
84cbd7222b SCSI misc on 20130702
The patch set is mostly driver updates (usf, zfcp, lpfc, mpt2sas,
 megaraid_sas, bfa, ipr) and a few bug fixes.  Also of note is that the
 Buslogic driver has been rewritten to a better coding style and 64 bit support
 added.  We also removed the libsas limitation on 16 bytes for the command size
 (currently no drivers make use of this).
 
 Signed-off-by: James Bottomley <JBottomley@Parallels.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJR0ugCAAoJEDeqqVYsXL0MX2sH+gOkWuy5p3igz+VEim8TNaOA
 VV5EIxG1v7Q0ZiXCp/wcF6eqhgQkWvkrKSxWkaN0yzq8LEWfQeY7VmFDbGgFeVUZ
 XMlX5ay8+FLCIK9M76oxwhV7VAXYbeUUZafh+xX6StWCdKrl0eJbicOGoUk/pjsi
 ZjCBpK5BM0SW+s2gMSDQhO2eMsgMp9QrJMiCJHUF1wWPN8Yez6va1tg4b9iW39BZ
 dd3sJq+PuN6yDbYAJIjEpiGF9gDaaYxSE6bTKJuY+oy08+VsP/RRWjorTENs9Aev
 rQXZIC3nwsv26QRSX7RDSj+UE+kFV6FcPMWMU3HN2UG6ttprtOxT8tslVJf7LcA=
 =BxtF
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull first round of SCSI updates from James Bottomley:
 "The patch set is mostly driver updates (usf, zfcp, lpfc, mpt2sas,
  megaraid_sas, bfa, ipr) and a few bug fixes.  Also of note is that the
  Buslogic driver has been rewritten to a better coding style and 64 bit
  support added.  We also removed the libsas limitation on 16 bytes for
  the command size (currently no drivers make use of this)"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (101 commits)
  [SCSI] megaraid: minor cut and paste error fixed.
  [SCSI] ufshcd-pltfrm: remove unnecessary dma_set_coherent_mask() call
  [SCSI] ufs: fix register address in UIC error interrupt handling
  [SCSI] ufshcd-pltfrm: add missing empty slot in ufs_of_match[]
  [SCSI] ufs: use devres functions for ufshcd
  [SCSI] ufs: Fix the response UPIU length setting
  [SCSI] ufs: rework link start-up process
  [SCSI] ufs: remove version check before IS reg clear
  [SCSI] ufs: amend interrupt configuration
  [SCSI] ufs: wrap the i/o access operations
  [SCSI] storvsc: Update the storage protocol to win8 level
  [SCSI] storvsc: Increase the value of scsi timeout for storvsc devices
  [SCSI] MAINTAINERS: Add myself as the maintainer for BusLogic SCSI driver
  [SCSI] BusLogic: Port driver to 64-bit.
  [SCSI] BusLogic: Fix style issues
  [SCSI] libiscsi: Added new boot entries in the session sysfs
  [SCSI] aacraid: Fix for arrays are going offline in the system. System hangs
  [SCSI] ipr: IOA Status Code(IOASC) update
  [SCSI] sd: Update WRITE SAME heuristics
  [SCSI] fnic: potential dead lock in fnic_is_abts_pending()
  ...
2013-07-04 12:30:30 -07:00
Martin K. Petersen
0816c9251a [SCSI] Allow error handling timeout to be specified
Introduce eh_timeout which can be used for error handling purposes. This
was previously hardcoded to 10 seconds in the SCSI error handling
code. However, for some fast-fail scenarios it is necessary to be able
to tune this as it can take several iterations (bus device, target, bus,
controller) before we give up.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-06-04 11:16:24 -07:00
Geert Uytterhoeven
c2b3ebd0d2 scsi: Spelling hsot -> host
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-05-28 12:02:12 +02:00
Hannes Reinecke
fc73648a50 [SCSI] Handle MLQUEUE busy response in scsi_send_eh_cmnd
scsi_send_eh_cmnd() is calling queuecommand() directly, so
it needs to check the return value here.
The only valid return codes for queuecommand() are 'busy'
states, so we need to wait for a bit to allow the LLDD
to recover.

Based on an earlier patch from Wen Xiong.

[jejb: fix confusion between msec and jiffies values and other issues]
[bvanassche: correct stall_for interval]
Cc: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-05-10 07:47:44 -07:00
Li Zhong
329a402cb0 [SCSI] Shorten the path length of scsi_cmd_to_driver()
This patch tries to shorten the path length of scsi_cmd_to_driver(). As only
REQ_TYPE_BLOCK_PC commands can be submitted without a driver, so we could
avoid the related NULL checking, as long as we make sure we don't use it for
REQ_TYPE_BLOCK_PC type commands. Plus, this fixes a bug where you get
different behaviors from REQ_TYPE_BLOCK_PC commands when a driver is and isn't
attached.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-10-09 12:04:42 +01:00
James Bottomley
14216561e1 [SCSI] Fix 'Device not ready' issue on mpt2sas
This is a particularly nasty SCSI ATA Translation Layer (SATL) problem.

SAT-2 says (section 8.12.2)

        if the device is in the stopped state as the result of
        processing a START STOP UNIT command (see 9.11), then the SATL
        shall terminate the TEST UNIT READY command with CHECK CONDITION
        status with the sense key set to NOT READY and the additional
        sense code of LOGICAL UNIT NOT READY, INITIALIZING COMMAND
        REQUIRED;

mpt2sas internal SATL seems to implement this.  The result is very confusing
standby behaviour (using hdparm -y).  If you suspend a drive and then send
another command, usually it wakes up.  However, if the next command is a TEST
UNIT READY, the SATL sees that the drive is suspended and proceeds to follow
the SATL rules for this, returning NOT READY to all subsequent commands.  This
means that the ordering of TEST UNIT READY is crucial: if you send TUR and
then a command, you get a NOT READY to both back.  If you send a command and
then a TUR, you get GOOD status because the preceeding command woke the drive.

This bit us badly because

commit 85ef06d1d2
Author: Tejun Heo <tj@kernel.org>
Date:   Fri Jul 1 16:17:47 2011 +0200

    block: flush MEDIA_CHANGE from drivers on close(2)

Changed our ordering on TEST UNIT READY commands meaning that SATA drives
connected to an mpt2sas now suspend and refuse to wake (because the mpt2sas
SATL sees the suspend *before* the drives get awoken by the next ATA command)
resulting in lots of failed commands.

The standard is completely nuts forcing this inconsistent behaviour, but we
have to work around it.

The fix for this is twofold:

   1. Set the allow_restart flag so we wake the drive when we see it has been
      suspended

   2. Return all TEST UNIT READY status directly to the mid layer without any
      further error handling which prevents us causing error handling which
      may offline the device just because of a media check TUR.

Reported-by: Matthias Prager <linux@matthiasprager.de>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-08-22 09:42:54 +04:00
Dan Williams
b9d5c6b7ef [SCSI] cleanup setting task state in scsi_error_handler()
A quick reading of scsi_error_handler() one could come away with the
impression that it does its wakeup event check while the task state is
TASK_RUNNING.  In fact it sets TASK_INTERRUPTIBLE at the bottom of the
loop, but that is ~50 lines down.

Just set TASK_INTERRUPTIBLE at the top of loop and be done.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:47 +01:00
Dan Williams
57fc2e335f [SCSI] fix eh wakeup (scsi_schedule_eh vs scsi_restart_operations)
Rapid ata hotplug on a libsas controller results in cases where libsas
is waiting indefinitely on eh to perform an ata probe.

A race exists between scsi_schedule_eh() and scsi_restart_operations()
in the case when scsi_restart_operations() issues i/o to other devices
in the sas domain.  When this happens the host state transitions from
SHOST_RECOVERY (set by scsi_schedule_eh) back to SHOST_RUNNING and
->host_busy is non-zero so we put the eh thread to sleep even though
->host_eh_scheduled is active.

Before putting the error handler to sleep we need to check if the
host_state needs to return to SHOST_RECOVERY for another trip through
eh.  Since i/o that is released by scsi_restart_operations has been
blocked for at least one eh cycle, this implementation allows those
i/o's to run before another eh cycle starts to discourage hung task
timeouts.

Cc: <stable@vger.kernel.org>
Reported-by: Tom Jackson <thomas.p.jackson@intel.com>
Tested-by: Tom Jackson <thomas.p.jackson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-20 08:58:46 +01:00
Linus Torvalds
e8650a0823 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial updates from Jiri Kosina:
 "As usual, it's mostly typo fixes, redundant code elimination and some
  documentation updates."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (57 commits)
  edac, mips: don't change code that has been removed in edac/mips tree
  xtensa: Change mail addresses of Hannes Weiner and Oskar Schirmer
  lib: Change mail address of Oskar Schirmer
  net: Change mail address of Oskar Schirmer
  arm/m68k: Change mail address of Sebastian Hess
  i2c: Change mail address of Oskar Schirmer
  net: Fix tcp_build_and_update_options comment in struct tcp_sock
  atomic64_32.h: fix parameter naming mismatch
  Kconfig: replace "--- help ---" with "---help---"
  c2port: fix bogus Kconfig "default no"
  edac: Fix spelling errors.
  qla1280: Remove redundant NULL check before release_firmware() call
  remoteproc: remove redundant NULL check before release_firmware()
  qla2xxx: Remove redundant NULL check before release_firmware() call.
  aic94xx: Get rid of redundant NULL check before release_firmware() call
  tehuti: delete redundant NULL check before release_firmware()
  qlogic: get rid of a redundant test for NULL before call to release_firmware()
  bna: remove redundant NULL test before release_firmware()
  tg3: remove redundant NULL test before release_firmware() call
  typhoon: get rid of redundant conditional before all to release_firmware()
  ...
2012-05-22 19:22:50 -07:00
Santosh Y
3b729f7647 scsi: fix comment spelling fix recory->recovery
Signed-off-by: Santosh Y <santoshsy@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-04-16 14:35:30 +02:00
Martin K. Petersen
919f797a4c SCSI: Fix error handling when no ULD is attached
Commit 18a4d0a22e ("[SCSI] Handle disk devices which can not process
medium access commands") introduced a bug in which we would attempt to
dereference the scsi driver even when the device had no ULD attached.

Ensure that a driver is registered and make the driver accessor function
more resilient to errors during device discovery.

Reported-by: Elric Fu <elricfu1@gmail.com>
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-15 11:08:53 -07:00
Martin K. Petersen
18a4d0a22e [SCSI] Handle disk devices which can not process medium access commands
We have experienced several devices which fail in a fashion we do not
currently handle gracefully in SCSI. After a failure these devices will
respond to the SCSI primary command set (INQUIRY, TEST UNIT READY, etc.)
but any command accessing the storage medium will time out.

The following patch adds an callback that can be used by upper level
drivers to inspect the results of an error handling command. This in
turn has been used to implement additional checking in the SCSI disk
driver.

If a medium access command fails twice but TEST UNIT READY succeeds both
times in the subsequent error handling we will offline the device. The
maximum number of failed commands required to take a device offline can
be tweaked in sysfs.

Also add a new error flag to scsi_debug which allows this scenario to be
easily reproduced.

[jejb: fix up integer parsing to use kstrtouint]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 10:14:52 -06:00
Mike Snitzer
47ac56db13 [SCSI] scsi_error: classify some ILLEGAL_REQUEST sense as a permanent TARGET_ERROR
Permanent target failures are non-retryable and should be classified as
TARGET_ERROR; otherwise dm-multipath will retry an IO request that will
always fail at the target.

A SCSI command that fails with ILLEGAL_REQUEST sense and Additional
sense 0x20, 0x21, 0x24 or 0x26 represents a permanent TARGET_ERROR.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 09:39:59 -06:00
Moger, Babu
2082ebc45a [SCSI] fix the new host byte settings (DID_TARGET_FAILURE and DID_NEXUS_FAILURE)
This patch fixes the host byte settings DID_TARGET_FAILURE and
DID_NEXUS_FAILURE.  The function __scsi_error_from_host_byte, tries to reset
the host byte to DID_OK. But that does not happen because of the OR operation.

Here is the flow.

scsi_softirq_done-> scsi_decide_disposition -> __scsi_error_from_host_byte

Let's take an example with DID_NEXUS_FAILURE. In scsi_decide_disposition,
result will be set as DID_NEXUS_FAILURE (=0x11). Then in
__scsi_error_from_host_byte, when we do OR with DID_OK.  Purpose is to reset
it back to DID_OK. But that does not happen.  This patch fixes this issue.

Signed-off-by: Babu Moger <babu.moger@netapp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-19 08:08:59 -06:00
Lin Ming
ae0751ffc7 [SCSI] add flag to skip the runtime PM calls on the host
With previous change, now the ata port runtime suspend will happen as:

disk suspend --> scsi target suspend --> scsi host suspend --> ata port
suspend

ata port(parent device) suspend need to schedule scsi EH which will resume
scsi host(child device). Then the child device resume will in turn make
parent device resume first. This is kind of recursive.

This patch adds a new flag Scsi_Host::eh_noresume.
ata port will set this flag to skip the runtime PM calls on scsi host.

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2012-01-08 19:14:57 -05:00
TARUISI Hiroaki
dfcf777581 [SCSI] Fix out of spec CD-ROM problem with media change
Some CD-ROMs fail to report a media change correctly.  The specific
one for this patch simply fails to respond to commands, then gives a
UNIT ATTENTION after being reset which returns ASC/ASCQ 28/00.  This
is out of spec behaviour, but add a check in the eat CC/UA on reset
path to catch this case so the CD-ROM will function somewhat properly.

[jejb: fixed up white space and accepted without signoff]
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-08-27 08:36:41 -06:00
David Jeffery
3eef6257de [SCSI] Reduce error recovery time by reducing use of TURs
In error recovery, most scsi error recovery stages will send a TUR command
for every bad command when a driver's error handler reports success.  When
several bad commands to the same device, this results in a device
being probed multiple times.

This becomes very problematic if the device or connection is in a state
where the device still doesn't respond to commands even after a recovery
function returns success.  The error handler must wait for the test
commands to time out.  The time waiting for the redundant commands can
drastically lengthen error recovery.

This patch alters the scsi mid-layer's error routines to send test commands
once per device instead of once per bad command.  This can drastically
lower error recovery time.

[jejb: fixed up whitespace and formatting]
Signed-of-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: James Bottomley <jbottomley@parallels.com>
2011-05-24 12:51:53 -04:00
Shyam Iyer
deb1cb63d2 [SCSI] Log thin provisioning threshold event
At least log the message that we received a THIN PROVISIONING SOFT
THRESHOLD REACHED Unit Attention.  Also added it to unit attention
decodes.

Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-15 16:29:25 -05:00
Jesper Juhl
0bf8c86970 Reduce sequential pointer derefs in scsi_error.c and reduce size as well
This patch reduces the number of sequential pointer derefs in
drivers/scsi/scsi_error.c

This has been submitted a number of times over a couple of years.  I
believe this version adresses all comments it has gathered over time.
Please apply or reject with a reason.

The benefits are:

 - makes the code easier to read.  Lots of sequential derefs of the same
   pointers is not easy on the eye.

 - theoretically at least, just dereferencing the pointers once can
   allow the compiler to generally slightly faster code, so in theory
   this could also be a micro speed optimization.

 - reduces size of object file (tiny effect: on x86-64, in at least one
   configuration, the text size decreased from 9439 bytes to 9400)

 - removes some pointless (mostly trailing) whitespace.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-21 15:54:35 -07:00
Hannes Reinecke
63583cca74 [SCSI] Add detailed SCSI I/O errors
Instead of just passing 'EIO' for any I/O error we should be
notifying the upper layers with more details about the cause
of this error.

Update the possible I/O errors to:

- ENOLINK: Link failure between host and target
- EIO: Retryable I/O error
- EREMOTEIO: Non-retryable I/O error
- EBADE: I/O error restricted to the I_T_L nexus

'Retryable' in this context means that an I/O error _might_ be
restricted to the I_T_L nexus (vulgo: path), so retrying on another
nexus / path might succeed.

'Non-retryable' in general refers to a target failure, so this
error will always be generated regardless of the I_T_L nexus
it was send on.

I/O errors restricted to the I_T_L nexus might be retried
on another nexus / path, but they should _not_ be queued
if no paths are available.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 10:33:08 -06:00
James Bottomley
98db519573 [SCSI] fix id computation in scsi_eh_target_reset()
The current code in scsi_eh_target_reset() has an off by one error
that actually sends spurious extra resets.  Since there's no real need
to reset the targets in numerical order, simply chunk up the command
recovery list doing target resets and pulling matching targets out of
the list (that also makes the loop O(N) instead of O(N^2).

[mike christie found and fixed a list_splice -> list_splice_init problem]

Reported-by: Hillf Danton<dhillf@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-12-21 12:23:56 -06:00
James Bottomley
459dbf72e4 [SCSI] Eliminate error handler overload of the SCSI serial number
The error handler is using the test cmd->serial_number == 0 in the
abort routines to signal that the command to be aborted has already
completed normally.  This design was to close a race window in the
original error handler where a command could go through the normal
completion routines after it timed out but before error handling was
started.

Mike Anderson pointed out that when we converted our timeout and
softirq completions, we picked up atomicity here because the block
layer now mediates this with the REQ_ATOM_COMPLETE flag and guarantees
that *either* the command times out or our done routine is called, but
ensures we can't get both occurring.  That makes the serial number
zero check redundant and it can be removed.

Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-12-09 09:41:16 -06:00