We have several points in the SCSI stack (primarily for our device
functions) where we need to guarantee process context, but (given the
place where the last reference was released) we cannot guarantee this.
This API gets around the issue by executing the function directly if
the caller has process context, but scheduling a workqueue to execute
in process context if the caller doesn't have it. Unfortunately, it
requires memory allocation in interrupt context, but it's better than
what we have previously. The true solution will require a bit of
re-engineering, so isn't appropriate for 2.6.16.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When the scsi_execute_async interface was added it ended up reducing
the flexibility of userspace to send arbitrary scsi commands through
sg using SG_IO. The SG_IO interface allows userspace to specify the
CDB length. This is now ignored in scsi_execute_async and it is
guessed using the COMMAND_SIZE macro, which is not always correct,
particularly for vendor specific commands. This patch adds a cmd_len
parameter to the scsi_execute_async interface to allow the caller
to specify the length of the CDB.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
LLDDs should never see REQ_BLOCK_PC requests, we can handle them just
fine in the core code. There is a small behaviour change in that some
check in sr's rw_intr are bypassed, but I consider the old behaviour
a bug.
Mike found this cleanup opportunity and provdided early patches, so all
the credit goes to him, even if I redid the patches from scratch beause
that was easier than forward-porting the old patches.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch moves the SCSI softirq handling to the block layer version.
There should be no functional changes.
Signed-off-by: Jens Axboe <axboe@suse.de>
All ordered request related stuff delegated to HLD. Midlayer
now doens't deal with ordered setting or prepare_flush
callback. sd.c updated to deal with blk_queue_ordered
setting. Currently, ordered tag isn't used as SCSI midlayer
cannot guarantee request ordering.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
add @uptodate argument to end_that_request_last() and @error
to rq_end_io_fn(). there's no generic way to pass error code
to request completion function, making generic error handling
of non-fs request difficult (rq->errors is driver-specific and
each driver uses it differently). this patch adds @uptodate
to end_that_request_last() and @error to rq_end_io_fn().
for fs requests, this doesn't really matter, so just using the
same uptodate argument used in the last call to
end_that_request_first() should suffice. imho, this can also
help the generic command-carrying request jens is working on.
Signed-off-by: tejun heo <htejun@gmail.com>
Signed-Off-By: Jens Axboe <axboe@suse.de>
- export __blk_put_request and blk_execute_rq_nowait
needed for async REQ_BLOCK_PC requests
- seperate max_hw_sectors and max_sectors for block/scsi_ioctl.c and
SG_IO bio.c helpers per Jens's last comments. Since block/scsi_ioctl.c SG_IO was
already testing against max_sectors and SCSI-ml was setting max_sectors and
max_hw_sectors to the same value this does not change any scsi SG_IO behavior. It only
prepares ll_rw_blk.c, scsi_ioctl.c and bio.c for when SCSI-ml begins to set
a valid max_hw_sectors for all LLDs. Today if a LLD does not set it
SCSI-ml sets it to a safe default and some LLDs set it to a artificial low
value to overcome memory and feedback issues.
Note: Since we now cap max_sectors to BLK_DEF_MAX_SECTORS, which is 1024,
drivers that used to call blk_queue_max_sectors with a large value of
max_sectors will now see the fs requests capped to BLK_DEF_MAX_SECTORS.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add kmemcache of scsi io contexts.
In the future when we finalize on where these functions will live
we can add a mempool for it and do a bioset for out REQ_BLOCK_PC
bios. This is needed becuase the dm-multipath handlers will
want to use the scsi_exectute* functions for failover and we cannot
have them and the bio device allocating from the same mempool.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
sd does not allow scsi_io_completion to retry commands for
SG_IO requests, and it make sense that it should not happen for st
SG_IO commands too. If for st we hit the bottom of scsi_io_completion
we will probably screw things up pretty bad. This patch returns to the
block layer that the whole command completed and relies on the caller to check
the request errors field. For initialization commands like in sd, this adds
the previous behavior where scsi_io_completion did not process the error.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
For tape we need to control the retries. This patch adds a retries
counter on the request for REQ_BLOCK_PC commands originating from
scsi_execute* to use. REQ_BLOCK_PC commands comming from the block
layer SG_IO path continue to use the retires set in the ULD init_command.
(scsi_execute* does not set the gendisk so we do not execute
the init_command in that path).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add scsi helpers to create really-large-requests and convert
scsi-ml to scsi_execute_async().
Per Jens's previous comments, I placed this function in scsi_lib.c.
I made it follow all the queue's limits - I think I did at least :), so
I removed the warning on the function header.
I think the scsi_execute_* functions should eventually take a request_queue
and be placed some place where the dm-multipath hw_handler can use them
if that failover code is going to stay in the kernel. That conversion
patch will be sent in another mail though.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This reverts commit 1b0997f561, which in
turn reverted 34ea80ec6a (which is thus
re-instated).
Quoth James Bottomley:
"All it's doing is deferring the device_put() from the
scsi_put_command() to after the scsi_run_queue(), which doesn't fix
the sleep while atomic problem of the device release method. In both
cases we still get the semaphore in atomic context problem which is
caused by scsi_reap_target() doing a device_del(), which I assumed
(wrongly) was valid from atomic context."
who also promised to fix scsi_reap_target().
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This reverts commit 34ea80ec6a.
It does a put_device() from softirq context, which is bad since it gets
a semaphore for reading.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The problem is that scsi_run_queue is called from scsi_next_command()
after doing a scsi_put_command. If the command was the only thing
holding the reference on the scsi_device then the resulting device put
will tear down the block queue. Fix this by taking a reference to the
device and holding it around scsi_run_queue()
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This function has been superceeded by the block request based interfaces
and is unused (except for the uncompilable cpqfc driver).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This should eliminate (at least in the mid layer) to make numeric
assumptions about any of the enumeration variables. As a side effect,
it will also make all the messages consistent and line us up nicely for
the error logging strategy (if it ever shows itself again).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When a request is deferred in scsi_init_io because the sg table could not
be allocated, the associated scsi_cmnd is not released and the request is
not marked with REQ_DONTPREP. When the command is retried, if
scsi_prep_fn decides to kill it then the scsi_cmnd will never be released.
This patch (as573) changes scsi_init_io so that it calls scsi_put_command
before deferring a request.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We fix the oops by enforcing the host state model. There have also
been two extra states added: SHOST_CANCEL_RECOVERY and
SHOST_DEL_RECOVERY so we can take the model through host removal while
the recovery thread is active.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
I found one other thing that needs to be fixed. The call to
scsi_release_buffers in scsi_unprep_request causes an oops, because the
sgtable has already been freed in scsi_io_completion. The following patch
is needed.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
On Wed, 2005-09-14 at 18:06 +1000, Anton Blanchard wrote:
> And in particular it looks like the scsi_unprep_request in
> scsi_queue_insert is causing it. The following patch fixes the boot
> problems on the vscsi machine:
OK, my fault. Your fix is almost correct .. I was going to do this
eventually, honest, because there's no need to unprep and reprep a
command that comes in through scsi_queue_insert().
However, I decided to leave it in to exercise the scsi_unprep_request()
path just to make sure it was working. What's happening, I think, is
that we also use this path for retries. Since we kill and reget the
command each time, the retries decrement is never seen, so we're
retrying forever.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This fixes an issue in scsi command initialization from a request
where sd, sr, st, and scsi_lib all fail to copy the request's
cmd_len to the scsi command's cmd_len field.
Signed-off-by: Timothy Thelin <timothy.thelin@wdc.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
set DID_NO_CONNECT for the BLKPREP_KILL case and correct a few
BLKPREP_DEFER cases that weren't checking for the need to plug the
queue.
Signed-Off-By: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Actually, just one problem and one cosmetic fix:
1) We need to dequeue for the loop and kill case (it seems easiest
simply to dequeue in the scsi_kill_request() routine)
2) There's no real need to drop the queue lock. __scsi_done() is lock
agnostic, so since there's no requirement, let's just leave it in to
avoid any locking issues.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
From: Alan Stern <stern@rowland.harvard.edu>
This patch (as559b) adds a new routine, scsi_unprep_request, which
gets called every place a request is requeued. (That includes
scsi_queue_insert as well as scsi_requeue_command.) It also changes
scsi_kill_requests to make it call __scsi_done with result equal to
DID_NO_CONNECT << 16. (I'm not sure if it's necessary to call
scsi_init_cmd_errh here; maybe you can check on that.) Finally, the
patch changes the return value from scsi_end_request, to avoid
returning a stale pointer in the case where the request was requeued.
Fortunately the return value is used in only place, and the change
actually simplified it.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Rejections fixed up and
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If a filesystem, while writing out data, decides that it is good
to issue a cache flush on a SCSI drive (or other 'sd' device), it will
call blkdev_issue_flush which calls ->issue_flush_fn which is
scsi_issue_flush_fn.
This calls sd_issue_flush which calls sd_sync_cache, which calls
scsi_execute_request.
This will (as sshdr != NULL) call
kmalloc(SCSI_SENSE_BUFFERSIZE, GFP_KERNEL)
If memory is tight, the presence of GFP_KERNEL may cause write
requests to be sent to some filesystem to free up memory, however if
that filesystem is waiting for the issue_flush_fn to complete, you
could get a deadlock.
I wonder if it might be more appropriate to use GFP_NOIO as in the
following patch.
I wonder if it might be even more appropriate to cope better with a
kmalloc failure, especially as in this use, sd_sync_cache only will
use the sense information to print out a more informative error
message.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_io_completion() can be a bit noisy about certain conditions.
Previously this wasn't a problem for internally generated commands,
since they never hit it. However, since we do all SCSI commands via
bios, now they do. user CD testers like magicdev are now getting not
ready messages every time they touch the CD to see if there's anything
in it.
Fix this by making all scsi_execute commands REQ_QUIET and making
scsi_finish_io() not say anything for REQ_QUIET.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The new bio code was incorrectly converted from stack allocated to
kmalloc'd buffer handling. There are two places where it incorrectly
uses sizeof(*sense) to get the size of the sense buffer. This
actually produces one, so no sense data was ever getting back, causing
failure in things like disk spin up.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Older gcc's require variable definitions at the beginning of a block.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This one removes struct scsi_request entirely from sd. In the process,
I noticed we have no callers of scsi_wait_req who don't immediately
normalise the sense, so I updated the API to make it take a struct
scsi_sense_hdr instead of simply a big sense buffer.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This one's slightly more difficult. The transport class uses
REQ_FAILFAST, so another interface (scsi_execute) had to be invented to
take the extra flag. Also, the sense functions are shifted around to
allow spi_execute to place data directly into a struct scsi_sense_hdr.
With this change, there's probably a lot of unnecessary sense buffer
allocation going on which we can fix later.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
After this, we just have some drivers, all the ULDs and the SPI
transport class using scsi_wait_req().
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Original From: Mike Christie <michaelc@cs.wisc.edu>
Add scsi_execute_req() as a replacement for scsi_wait_req()
Fixed up various pieces (added REQ_SPECIAL and caught req use after
free)
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Here's the proof of concept for this one. It converts scsi_wait_req to
do correct REQ_BLOCK_PC submission (and works nicely in my setup).
The final goal should be to eliminate struct scsi_request, but that
can't be done until the character submission paths of sg and st are also
modified.
There's some loss of functionality to this: retries are no longer
controllable (except by setting REQ_FASTFAIL) and the wait_req API needs
to be altered, but it looks very nice.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Migrate the current SCSI host state model to a model like SCSI
device is using.
Signed-off-by: Mike Anderson <andmike@us.ibm.com>
Rejections fixed up and
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_init_io calls scsi_alloc_sgtable and then calls blk_rq_map_sg
to initialize the scatterlist structure. blk_rq_map_sg() already
memset the structure for every new segment. That makes the memset
in scsi_alloc_sgtable unnecessary.
Patch to delete the extra memset in scsi_alloc_sgtable. Tested on
a x86_64 machine. Looks stable to me.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The check in
627 BUG_ON(index > SG_MEMPOOL_NR);
with SG_MEMPOOL_NR defined in
32 #define SG_MEMPOOL_NR (sizeof(scsi_sg_pools)/sizeof(struct scsi_host_sg_pool))
was not sufficient.
sgp, set in
629 sgp = scsi_sg_pools + index;
is dereferenced in
630 mempool_free(sgl, sgp->pool);
Signed-off-by: Zaur Kambarov <zkambarov@coverity.com>
Cc: <linux-scsi@vger.kernel.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We never look at it except for the old megaraid driver that abuses it
for sending internal commands. That usage can be fixed easily because
those internal commands are single-threaded by a mutex and we can easily
use a completion there.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Another rollup of patches which give various symbols static scope
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
scsi_queue_insert() has four callers. Three callers call with
timer disabled and one (the second invocation in
scsi_dispatch_cmd()) calls with timer activated.
scsi_queue_insert() used to always call scsi_delete_timer()
and ignore the return value. This results in race with timer
expiration. Remove scsi_delete_timer() call from
scsi_queue_insert() and make the caller delete timer and check
the return value.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_queue_insert() used to use blk_insert_request() for requeueing
requests. This depends on the unobvious behavior of
blk_insert_request() setting REQ_SPECIAL and REQ_SOFTBARRIER when
requeueing. This patch makes scsi_queue_insert() use
blk_requeue_request(). As REQ_SPECIAL means special requests and
REQ_SOFTBARRIER is automatically handled by blk layer now, no flag
needs to be set.
Note that scsi_queue_insert() now calls scsi_run_queue() itself, and
the prototype of the function is added right above
scsi_queue_insert(). This is temporary, as later requeue path
consolidation patchset removes scsi_queue_insert(). By adding
temporary prototype, we can do away with unnecessarily moving
functions.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_requeue_request() used to use blk_insert_request() for requeueing
requests. This depends on the unobvious behavior of
blk_insert_request() setting REQ_SPECIAL and REQ_SOFTBARRIER when
requeueing. This patch makes scsi_queue_insert() use
blk_requeue_request(). As REQ_SPECIAL means special requests and
REQ_SOFTBARRIER is automatically handled by blk layer now, no flag
needs to be set.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
blk_insert_request() has a unobivous feature of requeuing a
request setting REQ_SPECIAL|REQ_SOFTBARRIER. SCSI midlayer
was the only user and as previous patches removed the usage,
remove the feature from blk_insert_request(). Only special
requests should be queued with blk_insert_request(). All
requeueing should go through blk_requeue_request().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_init_io() used to set REQ_SPECIAL when it fails sg
allocation before requeueing the request by returning
BLKPREP_DEFER. REQ_SPECIAL is being updated to mean special
requests. So, remove REQ_SPECIAL setting.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_cmnd->serial_number_at_timeout doesn't serve any purpose
anymore. All serial_number == serial_number_at_timeout tests
are always true in abort callbacks. Kill the field. Also, as
->pid always equals ->serial_number and ->serial_number
doesn't have any special meaning anymore, update comments
above ->serial_number accordingly. Once we remove all uses of
this field from all lldd's, this field should go.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_cmnd->internal_timeout field doesn't have any meaning
anymore. Kill the field.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The current problem seen is that the queue lock is actually in the
SCSI device structure, so when that structure is freed on device
release, we go boom if the queue tries to access the lock again.
The fix here is to move the lock from the scsi_device to the queue.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!