The current dc395x driver uses PIO to transfer up to 4 bytes which do not
get transferred by DMA (under unclear circumstances). For this the driver
uses page_address() which is broken on highmem. Apart from this the
actual calculation of the virtual address is wrong (even without highmem).
So, e.g., for reading it reads bytes from the driver to a wrong address
and returns wrong data, I guess, for writing it would just output random
data to the device.
The proper fix, as suggested by many, is to dynamically map data using
kmap_atomic(page, KM_BIO_SRC_IRQ) / kunmap_atomic(virt). The reason why it
has not been done until now, although I've done some preliminary patches
more than a year ago was that nobody interested in fixing this problem was
able to reliably reproduce it. Now it changed - with the help from
Sebastian Frei (CC'ed) I was able to trigger the PIO path. Thus, I was
also able to test and debug it.
There are 4 cases when PIO is used in dc395x - data-in / -out with and
without scatter-gather. I was able to reproduce and test only data-in with
and without SG. So, the data-out path is still untested, but it is also
somewhat simpler than the data-in. Fredrik Roubert (also CC'ed) also had
PIO triggering on his system, and in his case it was data-out without SG.
It would be great if he could test the attached patch on his system, but
even if he cannot, I would still request to apply the patch and just wait
if anybody cries...
Implementation: I put 2 new functions in scsi_lib.c and their declarations
in scsi_cmnd.h. I exported them without _GPL, although, I don't feel
strongly about that - not many drivers are likely to use them. But there
is at least one more - I want to use them in tmscsim.c. Whether these are
the right files for the functions and their declarations - not sure
either. Actually, they are not scsi-specific, so, might go somewhere
around other scattergather magic? They are not platform specific either,
and most SG functions are defined under arch/*/... As these issues were
discussed previously there were some more routines suggested to manipulate
scattergather buffers, I think, some of them were needed around
crypto code... So, might be a common place reasonable, like
lib/scattergather.c? I am open here.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_kill_request() completes requests via normal SCSI completion path
which decrements busy counts; however, requests which get passed to
scsi_kill_request() aren't holding busy counts and scsi_kill_request()
don't increment them before invoking completion path resulting in
incorrect busy counts. Bump up busy counts before invoking completion
path.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Modify well over a dozen mempool users to call mempool_create_slab_pool()
rather than calling mempool_create() with extra arguments, saving about 30
lines of code and increasing readability.
Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In order to use the new execute_in_process_context() API, you have to
provide it with the work storage, which I do in SCSI in scsi_device and
scsi_target, but which also means that we can no longer queue up the
target reaps, so instead I moved the target to a state model which
allows target_alloc to detect if we've received a dying target and wait
for it to be gone. Hopefully, this should also solve the target
namespace race.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Regardless what mode page was asked for, Initio INIC-14x0 and
INIC-2430 always return page 6 without mode page headers. Try to
recognise this as a special case in scsi_mode_sense and setting the
mode sense headers accordingly.
Signed-off-by: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Change the core SCSI code to use kzalloc rather than kmalloc+memset
where possible.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix up an off by one error in calculating retries for scsi
commands. This bug was discovered when an SG_IO request
was sent to scsi core with retries = 0, causing the overall
timeout check to go off in scsi_softirq_done.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We have several points in the SCSI stack (primarily for our device
functions) where we need to guarantee process context, but (given the
place where the last reference was released) we cannot guarantee this.
This API gets around the issue by executing the function directly if
the caller has process context, but scheduling a workqueue to execute
in process context if the caller doesn't have it. Unfortunately, it
requires memory allocation in interrupt context, but it's better than
what we have previously. The true solution will require a bit of
re-engineering, so isn't appropriate for 2.6.16.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When the scsi_execute_async interface was added it ended up reducing
the flexibility of userspace to send arbitrary scsi commands through
sg using SG_IO. The SG_IO interface allows userspace to specify the
CDB length. This is now ignored in scsi_execute_async and it is
guessed using the COMMAND_SIZE macro, which is not always correct,
particularly for vendor specific commands. This patch adds a cmd_len
parameter to the scsi_execute_async interface to allow the caller
to specify the length of the CDB.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
LLDDs should never see REQ_BLOCK_PC requests, we can handle them just
fine in the core code. There is a small behaviour change in that some
check in sr's rw_intr are bypassed, but I consider the old behaviour
a bug.
Mike found this cleanup opportunity and provdided early patches, so all
the credit goes to him, even if I redid the patches from scratch beause
that was easier than forward-porting the old patches.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch moves the SCSI softirq handling to the block layer version.
There should be no functional changes.
Signed-off-by: Jens Axboe <axboe@suse.de>
All ordered request related stuff delegated to HLD. Midlayer
now doens't deal with ordered setting or prepare_flush
callback. sd.c updated to deal with blk_queue_ordered
setting. Currently, ordered tag isn't used as SCSI midlayer
cannot guarantee request ordering.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
add @uptodate argument to end_that_request_last() and @error
to rq_end_io_fn(). there's no generic way to pass error code
to request completion function, making generic error handling
of non-fs request difficult (rq->errors is driver-specific and
each driver uses it differently). this patch adds @uptodate
to end_that_request_last() and @error to rq_end_io_fn().
for fs requests, this doesn't really matter, so just using the
same uptodate argument used in the last call to
end_that_request_first() should suffice. imho, this can also
help the generic command-carrying request jens is working on.
Signed-off-by: tejun heo <htejun@gmail.com>
Signed-Off-By: Jens Axboe <axboe@suse.de>
- export __blk_put_request and blk_execute_rq_nowait
needed for async REQ_BLOCK_PC requests
- seperate max_hw_sectors and max_sectors for block/scsi_ioctl.c and
SG_IO bio.c helpers per Jens's last comments. Since block/scsi_ioctl.c SG_IO was
already testing against max_sectors and SCSI-ml was setting max_sectors and
max_hw_sectors to the same value this does not change any scsi SG_IO behavior. It only
prepares ll_rw_blk.c, scsi_ioctl.c and bio.c for when SCSI-ml begins to set
a valid max_hw_sectors for all LLDs. Today if a LLD does not set it
SCSI-ml sets it to a safe default and some LLDs set it to a artificial low
value to overcome memory and feedback issues.
Note: Since we now cap max_sectors to BLK_DEF_MAX_SECTORS, which is 1024,
drivers that used to call blk_queue_max_sectors with a large value of
max_sectors will now see the fs requests capped to BLK_DEF_MAX_SECTORS.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add kmemcache of scsi io contexts.
In the future when we finalize on where these functions will live
we can add a mempool for it and do a bioset for out REQ_BLOCK_PC
bios. This is needed becuase the dm-multipath handlers will
want to use the scsi_exectute* functions for failover and we cannot
have them and the bio device allocating from the same mempool.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
sd does not allow scsi_io_completion to retry commands for
SG_IO requests, and it make sense that it should not happen for st
SG_IO commands too. If for st we hit the bottom of scsi_io_completion
we will probably screw things up pretty bad. This patch returns to the
block layer that the whole command completed and relies on the caller to check
the request errors field. For initialization commands like in sd, this adds
the previous behavior where scsi_io_completion did not process the error.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
For tape we need to control the retries. This patch adds a retries
counter on the request for REQ_BLOCK_PC commands originating from
scsi_execute* to use. REQ_BLOCK_PC commands comming from the block
layer SG_IO path continue to use the retires set in the ULD init_command.
(scsi_execute* does not set the gendisk so we do not execute
the init_command in that path).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add scsi helpers to create really-large-requests and convert
scsi-ml to scsi_execute_async().
Per Jens's previous comments, I placed this function in scsi_lib.c.
I made it follow all the queue's limits - I think I did at least :), so
I removed the warning on the function header.
I think the scsi_execute_* functions should eventually take a request_queue
and be placed some place where the dm-multipath hw_handler can use them
if that failover code is going to stay in the kernel. That conversion
patch will be sent in another mail though.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This reverts commit 1b0997f561, which in
turn reverted 34ea80ec6a (which is thus
re-instated).
Quoth James Bottomley:
"All it's doing is deferring the device_put() from the
scsi_put_command() to after the scsi_run_queue(), which doesn't fix
the sleep while atomic problem of the device release method. In both
cases we still get the semaphore in atomic context problem which is
caused by scsi_reap_target() doing a device_del(), which I assumed
(wrongly) was valid from atomic context."
who also promised to fix scsi_reap_target().
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This reverts commit 34ea80ec6a.
It does a put_device() from softirq context, which is bad since it gets
a semaphore for reading.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The problem is that scsi_run_queue is called from scsi_next_command()
after doing a scsi_put_command. If the command was the only thing
holding the reference on the scsi_device then the resulting device put
will tear down the block queue. Fix this by taking a reference to the
device and holding it around scsi_run_queue()
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This function has been superceeded by the block request based interfaces
and is unused (except for the uncompilable cpqfc driver).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This should eliminate (at least in the mid layer) to make numeric
assumptions about any of the enumeration variables. As a side effect,
it will also make all the messages consistent and line us up nicely for
the error logging strategy (if it ever shows itself again).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When a request is deferred in scsi_init_io because the sg table could not
be allocated, the associated scsi_cmnd is not released and the request is
not marked with REQ_DONTPREP. When the command is retried, if
scsi_prep_fn decides to kill it then the scsi_cmnd will never be released.
This patch (as573) changes scsi_init_io so that it calls scsi_put_command
before deferring a request.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We fix the oops by enforcing the host state model. There have also
been two extra states added: SHOST_CANCEL_RECOVERY and
SHOST_DEL_RECOVERY so we can take the model through host removal while
the recovery thread is active.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
I found one other thing that needs to be fixed. The call to
scsi_release_buffers in scsi_unprep_request causes an oops, because the
sgtable has already been freed in scsi_io_completion. The following patch
is needed.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
On Wed, 2005-09-14 at 18:06 +1000, Anton Blanchard wrote:
> And in particular it looks like the scsi_unprep_request in
> scsi_queue_insert is causing it. The following patch fixes the boot
> problems on the vscsi machine:
OK, my fault. Your fix is almost correct .. I was going to do this
eventually, honest, because there's no need to unprep and reprep a
command that comes in through scsi_queue_insert().
However, I decided to leave it in to exercise the scsi_unprep_request()
path just to make sure it was working. What's happening, I think, is
that we also use this path for retries. Since we kill and reget the
command each time, the retries decrement is never seen, so we're
retrying forever.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This fixes an issue in scsi command initialization from a request
where sd, sr, st, and scsi_lib all fail to copy the request's
cmd_len to the scsi command's cmd_len field.
Signed-off-by: Timothy Thelin <timothy.thelin@wdc.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
set DID_NO_CONNECT for the BLKPREP_KILL case and correct a few
BLKPREP_DEFER cases that weren't checking for the need to plug the
queue.
Signed-Off-By: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Actually, just one problem and one cosmetic fix:
1) We need to dequeue for the loop and kill case (it seems easiest
simply to dequeue in the scsi_kill_request() routine)
2) There's no real need to drop the queue lock. __scsi_done() is lock
agnostic, so since there's no requirement, let's just leave it in to
avoid any locking issues.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
From: Alan Stern <stern@rowland.harvard.edu>
This patch (as559b) adds a new routine, scsi_unprep_request, which
gets called every place a request is requeued. (That includes
scsi_queue_insert as well as scsi_requeue_command.) It also changes
scsi_kill_requests to make it call __scsi_done with result equal to
DID_NO_CONNECT << 16. (I'm not sure if it's necessary to call
scsi_init_cmd_errh here; maybe you can check on that.) Finally, the
patch changes the return value from scsi_end_request, to avoid
returning a stale pointer in the case where the request was requeued.
Fortunately the return value is used in only place, and the change
actually simplified it.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Rejections fixed up and
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If a filesystem, while writing out data, decides that it is good
to issue a cache flush on a SCSI drive (or other 'sd' device), it will
call blkdev_issue_flush which calls ->issue_flush_fn which is
scsi_issue_flush_fn.
This calls sd_issue_flush which calls sd_sync_cache, which calls
scsi_execute_request.
This will (as sshdr != NULL) call
kmalloc(SCSI_SENSE_BUFFERSIZE, GFP_KERNEL)
If memory is tight, the presence of GFP_KERNEL may cause write
requests to be sent to some filesystem to free up memory, however if
that filesystem is waiting for the issue_flush_fn to complete, you
could get a deadlock.
I wonder if it might be more appropriate to use GFP_NOIO as in the
following patch.
I wonder if it might be even more appropriate to cope better with a
kmalloc failure, especially as in this use, sd_sync_cache only will
use the sense information to print out a more informative error
message.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_io_completion() can be a bit noisy about certain conditions.
Previously this wasn't a problem for internally generated commands,
since they never hit it. However, since we do all SCSI commands via
bios, now they do. user CD testers like magicdev are now getting not
ready messages every time they touch the CD to see if there's anything
in it.
Fix this by making all scsi_execute commands REQ_QUIET and making
scsi_finish_io() not say anything for REQ_QUIET.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The new bio code was incorrectly converted from stack allocated to
kmalloc'd buffer handling. There are two places where it incorrectly
uses sizeof(*sense) to get the size of the sense buffer. This
actually produces one, so no sense data was ever getting back, causing
failure in things like disk spin up.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Older gcc's require variable definitions at the beginning of a block.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This one removes struct scsi_request entirely from sd. In the process,
I noticed we have no callers of scsi_wait_req who don't immediately
normalise the sense, so I updated the API to make it take a struct
scsi_sense_hdr instead of simply a big sense buffer.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This one's slightly more difficult. The transport class uses
REQ_FAILFAST, so another interface (scsi_execute) had to be invented to
take the extra flag. Also, the sense functions are shifted around to
allow spi_execute to place data directly into a struct scsi_sense_hdr.
With this change, there's probably a lot of unnecessary sense buffer
allocation going on which we can fix later.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
After this, we just have some drivers, all the ULDs and the SPI
transport class using scsi_wait_req().
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Original From: Mike Christie <michaelc@cs.wisc.edu>
Add scsi_execute_req() as a replacement for scsi_wait_req()
Fixed up various pieces (added REQ_SPECIAL and caught req use after
free)
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Here's the proof of concept for this one. It converts scsi_wait_req to
do correct REQ_BLOCK_PC submission (and works nicely in my setup).
The final goal should be to eliminate struct scsi_request, but that
can't be done until the character submission paths of sg and st are also
modified.
There's some loss of functionality to this: retries are no longer
controllable (except by setting REQ_FASTFAIL) and the wait_req API needs
to be altered, but it looks very nice.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Migrate the current SCSI host state model to a model like SCSI
device is using.
Signed-off-by: Mike Anderson <andmike@us.ibm.com>
Rejections fixed up and
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>