Commit Graph

81439 Commits

Author SHA1 Message Date
David Moore
bcee893c6c firewire: fw-ohci: Bug fixes for packet-per-buffer support
This patch corrects a number of bugs in the current OHCI 1.0
packet-per-buffer support:

1. Correctly deal with payloads that cross a page boundary.  The
previous version would not split the descriptor at such a boundary,
potentially corrupting unrelated memory.

2. Allow user-space to specify multiple packets per struct
fw_cdev_iso_packet in the same way that dual-buffer allows.  This is
signaled by header_length being a multiple of header_size.  This
multiple determines the number of packets.  The payload size allocated
per packet is determined by dividing the total payload size by the
number of packets.

3. Make sync support work properly for packet-per-buffer.

I have tested this patch with libdc1394 by forcing my OHCI 1.1
controller to use the packet-per-buffer support instead of dual-buffer.

I would greatly appreciate testing by those who have a DV devices and
other types of iso streamers to make sure I didn't cause any
regressions.

Stefan, with this patch, I'm hoping that libdc1394 will work with all
your OHCI 1.0 controllers now.

The one bit of future work that remains for packet-per-buffer support is
the automatic compaction of short payloads that I discussed with
Kristian.

Signed-off-by: David Moore <dcm@acm.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:23 +01:00
David Moore
0642b6577f firewire: fw-ohci: Fix for dualbuffer three-or-more buffers
This patch fixes the problem where different OHCI 1.1 controllers behave
differently when a received iso packet straddles three or more buffers
when using the dual-buffer receive mode.  Two changes are made in order
to handle this situation:

1. The packet sync DMA descriptor is given a non-zero header length and
non-zero payload length.  This is because zero-payload descriptors are
not discussed in the OHCI 1.1 specs and their behavior is thus
undefined.  Instead we use a header size just large enough for a single
header and a payload length of 4 bytes for this first descriptor.

2. As we process received packets in the context's tasklet, read the
packet length out of the headers.  Keep track of the running total of
the packet length as "excess_bytes", so we can ignore any descriptors
where no packet starts or ends.  These descriptors may not have had
their first_res_count or second_res_count fields updated by the
controller so we cannot rely on those values.

The main drawback of this patch is that the excess_bytes value might get
"out of sync" with the packet descriptors if something strange happens
to the DMA program.  I'm not if such a thing could ever happen, but I
appreciate any suggestions in making it more robust.

Also, the packet-per-buffer support may need a similar fix to deal with
issue 1, but I haven't done any work on that yet.

Stefan, I'm hoping that with this patch, all your OHCI 1.1 controllers
will work properly with an unmodified version of libdc1394.

Signed-off-by: David Moore <dcm@acm.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:23 +01:00
Stefan Richter
4b11ea96a0 firewire: fw-sbp2: remove unused misleading macro
SBP2_MAX_SECTORS is nowhere used in fw-sbp2.
It merely got copied over from sbp2 where it played a role in the past.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:22 +01:00
Stefan Richter
b7811da2d9 firewire: fw-sbp2: prepare for s/g chaining
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:22 +01:00
Stefan Richter
285838eb22 firewire: fw-sbp2: refactor workq and kref handling
This somewhat reduces the size of firewire-sbp2.ko.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:21 +01:00
Stefan Richter
85c5798b09 ieee1394: ohci1394: don't schedule IT tasklets on IR events
Bug noted by Pieter Palmers:  Isochronous transmit tasklets were
scheduled on isochronous receive events, in addition to the proper
isochronous receive tasklets.

http://marc.info/?l=linux1394-devel&m=119783196222802

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:21 +01:00
Stefan Richter
4e6343a10b ieee1394: sbp2: raise default transfer size limit
This patch speeds up sbp2 a little bit --- but more importantly, it
brings the behavior of sbp2 and fw-sbp2 closer to each other.  Like
fw-sbp2, sbp2 now does not limit the size of single transfers to 255
sectors anymore, unless told so by a blacklist flag or by module load
parameters.

Only very old bridge chips have been known to need the 255 sectors
limit, and we have got one such chip in our hardwired blacklist.  There
certainly is a danger that more bridges need that limit; but I prefer to
have this issue present in both fw-sbp2 and sbp2 rather than just one of
them.

An OXUF922 with 400GB 7200RPM disk on an S400 controller is sped up by
this patch from 22.9 to 23.5 MB/s according to hdparm.  The same effect
could be achieved before by setting a higher max_sectors module
parameter.  On buses which use 1394b beta mode, sbp2 and fw-sbp2 will
now achieve virtually the same bandwidth.  Fw-sbp2 only remains faster
on 1394a buses due to fw-core's gap count optimization.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:21 +01:00
Stefan Richter
3e75b493fb ieee1394: remove unused code
The code has been in "#if 0 - #endif" since Linux 2.6.12.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:20 +01:00
Stefan Richter
c7ea990f87 ieee1394: small cleanup after "nopage"
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:20 +01:00
Nick Piggin
61db81214b ieee1394: nopage
Convert ieee1394 from nopage to fault.
Remove redundant vma range checks (correct resource range check is retained).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:19 +01:00
Joe Perches
a5c52df8bc ieee1394: Add missing "space"
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:19 +01:00
Stefan Richter
825f1df545 ieee1394: sbp2: s/g list access cosmetics
Replace sg->length by sg_dma_len(sg).  Rename a variable for shorter
line lengths and eliminate some superfluous local variables.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:19 +01:00
Stefan Richter
8c4ac0949f ieee1394: sbp2: prepare for s/g chaining
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2008-01-30 22:22:18 +01:00
James Bottomley
203a512f09 [SCSI] Revert "[SCSI] aacraid: fib context lock for management ioctls"
This reverts commit a119ee8ee3.

Adaptec found this was causing system lockups.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:14:26 -06:00
James Bottomley
40f620286d [SCSI] bsg: copy the cmd_type field to the subordinate request for bidi
This fixes a problem in SCSI where we use the (previously
uninitialised) cmd_type via blk_pc_request() to set up the transfer in
scsi_init_sgtable().

Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:14:26 -06:00
FUJITA Tomonori
3d9dd6eef8 [SCSI] handle scsi_init_queue failure properly
scsi_init_queue is expected to clean up allocated things when it
fails.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:14:25 -06:00
FUJITA Tomonori
b172b6e99e [SCSI] destroy scsi_bidi_sdb_cache in scsi_exit_queue
Needs to call kmem_cache_destroy for scsi_bidi_sdb_cache in
scsi_exit_queue.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:14:25 -06:00
FUJITA Tomonori
c639d14e2f [SCSI] scsi_debug: add XDWRITEREAD_10 support
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:14:25 -06:00
FUJITA Tomonori
072d0bb3ce [SCSI] scsi_debug: add bidi data transfer support
This enables fill_from_dev_buffer and fetch_to_dev_buffer to handle
bidi commands.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:14:24 -06:00
FUJITA Tomonori
3de9f94479 [SCSI] scsi_debug: add get_data_transfer_info helper function
This adds get_data_transfer_info helper function that get lha and
sectors for READ_* and WRITE_* commands (and XDWRITEREAD_10 later).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:14:24 -06:00
James Bottomley
d3f46f39b7 [SCSI] remove use_sg_chaining
With the sg table code, every SCSI driver is now either chain capable
or broken (or has sg_tablesize set so chaining is never activated), so
there's no need to have a check in the host template.

Also tidy up the code by moving the scatterlist size defines into the
SCSI includes and permit the last entry of the scatterlist pools not
to be a power of two.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:14:02 -06:00
Kiyoshi Ueda
b8de163184 [SCSI] bidirectional: fix up for the new blk_end_request code
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:41 -06:00
Boaz Harrosh
6f9a35e2da [SCSI] bidirectional command support
At the block level bidi request uses req->next_rq pointer for a second
bidi_read request.
At Scsi-midlayer a second scsi_data_buffer structure is used for the
bidi_read part. This bidi scsi_data_buffer is put on
request->next_rq->special. Struct scsi_cmnd is not changed.

- Define scsi_bidi_cmnd() to return true if it is a bidi request and a
  second sgtable was allocated.

- Define scsi_in()/scsi_out() to return the in or out scsi_data_buffer
  from this command This API is to isolate users from the mechanics of
  bidi.

- Define scsi_end_bidi_request() to do what scsi_end_request() does but
  for a bidi request. This is necessary because bidi commands are a bit
  tricky here. (See comments in body)

- scsi_release_buffers() will also release the bidi_read scsi_data_buffer

- scsi_io_completion() on bidi commands will now call
  scsi_end_bidi_request() and return.

- The previous work done in scsi_init_io() is now done in a new
  scsi_init_sgtable() (which is 99% identical to old scsi_init_io())
  The new scsi_init_io() will call the above twice if needed also for
  the bidi_read command. Only at this point is a command bidi.

- In scsi_error.c at scsi_eh_prep/restore_cmnd() make sure bidi-lld is not
  confused by a get-sense command that looks like bidi. This is done
  by puting NULL at request->next_rq, and restoring.

[jejb: update to sg_table and resolve conflicts
also update to blk-end-request and resolve conflicts]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:41 -06:00
Boaz Harrosh
30b0c37b27 [SCSI] implement scsi_data_buffer
In preparation for bidi we abstract all IO members of scsi_cmnd,
that will need to duplicate, into a substructure.

- Group all IO members of scsi_cmnd into a scsi_data_buffer
  structure.
- Adjust accessors to new members.
- scsi_{alloc,free}_sgtable receive a scsi_data_buffer instead of
  scsi_cmnd. And work on it.
- Adjust scsi_init_io() and  scsi_release_buffers() for above
  change.
- Fix other parts of scsi_lib/scsi.c to members migration. Use
  accessors where appropriate.

- fix Documentation about scsi_cmnd in scsi_host.h

- scsi_error.c
  * Changed needed members of struct scsi_eh_save.
  * Careful considerations in scsi_eh_prep/restore_cmnd.

- sd.c and sr.c
  * sd and sr would adjust IO size to align on device's block
    size so code needs to change once we move to scsi_data_buff
    implementation.
  * Convert code to use scsi_for_each_sg
  * Use data accessors where appropriate.

- tgt: convert libsrp to use scsi_data_buffer

- isd200: This driver still bangs on scsi_cmnd IO members,
  so need changing

[jejb: rebased on top of sg_table patches fixed up conflicts
and used the synergy to eliminate use_sg and sg_count]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:40 -06:00
Boaz Harrosh
bb52d82f45 [SCSI] tgt: use scsi_init_io instead of scsi_alloc_sgtable
If we export scsi_init_io()/scsi_release_buffers() instead of
scsi_{alloc,free}_sgtable() from scsi_lib than tgt code is much more
insulated from scsi_lib changes. As a bonus it will also gain bidi
capability when it comes.

[jejb: rebase on to sg_table and fix up rejections]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:40 -06:00
FUJITA Tomonori
03e7925d07 [SCSI] aic7xxx: fix warnings with CONFIG_PM disabled
CC [M]  drivers/scsi/aic7xxx/aic7xxx_osm_pci.o
drivers/scsi/aic7xxx/aic7xxx_osm_pci.c:148: warning: 'ahc_linux_pci_dev_suspend' defined but not used
drivers/scsi/aic7xxx/aic7xxx_osm_pci.c:166: warning: 'ahc_linux_pci_dev_resume' defined but not used

This moves aic7xxx_pci_driver struct, removes some forward declarations,
and adds some ifdef CONFIG_PM.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:40 -06:00
FUJITA Tomonori
67eb63364e [SCSI] aic79xx: fix warnings with CONFIG_PM disabled
CC [M]  drivers/scsi/aic7xxx/aic79xx_osm_pci.o
drivers/scsi/aic7xxx/aic79xx_osm_pci.c:101: warning: 'ahd_linux_pci_dev_suspend' defined but not used
drivers/scsi/aic7xxx/aic79xx_osm_pci.c:121: warning: 'ahd_linux_pci_dev_resume' defined but not used

This moves aic79xx_pci_driver struct, removes some forward
declarations, and adds some ifdef CONFIG_PM.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:40 -06:00
David Milburn
969ceffb66 [SCSI] aic7xxx: fix ahc_done check SCB_ACTIVE for tagged transactions
The driver only needs to check the SCB_ACTIVE flag if the SCB is not
in the untagged queue.

If the driver is in error recovery, you may end panic'ing on a TUR
that is in the untagged queue.

Attempting to queue an ABORT message
CDB: 0x0 0x0 0x0 0x0 0x0 0x0
SCB 3 done'd twice

This patch is included in Adaptec's 6.3.11 driver on their website.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:39 -06:00
Thomas Bogendoerfer
2adbfa333a [SCSI] sgiwd93: use cached memory access to make driver work on IP28
SGI IP28 machines would need special treatment (enable adding addtional
wait states) when accessing memory uncached. To avoid this pain I
changed the driver to use only cached access to memory.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:39 -06:00
FUJITA Tomonori
9d058ecfd4 [SCSI] zfcp: fix sense_buffer access bug
The commit de25deb180 changed
scsi_cmnd.sense_buffer from a static array to a dynamically allocated
buffer. We can't access to sense_buffer in '&cmd->sense_buffer' way.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:39 -06:00
FUJITA Tomonori
149d6bafc4 [SCSI] ncr53c8xx: fix sense_buffer access bug
The commit de25deb180 changed
scsi_cmnd.sense_buffer from a static array to a dynamically allocated
buffer. We can't access to sense_buffer in '&cmd->sense_buffer' way.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:39 -06:00
FUJITA Tomonori
c1c9ce52c8 [SCSI] aic79xx: fix sense_buffer access bug
The commit de25deb180 changed
scsi_cmnd.sense_buffer from a static array to a dynamically allocated
buffer. We can't access to sense_buffer in '&cmd->sense_buffer' way.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:39 -06:00
FUJITA Tomonori
c372f4a82f [SCSI] hptiop: fix sense_buffer access bug
&cmnd->sense_buffer now zeroes the wrong thing.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:38 -06:00
Nathan Lynch
de15c2017c [SCSI] sym53c8xx: fix bad memset argument in sym_set_cam_result_error
On a big powerpc box I got the following oops with 2.6.24-git2:

sym0: <1010-66> rev 0x1 at pci 0000:d0:01.0 irq 215
sym0: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym0: SCSI BUS has been reset.
scsi0 : sym-2.2.3
 target0:0:8: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 31)
scsi 0:0:8:0: Direct-Access     IBM      ST318305LC       C509 PQ: 0
ANSI: 3
 target0:0:8: tagged command queuing enabled, command queue depth 16.
 target0:0:8: Beginning Domain Validation
 target0:0:8: asynchronous
 target0:0:8: wide asynchronous
 target0:0:8: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31)
 target0:0:8: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 31)
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc000000000038460
cpu 0x25: Vector: 300 (Data Access) at [c00000000f567840]
    pc: c000000000038460: .memcpy+0x60/0x280
    lr: d000000000050280: .sym_set_cam_result_error+0xfc/0x1e0 [sym53c8xx]
    sp: c00000000f567ac0
   msr: 8000000000009032
   dar: 0
 dsisr: 42000000
  current = 0xc000006d1e0af0a0
  paca    = 0xc0000000004afc00
    pid   = 0, comm = swapper
enter ? for help
[link register   ] d000000000050280
.sym_set_cam_result_error+0xfc/0x1e0 [sym53c8xx]
[c00000000f567ac0] c00000000f567b80 (unreliable)
[c00000000f567b80] d0000000000552b8 .sym_complete_error+0x12c/0x1bc [sym53c8xx]
[c00000000f567c20] d0000000000561a4 .sym_int_sir+0xaa4/0x1718 [sym53c8xx]
[c00000000f567d00] d000000000057e8c .sym_interrupt+0x4e4/0x6ec [sym53c8xx]
[c00000000f567dc0] d00000000004fdf4 .sym53c8xx_intr+0x6c/0xdc [sym53c8xx]
[c00000000f567e50] c0000000000a83e0 .handle_IRQ_event+0x7c/0xec
[c00000000f567ef0] c0000000000aa344 .handle_fasteoi_irq+0x130/0x1f0
[c00000000f567f90] c00000000002a538 .call_handle_irq+0x1c/0x2c
[c000004d5e0b3a90] c00000000000c320 .do_IRQ+0x108/0x1d0
[c000004d5e0b3b20] c000000000004790 hardware_interrupt_entry+0x18/0x1c

The memset() in sym_set_cam_result_error() would appear to be trashing
the scsi_cmnd struct instead of clearing sense_buffer.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-30 13:03:38 -06:00
Denis Cheng
0fe410d3f3 dlm: static initialization improvements
also change name_prefix from char pointer to char array.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:43 -06:00
David Teigland
dbcfc34733 dlm: clean ups
A couple small clean-ups.  Remove unnecessary wrapper-functions in
rcom.c, and remove unnecessary casting and an unnecessary ASSERT in
util.c.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:43 -06:00
Patrick Caulfeld
2a79289e87 dlm: Sanity check namelen before copying it
The 32/64 compatibility code in the DLM does not check the validity of
the lock name length passed into it, so it can easily overwrite memory
if the value is rubbish (as early versions of libdlm can cause with
unlock calls, it doesn't zero the field).

This patch restricts the length of the name to the amount of data
actually passed into the call.

Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:43 -06:00
David Teigland
85f0379aa0 dlm: keep cached master rsbs during recovery
To prevent the master of an rsb from changing rapidly, an unused rsb is kept
on the "toss list" for a period of time to be reused.  The toss list was
being cleared completely for each recovery, which is unnecessary.  Much of
the benefit of the toss list can be maintained if nodes keep rsb's in their
toss list that they are the master of.  These rsb's need to be included
when the resource directory is rebuilt during recovery.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:43 -06:00
David Teigland
594199ebaa dlm: change error message to debug
The invalid lockspace messages are normal and can appear relatively
often.  They should be suppressed without debugging enabled.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:43 -06:00
David Teigland
ce5246b972 dlm: fix possible use-after-free
The dlm_put_lkb() can free the lkb and its associated ua structure,
so we can't depend on using the ua struct after the put.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:42 -06:00
David Teigland
755b5eb8ba dlm: limit dir lookup loop
In a rare case we may need to repeat a local resource directory lookup
due to a race with removing the rsb and removing the resdir record.
We'll never need to do more than a single additional lookup, though,
so the infinite loop around the lookup can be removed.  In addition
to being unnecessary, the infinite loop is dangerous since some other
unknown condition may appear causing the loop to never break.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:42 -06:00
David Teigland
42dc1601a9 dlm: reject normal unlock when lock is waiting for lookup
Non-forced unlocks should be rejected if the lock is waiting on the
rsb_lookup list for another lock to establish the master node.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:42 -06:00
David Teigland
c54e04b00f dlm: validate messages before processing
There was some hit and miss validation of messages that has now been
cleaned up and unified.  Before processing a message, the new
validate_message() function checks that the lkb is the appropriate type,
process-copy or master-copy, and that the message is from the correct
nodeid for the the given lkb.  Other checks and assertions on the
lkb type and nodeid have been removed.  The assertions were particularly
bad since they would panic the machine instead of just ignoring the bad
message.

Although other recent patches have made processing old message unlikely,
it still may be possible for an old message to be processed and caught
by these checks.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:42 -06:00
David Teigland
46b43eed70 dlm: reject messages from non-members
Messages from nodes that are no longer members of the lockspace should be
ignored.  When nodes are removed from the lockspace, recovery can
sometimes complete quickly enough that messages arrive from a removed node
after recovery has completed.  When processed, these messages would often
cause an error message, and could in some cases change some state, causing
problems.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:42 -06:00
David Teigland
aec64e1be2 dlm: another call to confirm_master in receive_request_reply
When a failed request (EBADR or ENOTBLK) is unlocked/canceled instead of
retried, there may be other lkb's waiting on the rsb_lookup list for it
to complete.  A call to confirm_master() is needed to move on to the next
waiting lkb since the current one won't be retried.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:42 -06:00
David Teigland
601342ce02 dlm: recover locks waiting for overlap replies
When recovery looks at locks waiting for replies, it fails to consider
locks that have already received a reply for their first remote operation,
but not received a reply for secondary, overlapping unlock/cancel.  The
appropriate stub reply needs to be called for these waiters.

Appears when we start doing recovery in the presence of a many overlapping
unlock/cancel ops.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:42 -06:00
David Teigland
8a358ca8e7 dlm: clear ast_type when removing from astqueue
The lkb_ast_type field indicates whether the lkb is on the astqueue list.
When clearing locks for a process, lkb's were being removed from the astqueue
list without clearing the field.  If release_lockspace then happened
immediately afterward, it could try to remove the lkb from the list a second
time.

Appears when process calls libdlm dlm_release_lockspace() which first
closes the ls dev triggering clear_proc_locks, and then removes the ls
(a write to control dev) causing release_lockspace().

Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:42 -06:00
David Teigland
861e2369e9 dlm: use fixed errno values in messages
Some errno values differ across platforms. So if we return things like
-EINPROGRESS from one node it can get misinterpreted or rejected on
another one.

This patch fixes up the errno values passed on the wire so that they
match the x86 ones (so as not to break the protocol), and re-instates
the platform-specific ones at the other end.

Many thanks to Fabio for testing this patch.
Initial patch from Patrick.

Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:42 -06:00
Fabio M. Di Nitto
550283e30c dlm: swap bytes for rcom lock reply
DLM_RCOM_LOCK_REPLY messages need byte swapping.

Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:41 -06:00
Fabio M. Di Nitto
e7847d35ac dlm: align midcomms message buffer
gcc does not guarantee that an auto buffer is 64bit aligned.
This change allows sparc64 to work.

Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2008-01-30 11:04:25 -06:00