Commit Graph

6 Commits

Author SHA1 Message Date
Mike Marciniszyn
c0af2c057d IB/qib: Prevent double completions after a timeout or RNR error
There is a double completion associated with error handling for RC QPs.

The sequence is:

 - The do_rc_ack() routine fields an RNR nack and there are 0
   rnr_retries configured on the QP.
 - qib_error_qp() stops the pending timer
 - qib_rc_send_complete() is called from sdma_complete()
 - qib_rc_send_complete() starts the timer because the msb of the psn
   just completed says an ack is needed.
 - a bunch of flushes occur as ipoib posts WQEs to an error'ed QP
 - rc_timeout() calls qib_restart_rc()
 - qib_restart_rc() calls qib_send_complete() with a
   IB_WC_RETRY_EXC_ERR on a wqe that has already been completed in the
   past

The fix avoids starting the timer since another packet will never
arrive.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-02-17 14:04:50 -08:00
Mike Marciniszyn
414ed90cee IB/qib: Fix double add_timer()
The following panic BUG_ON occurs during qib testing:

    Kernel BUG at include/linux/timer.h:82

    RIP  [<ffffffff881f7109>] :ib_qib:start_timer+0x73/0x89
     RSP <ffffffff80425bd0>
     <0>Kernel panic - not syncing: Fatal exception
     <0>Dumping qib trace buffer from panic
    qib_set_lid INFO: IB0:1 got a lid: 0xf8
    Done dumping qib trace buffer
    BUG: warning at kernel/panic.c:137/panic() (Tainted: G

The flaw is due to a missing state test when processing responses that
results in an add_timer() call when the same timer is already queued.
This code was executing in parallel with a QP destroy on another CPU
that had changed the state to reset, but the missing test caused to
response handling code to run on into the panic.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-02-10 11:24:08 -08:00
Mike Marciniszyn
dd04e43d46 IB/qib: Unnecessary delayed completions on RC connection
Currently on receipt of a response message (ACKs, RDMA Response,
Atomic Responses etc.) if the SDMA completion counter is not advanced
the driver delays the completion of the WQE.  In most cases this is
overly pessimistic as the response (ACK) to a previously transmitted
send implies that the send is complete.  Ensure that SDMA queue is
progressed appropriately before determining if a send has delayed
completions.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2011-01-10 17:42:22 -08:00
Jason Gunthorpe
5715f5d44b IB/qib: Process RDMA WRITE ONLY with IMMEDIATE properly
See table 35 in IBA - the header order for RDMA_WRITE_ONLY_WITH_IMMEDIATE
and SEND_LAST_WITH_IMMEDIATE is different: the RDMA_WRITE_ONLY has
a RETH header before the immediate data, so we need a different code path
to extract the immediate data.

I tested this with a userspace app that does RDMA_WRITE with immediate
on a QLE7140.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-22 22:12:15 -07:00
Ralph Campbell
a5210c12b7 IB/qib: Fix race between qib_error_qp() and receive packet processing
When transitioning a QP to the error state, in progress RWQEs need to
be marked complete.  This also involves releasing the reference count
to the memory regions referenced in the SGEs.  The locking in the
receive packet processing wasn't sufficient to prevent qib_error_qp()
from modifying the r_sge state at the same time, thus leading to
kernel panics.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-03 13:59:47 -07:00
Ralph Campbell
f931551baf IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters
Add a low-level IB driver for QLogic PCIe adapters.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-23 21:44:54 -07:00