Commit Graph

467 Commits

Author SHA1 Message Date
Christoph Paasch
1a2c6181c4 tcp: Remove TCPCT
TCPCT uses option-number 253, reserved for experimental use and should
not be used in production environments.
Further, TCPCT does not fully implement RFC 6013.

As a nice side-effect, removing TCPCT increases TCP's performance for
very short flows:

Doing an apache-benchmark with -c 100 -n 100000, sending HTTP-requests
for files of 1KB size.

before this patch:
	average (among 7 runs) of 20845.5 Requests/Second
after:
	average (among 7 runs) of 21403.6 Requests/Second

Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-17 14:35:13 -04:00
Zhouyi Zhou
aaa0c23cb9 Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug
When neighbour table is full, dst_neigh_lookup/dst_neigh_lookup_skb will return
-ENOBUFS which is absolutely non zero, while all the code in kernel which use
above functions assume failure only on zero return which will cause panic. (for
example: : https://bugzilla.kernel.org/show_bug.cgi?id=54731).

This patch corrects above error with smallest changes to kernel source code and
also correct two return value check missing bugs in drivers/infiniband/hw/cxgb4/cm.c

Tested on my x86_64 SMP machine

Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-15 09:06:58 -04:00
Vipul Pandya
9919d5bd01 RDMA/cxgb4: Fix onchip queue support for T5
T5 adapter does not support onchip queue memory. Present logic fails to
allocate QP for T5 and returns an error. Also, if module parameter ocqp_support
is zero then we are unable to allocate QP which should not be the case. Ideally
if ocqp_support parameter is 0 or onchip queue support is disable then host QP
should be allocated before returning an error.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-14 11:35:59 -04:00
Vipul Pandya
3b174d942c RDMA/cxgb4: Bump tcam_full stat and WR reply timeout
Always bump the tcam_full stat. Also, bump wr reply timeout to 30 seconds.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-14 11:35:59 -04:00
Vipul Pandya
0e5eca791c RDMA/cxgb4: Map pbl buffers for dma if using DSGL.
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-14 11:35:59 -04:00
Vipul Pandya
42b6a94990 RDMA/cxgb4: Use DSGLs for fastreg and adapter memory writes for T5.
It enables direct DMA by HW to memory region PBL arrays and fast register PBL
arrays from host memory, vs the T4 way of passing these arrays in the WR itself.
The result is lower latency for memory registration, and larger PBL array
support for fast register operations.

This patch also updates ULP_TX_MEM_WRITE command fields for T5. Ordering bit of
ULP_TX_MEM_WRITE is at bit position 22 in T5 and at 23 in T4.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-14 11:35:59 -04:00
Vipul Pandya
80ccdd6051 RDMA/cxgb4: Add module_params to enable DB FC & Coalescing on T5
Both DB Flow-Control and DB Coalescing are disabled by default on T5

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-14 11:35:58 -04:00
Vipul Pandya
3cbdb928e2 RDMA/cxgb4: Turn off db coalescing when RDMA QPs are in use.
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-14 11:35:58 -04:00
Vipul Pandya
f079af7a11 RDMA/cxgb4: Add Support for Chelsio T5 adapter
Adds support for Chelsio T5 adapter.
Enables T5's Write Combining feature.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-14 11:35:58 -04:00
Tejun Heo
e8d4dd606b IB/cxgb4: convert to idr_alloc()
Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:16 -08:00
Roland Dreier
ef4e359d9b Merge branches 'core', 'cxgb4', 'ipoib', 'iser', 'misc', 'mlx4', 'qib' and 'srp' into for-next 2013-02-26 09:17:56 -08:00
Shani Michaeli
7083e42ee2 IB/core: Add "type 2" memory windows support
This patch enhances the IB core support for Memory Windows (MWs).

MWs allow an application to have better/flexible control over remote
access to memory.

Two types of MWs are supported, with the second type having two flavors:

    Type 1  - associated with PD only
    Type 2A - associated with QPN only
    Type 2B - associated with PD and QPN

Applications can allocate a MW once, and then repeatedly bind the MW
to different ranges in MRs that are associated to the same PD. Type 1
windows are bound through a verb, while type 2 windows are bound by
posting a work request.

The 32-bit memory key is composed of a 24-bit index and an 8-bit
key. The key is changed with each bind, thus allowing more control
over the peer's use of the memory key.

The changes introduced are the following:

* add memory window type enum and a corresponding parameter to ib_alloc_mw.
* type 2 memory window bind work request support.
* create a struct that contains the common part of the bind verb struct
  ibv_mw_bind and the bind work request into a single struct.
* add the ib_inc_rkey helper function to advance the tag part of an rkey.

Consumer interface details:

* new device capability flags IB_DEVICE_MEM_WINDOW_TYPE_2A and
  IB_DEVICE_MEM_WINDOW_TYPE_2B are added to indicate device support
  for these features.

  Devices can set either IB_DEVICE_MEM_WINDOW_TYPE_2A or
  IB_DEVICE_MEM_WINDOW_TYPE_2B if it supports type 2A or type 2B
  memory windows. It can set neither to indicate it doesn't support
  type 2 windows at all.

* modify existing provides and consumers code to the new param of
  ib_alloc_mw and the ib_mw_bind_info structure

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-21 11:51:45 -08:00
Stefan Hasko
b23523d858 RDMA/cxgb4: Fix cast warning
Fix compile warning about cast to pointer from integer of different size.

Signed-off-by: Stefan Hasko <hasko.stevo@gmail.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-15 15:13:43 -08:00
Paul Bolle
710a31102b RDMA/cxgb4: "cookie" can stay in host endianness
Work requests are passed between the host and the firmware with a
"cookie".  This cookie is swapped to big-endian when passed to the
firmware and back to host endianness on return.  This swapping seems
to be implemented incorrectly.  Moreover, the byte swapping triggers
GCC warnings on 32 bit:

    drivers/infiniband/hw/cxgb4/cm.c: In function ‘passive_ofld_conn_reply’:
    drivers/infiniband/hw/cxgb4/cm.c:2803:12: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    drivers/infiniband/hw/cxgb4/cm.c: In function ‘send_fw_pass_open_req’:
    drivers/infiniband/hw/cxgb4/cm.c:2941:16: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
    [...]

But byte swapping isn't needed as the firmware doesn't actually touch
the cookie.  Dropping byte swapping makes the warnings go away too.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 15:55:05 -08:00
Vipul Pandya
ef5d6355ed RDMA/cxgb4: Address sparse warnings
Fixe the following types of sparse warnings
- cast to pointer from integer of different size
- cast from pointer to integer of different size
- incorrect type in assignment (different base types)
- incorrect type in argument 1 (different base types)
- cast from restricted __be64
- cast from restricted __be32

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 15:51:58 -08:00
Vipul Pandya
b3de6cfebc RDMA/cxgb4: Insert hwtid in pass_accept_req instead in pass_establish
CPL_ABORT_REQ_RSS can come before TCP connection is established.  In
such case peer_abort was trying to remove the hwtid, which was not
inserted.  To avoid this we insert the hwtid when we are sure that we
are surely going to send passive accept request.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 15:51:58 -08:00
Vipul Pandya
7c0a33d611 RDMA/cxgb4: Don't wakeup threads for MPAv2
Don't wakeup threads blocked in rdma_init/rdma_fini if we are on
MPAv2, and want to retry connection with MPAv1.

Stop ep-timer on getting MPA version mismatch, before doing the
abort_connection - in process_mpa_request.

Take care to stop ep-timer in error paths for process_mpa_request.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 15:51:57 -08:00
Vipul Pandya
fe7e0a4dd0 RDMA/cxgb4: Don't reconnect on abort for mpa_rev 1
Only reconnect if the endpoint wasn't freed.

peer_abort() should only attempt to reconnect if the endpoint wasn't
freed.  Also remove hwtid from the debugfs idr.

Add missing check for peer2peer in MPAv2 code

Use correct mpa version on reject.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 15:51:57 -08:00
Vipul Pandya
1ec779cc29 RDMA/cxgb4: Fix endpoint timeout race condition
The endpoint timeout logic had a race that could cause an endpoint
object to be freed while it was still on the timedout list.  This
can happen if the timer is stopped after it had fired, but before
the timedout thread processed the endpoint timeout.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 15:51:57 -08:00
Vipul Pandya
e8e5b9278b RDMA/cxgb4: Only log rx_data warnings if cpl status is non-zero
With newer firmware, we can get streaming data due to connection
errors before the driver moves the QP out of RTS.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 15:51:56 -08:00
Vipul Pandya
04236df2a5 RDMA/cxgb4: Always log async errors
Log AEs even if the QP isn't in RTS.  It is useful information.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 15:51:56 -08:00
Vipul Pandya
325abead6c RDMA/cxgb4: Keep QP referenced until TID released
The driver is currently releasing the last ref on the QP too early.
This can cause bus errors due to HW still fetching WRs from the HW
queue.  The fix is to keep a qp ref until we release the HW TID.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 15:51:56 -08:00
Vipul Pandya
1557967bf9 RDMA/cxgb4: Display streaming mode error only if detected in RTS
With later firmware, the chances of getting streaming mode data after
we exit RTS is likely, so we don't need to warn for it.  The only real
case where we don't expect it is when the QP is in RTS.

Move QP to ERROR when streaming mode data received.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 15:51:55 -08:00
Vipul Pandya
91e9c07195 RDMA/cxgb4: Abort connections when moving to ERROR state
If a FINI operation fails, then we need to ABORT instead of CLOSE.
Also, if we ABORT due to unexpected STREAMING data, then wake up
anybody blocked in FINI...

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 15:51:55 -08:00
Vipul Pandya
55abf8df0a RDMA/cxgb4: Abort connections that receive unexpected streaming mode data
This error means the RDMA connection was knocked out of RDMA mode,
probably due to an error on the connection.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-14 15:51:55 -08:00
Vipul Pandya
793dad94e7 RDMA/cxgb4: Fix bug for active and passive LE hash collision path
Retries active opens for INUSE errors.

Logs any active ofld_connect_wr error replies.

Sends ofld_connect_wr on same ctrlq. It needs to go  on the same control txq as
regular CPL active/passive messages.

Retries on active open replies with EADDRINUSE.

Uses active open fw wr only if active filter region is set.

Adds stat for ofld_connect_wr failures.

This patch also adds debugfs file to show endpoints.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-12-19 23:03:12 -08:00
Vipul Pandya
1cab775c3e RDMA/cxgb4: Fix LE hash collision bug for passive open connection
It establishes passive open connection through firmware work request. Passive
open connection will go through this path as now instead of listening server we
create a server filter which will redirect the incoming SYN packet to the
offload queue. After this driver tries to establish the connection using
firmware work request.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-12-19 23:03:11 -08:00
Vipul Pandya
5be78ee924 RDMA/cxgb4: Fix LE hash collision bug for active open connection
It enables establishing active open connection using fw_ofld_connection work
request when cpl_act_open_rpl says TCAM full error which may be because
of LE hash collision. Current support is only for IPv4 active open connections.

Sets ntuple bits in active open requests. For T4 firmware greater than 1.4.10.0
ntuple bits are required to be set.

Adds nocong and enable_ecn module parameter options.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>

[ Move all FW return values to t4fw_api.h.  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-12-19 23:02:43 -08:00
Julia Lawall
76f267b7ad RDMA/cxgb4: use WARN
Use WARN rather than printk followed by WARN_ON(1), for conciseness.

A simplified version of the semantic patch that makes this transformation
is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression list es;
@@

-printk(
+WARN(1,
  es);
-WARN_ON(1);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-11-26 11:07:18 -08:00
Thadeu Lima de Souza Cascardo
32c631f9f2 RDMA/cxgb4: Don't free chunk that we have failed to allocate
In the error path of registering memory when there's a failure to
allocate a chunk from the memory pool, we try to free the same chunk
we just failed to allocate, which will BUG().

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-10-22 11:05:00 -07:00
Linus Torvalds
7a9a2970b5 First batch of InfiniBand/RDMA changes for the 3.7 merge window:
- mlx4 IB support for SR-IOV
  - A couple of SRP initiator fixes
  - Batch of nes hardware driver fixes
  - Fix for long-standing use-after-free crash in IPoIB
  - Other miscellaneous fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABCAAGBQJQav4jAAoJEENa44ZhAt0hmL0QAJTuMdSOzYFd/NB38owJCNM2
 kz/N1GlBm3z98fIlGo8u+lzgV2qxqZSAzsJsouMeK38KiAX3CL8HKe44A1QvTM6v
 dXTNL4JFX24/YF+nlmMY8Av518I9Mkte3BZCnpYkBjVFBWe0ePwoRC/btfBXPDIV
 0snq4OtjoBAn00dOOyuZ5PoyY9xf0z4UB0Gple9sM4mzEb8wVWdNDDPOiuPJc6fA
 L+gk6HLkZDg54+QswafdKYwpeTq45wIKLmCdS3oUNmppMLVhZY8rECOwzSa+KiTr
 /Yo19n+zl+IBlvjQHhmUqGHvdD17PaGlr+TckAsQqmVfXUH5qqpEnkF8FoEK59c5
 YA3lVU8Sj4BPhJ7qX54CuN3767mZizakkxCr9iPRzABFTgzWVgcSgCrE8jjx4i0h
 Pam+L5bmANFStgmGR8PmXiNgcrCUcEqYHsOWDDAnHa5ekb2nyv1JL1c18hlY9hC3
 Xb1YTMZFwvofGza89hBu7oHrMbLOUc5kW2lBpvUn2nlyf3i0F8ISlVbVbNjFA54p
 60/jHa2VOQ2CcJUJKnJOk4ajOOEfHnPtMn2q96XJ69Dp8+eSYEO/G+0i1OlChq4h
 ClnG0Yp+NkT1o8WXMd7guDR+RsXt+DXIij5TiUWRIqnIlopIsMTRhNH28tMu4jQL
 fgN5n987wru91ewdX4gW
 =PAcy
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull infiniband updates from Roland Dreier:
 "First batch of InfiniBand/RDMA changes for the 3.7 merge window:
   - mlx4 IB support for SR-IOV
   - A couple of SRP initiator fixes
   - Batch of nes hardware driver fixes
   - Fix for long-standing use-after-free crash in IPoIB
   - Other miscellaneous fixes"

This merge also removes a new use of __cancel_delayed_work(), and
replaces it with the regular cancel_delayed_work() that is now irq-safe
thanks to the workqueue updates.

That said, I suspect the sequence in question should probably use
"mod_delayed_work()".  I just did the minimal "don't use deprecated
functions" fixup, though.

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (45 commits)
  IB/qib: Fix local access validation for user MRs
  mlx4_core: Disable SENSE_PORT for multifunction devices
  mlx4_core: Clean up enabling of SENSE_PORT for older (ConnectX-1/-2) HCAs
  mlx4_core: Stash PCI ID driver_data in mlx4_priv structure
  IB/srp: Avoid having aborted requests hang
  IB/srp: Fix use-after-free in srp_reset_req()
  IB/qib: Add a qib driver version
  RDMA/nes: Fix compilation error when nes_debug is enabled
  RDMA/nes: Print hardware resource type
  RDMA/nes: Fix for crash when TX checksum offload is off
  RDMA/nes: Cosmetic changes
  RDMA/nes: Fix for incorrect MSS when TSO is on
  RDMA/nes: Fix incorrect resolving of the loopback MAC address
  mlx4_core: Fix crash on uninitialized priv->cmd.slave_sem
  mlx4_core: Trivial cleanups to driver log messages
  mlx4_core: Trivial readability fix: "0X30" -> "0x30"
  IB/mlx4: Create paravirt contexts for VFs when master IB driver initializes
  mlx4: Modify proxy/tunnel QP mechanism so that guests do no calculations
  mlx4: Paravirtualize Node Guids for slaves
  mlx4: Activate SR-IOV mode for IB
  ...
2012-10-02 17:20:40 -07:00
Emil Goode
c079c28714 RDMA/cxgb4: Fix error handling in create_qp()
The variable ret is assigned return values in a couple of places, but
its value is never returned.  This patch makes use of the ret variable
so that the caller get correct error codes returned.

The following changes are also introduced:

- The alloc_oc_sq function can return -ENOSYS or -ENOMEM so we want to
  get the return value from it.

- Change the label names to improve readability.

Signed-off-by: Emil Goode <emilgoode@gmail.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-30 20:32:14 -07:00
David S. Miller
6a06e5e1bb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/team/team.c
	drivers/net/usb/qmi_wwan.c
	net/batman-adv/bat_iv_ogm.c
	net/ipv4/fib_frontend.c
	net/ipv4/route.c
	net/l2tp/l2tp_netlink.c

The team, fib_frontend, route, and l2tp_netlink conflicts were simply
overlapping changes.

qmi_wwan and bat_iv_ogm were of the "use HEAD" variety.

With help from Antonio Quartulli.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-28 14:40:49 -04:00
Wei Yongjun
92dd6c3d4d RDMA/cxgb4: Move dereference below NULL test
spatch with a semantic match is used to found this.
(http://coccinelle.lip6.fr/)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-09-07 16:19:03 -07:00
Vipul Pandya
e5619c120d RDMA/cxgb4: Update RDMA/cxgb4 due to macro definition removal in cxgb4 driver
cxgb4 driver has duplicate definitions of registers which will be removed. This
patch updates the RDMA/cxgb4 driver accordingly.

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Reviewed-by: Sivakumar Subramani <sivasu@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-05 17:41:40 -04:00
Roland Dreier
f747c34af4 RDMA/cxgb4: Fix endianness of addition to mpa->private_data_size
sparse correctly warns that if mpa->private_data_size is __be16, then
doing += on it is wrong, even if we do += htons(<something>) -- on a
little endian system, carries will go the wrong way.  Fix this up by
doing the addition in native byte order.

Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-07-08 18:02:33 -07:00
Thadeu Lima de Souza Cascardo
71b43fd573 RDMA/cxgb4: Fix crash when peer address is 0.0.0.0
When using rping -c -a 0.0.0.0 with iw_cxgb4, the system crashes when
rdma_connect() is called.  ip_dev_find() will return NULL, but pdev is
accessed anyway.

Checking that pdev is NULL and returning -ENODEV prevents the system
from crashing.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-06-03 22:59:15 -07:00
Vipul Pandya
e572568fbc RDMA/cxgb4: Include vmalloc.h for vmalloc and vfree
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-21 09:00:34 -07:00
Vipul Pandya
67bbc05512 RDMA/cxgb4: Add query_qp support
This allows querying the QP state before flushing.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18 13:22:37 -07:00
Vipul Pandya
ec3eead217 RDMA/cxgb4: Remove kfifo usage
Using kfifos for ID management was limiting the number of QPs and
preventing NP384 MPI jobs.  So replace it with a simple bitmap
allocator.

Remove IDs from the IDR tables before deallocating them.  This bug was
causing the BUG_ON() in insert_handle() to fire because the ID was
getting reused before being removed from the IDR table.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18 13:22:36 -07:00
Vipul Pandya
d716a2a014 RDMA/cxgb4: Use vmalloc() for debugfs QP dump
This allows dumping thousands of QPs.  Log active open failures of
interest.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18 13:22:35 -07:00
Vipul Pandya
422eea0a8c RDMA/cxgb4: DB Drop Recovery for RDMA and LLD queues
Add module option db_fc_threshold which is the count of active QPs
that trigger automatic db flow control mode.  Automatically transition
to/from flow control mode when the active qp count crosses
db_fc_theshold.

Add more db debugfs stats

On DB DROP event from the LLD, recover all the iwarp queues.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18 13:22:33 -07:00
Vipul Pandya
4984037bef RDMA/cxgb4: Disable interrupts in c4iw_ev_dispatch()
Use GFP_ATOMIC in _insert_handle() if ints are disabled.

Don't panic if we get an abort with no endpoint found.  Just log a
warning.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18 13:22:32 -07:00
Vipul Pandya
2c97478106 RDMA/cxgb4: Add DB Overflow Avoidance
Get FULL/EMPTY/DROP events from LLD.  On FULL event, disable normal
user mode DB rings.

Add modify_qp semantics to allow user processes to call into the
kernel to ring doobells without overflowing.

Add DB Full/Empty/Drop stats.

Mark queues when created indicating the doorbell state.

If we're in the middle of db overflow avoidance, then newly created
queues should start out in this mode.

Bump the C4IW_UVERBS_ABI_VERSION to 2 so the user mode library can
know if the driver supports the kernel mode db ringing.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18 13:22:31 -07:00
Vipul Pandya
8d81ef34b2 RDMA/cxgb4: Add debugfs RDMA memory stats
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-18 13:22:29 -07:00
Steve Wise
14b9222808 RDMA/cxgb4: Drop peer_abort when no endpoint found
Log a warning and drop the abort message.  Otherwise we will do a
bogus wake_up() and crash.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-15 09:46:09 -07:00
Steve Wise
0f1dcfae6b RDMA/cxgb4: Always wake up waiters in c4iw_peer_abort_intr()
This fixes a race where an ingress abort fails to wake up the thread
blocked in rdma_init() causing the app to hang.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-15 09:46:04 -07:00
Steve Wise
bd61baaf59 RDMA/cxgb4: Use dst parameter in import_ep()
Function import_ep() is incorrectly using ep->dst instead of the dst
ptr passed in.  This causes a crash when accepting new rdma connections
becase ep->dst is not initialized yet.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-05-08 11:17:04 -07:00
Linus Torvalds
0c2fe82a9b InfiniBand/RDMA changes for the 3.4 merge window. Nothing big really
stands out; by patch count lots of fixes to the mlx4 driver plus some
 cleanups and fixes to the core and other drivers.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABCAAGBQJPZ2THAAoJEENa44ZhAt0hMgkP/AwXjwzz6lYlIizytQ5KOH11
 CT/VoHLIGIwY31aRhZRHggWLEbJi8akgfh04K9PiSh/muFRPYk5KJDoQEoJV5Odv
 12krqtpZeL41YcAPq0wiyP3Ki5AZDCDumzWbOtmxE9m93mVfi3yFTTWDHFo/7SOx
 A2kUITdKuZhgBpCFYS23Gs8QTWcMPUlwNlM76edG90BjSySHsK15tbqTUaEOC6Ux
 SPG0p0c2YjlZCPmRObrCL65o5LFRVajinZ4rXN7rre4LEU+IuXgW+mQU2Kwy8z6a
 oMPE2l4OMSqj270r2PlVK087gGH69nKfLgLQLJiP0Id+KwdYUk7vQcA+Fu0jwjP3
 Ci/ahllvB76IiaNbax0vTy2ohCJmilLLotAP46NW0vPR2/KnmVy0Mqk8pXQQddw4
 tnAFNnafn/MjTty8GwqyXR/enJZs29ePcSxcYnKjXYRIgG4ldejL2wktgSra8+gb
 3/qFag1HRgZzcICkXGpHj7uWJ2KDe++1m6KTb0THUmrjuiel6ATFZpYoeRed3GfS
 ZMqpjZvuXM8nA9CrKMkzm5vIiUFF8Qq0hUTLR38F8FiNEhmC08JqH5PqjJ3Z18if
 R1yG4W4xmp/0ICHg4zyg6LWV3YsweDD+jSCJpW+KNBvu3HtYb41HIlK+i2TTmtPD
 3etEZZRwROAyYKcwN01p
 =DbLx
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull InfiniBand/RDMA changes for the 3.4 merge window from Roland Dreier:
 "Nothing big really stands out; by patch count lots of fixes to the
  mlx4 driver plus some cleanups and fixes to the core and other
  drivers."

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (28 commits)
  mlx4_core: Scale size of MTT table with system RAM
  mlx4_core: Allow dynamic MTU configuration for IB ports
  IB/mlx4: Fix info returned when querying IBoE ports
  IB/mlx4: Fix possible missed completion event
  mlx4_core: Report thermal error events
  mlx4_core: Fix one more static exported function
  IB: Change CQE "csum_ok" field to a bit flag
  RDMA/iwcm: Reject connect requests if cmid is not in LISTEN state
  RDMA/cxgb3: Don't pass irq flags to flush_qp()
  mlx4_core: Get rid of redundant ext_port_cap flags
  RDMA/ucma: Fix AB-BA deadlock
  IB/ehca: Fix ilog2() compile failure
  IB: Use central enum for speed instead of hard-coded values
  IB/iser: Post initial receive buffers before sending the final login request
  IB/iser: Free IB connection resources in the proper place
  IB/srp: Consolidate repetitive sysfs code
  IB/srp: Use pr_fmt() and pr_err()/pr_warn()
  IB/core: Fix SDR rates in sysfs
  mlx4: Enforce device max FMR maps in FMR alloc
  IB/mlx4: Set bad_wr for invalid send opcode
  ...
2012-03-21 10:33:42 -07:00
Roland Dreier
f0e88aeb19 Merge branches 'cma', 'cxgb3', 'cxgb4', 'ehca', 'iser', 'mad', 'nes', 'qib', 'srp' and 'srpt' into for-next 2012-03-19 09:50:33 -07:00
Or Gerlitz
2e96691c31 IB: Use central enum for speed instead of hard-coded values
The kernel IB stack uses one enumeration for IB speed, which wasn't
explicitly specified in the verbs header file.  Add that enum, and use
it all over the code.

The IB speed/width notation is also used by iWARP and IBoE HW drivers,
which use the convention of rate = speed * width to advertise their
port link rate.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-05 09:25:16 -08:00
Kumar Sanghvi
91018f8632 RDMA/cxgb4: Add missing peer2peer check in MPAv2 code
Don't worry about p2p_type if peer2peer itself is not requested in the
first place.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-02-25 17:45:02 -08:00
David Miller
64b7007eb9 infiniband: cxgb4: Convert import_ep() over to dst_neigh_lookup().
Now we must provide the IP destination address, and a reference has
to be dropped when we're done with the entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-25 21:30:37 -05:00
David Miller
3786cf189f infiniband: cxgb4: Consolidate 3 copies of the same operation into 1 helper function.
Three pieces of code do the same thing, create a l2t entry and then
import this information into the c4iw_ep object.

Create a helper function and call it from these 3 locations instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Roland Dreier <roland@purestorage.com>
2011-12-05 15:20:20 -05:00
David Miller
2721745501 net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}.
To reflect the fact that a refrence is not obtained to the
resulting neighbour entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Roland Dreier <roland@purestorage.com>
2011-12-05 15:20:19 -05:00
Roland Dreier
a493f1a24a Merge branches 'cxgb4', 'ipoib', 'misc' and 'qib' into for-next 2011-11-29 18:01:53 -08:00
Eric Dumazet
580da35a31 IB: Fix RCU lockdep splats
Commit f2c31e32b3 ("net: fix NULL dereferences in check_peer_redir()")
forgot to take care of infiniband uses of dst neighbours.

Many thanks to Marc Aurele who provided a nice bug report and feedback.

Reported-by: Marc Aurele La France <tsi@ualberta.ca>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-29 13:37:11 -08:00
Kumar Sanghvi
01b225e18f RDMA/cxgb4: Fix retry with MPAv1 logic for MPAv2
Fix logic so that we don't retry with MPAv1 once we have done that
already.  Otherwise, we end up retrying with MPAv1 even when its not
needed on getting peer aborts - and this could lead to kernel panic.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-28 11:58:07 -08:00
Jonathan Lallinger
c34c97ad8c RDMA/cxgb4: Fix iw_cxgb4 count_rcqes() logic
Fix another place in the code where logic dealing with the t4_cqe was
using the wrong QID.  This fixes the counting logic so that it tests
against the SQ QID instead of the RQ QID when counting RCQES.

Signed-off by: Jonathan Lallinger <jonathan@ogc.us>
Signed-off by: Steve Wise <swise@ogc.us>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-28 11:53:05 -08:00
Linus Torvalds
32aaeffbd4 Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include <linux/module.h>
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include <linux/module.h>
  net: sch_generic remove redundant use of <linux/module.h>
  net: inet_timewait_sock doesnt need <linux/module.h>
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
2011-11-06 19:44:47 -08:00
Linus Torvalds
f470f8d4e7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (62 commits)
  mlx4_core: Deprecate log_num_vlan module param
  IB/mlx4: Don't set VLAN in IBoE WQEs' control segment
  IB/mlx4: Enable 4K mtu for IBoE
  RDMA/cxgb4: Mark QP in error before disabling the queue in firmware
  RDMA/cxgb4: Serialize calls to CQ's comp_handler
  RDMA/cxgb3: Serialize calls to CQ's comp_handler
  IB/qib: Fix issue with link states and QSFP cables
  IB/mlx4: Configure extended active speeds
  mlx4_core: Add extended port capabilities support
  IB/qib: Hold links until tuning data is available
  IB/qib: Clean up checkpatch issue
  IB/qib: Remove s_lock around header validation
  IB/qib: Precompute timeout jiffies to optimize latency
  IB/qib: Use RCU for qpn lookup
  IB/qib: Eliminate divide/mod in converting idx to egr buf pointer
  IB/qib: Decode path MTU optimization
  IB/qib: Optimize RC/UC code by IB operation
  IPoIB: Use the right function to do DMA unmap pages
  RDMA/cxgb4: Use correct QID in insert_recv_cqe()
  RDMA/cxgb4: Make sure flush CQ entries are collected on connection close
  ...
2011-11-01 10:51:38 -07:00
Roland Dreier
504255f8d0 Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'fdr', 'ipath', 'ipoib', 'misc', 'mlx4', 'misc', 'nes', 'qib' and 'xrc' into for-next 2011-11-01 09:37:08 -07:00
Paul Gortmaker
e4dd23d753 infiniband: Fix up module files that need to include module.h
They had been getting it implicitly via device.h but we can't
rely on that for the future, due to a pending cleanup so fix
it now.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:35 -04:00
Tom Tucker
d32ae393db RDMA/cxgb4: Mark QP in error before disabling the queue in firmware
QPs need to be moved to error before telling the firwmare to shutdown
the queue.  Otherwise, the application can submit WRs that will never
get fetched by the hardware and never flushed by the driver.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Acked-by: Steve Wise <swsie@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-31 11:36:08 -07:00
Kumar Sanghvi
581bbe2cd0 RDMA/cxgb4: Serialize calls to CQ's comp_handler
Commit 01e7da6ba5 ("RDMA/cxgb4: Make sure flush CQ entries are
collected on connection close") introduced a potential problem where a
CQ's comp_handler can get called simultaneously from different places
in the iw_cxgb4 driver.  This does not comply with
Documentation/infiniband/core_locking.txt, which states that at a
given point of time, there should be only one callback per CQ should
be active.

This problem was reported by Parav Pandit <Parav.Pandit@Emulex.Com>.
Based on discussion between Parav Pandit and Steve Wise, this patch
fixes the above problem by serializing the calls to a CQ's
comp_handler using a spin_lock.

Reported-by: Parav Pandit <Parav.Pandit@Emulex.Com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-31 11:34:53 -07:00
Jonathan Lallinger
e14d62c05c RDMA/cxgb4: Use correct QID in insert_recv_cqe()
When creating flushed receive CQEs, set the QPID field in the t4_cqe
to the SQ QID and not the RQ QID.  Otherwise the poll code will not
find the correct QP context.

Signed-off by: Jonathan Lallinger <jonathan@ogc.us>
Signed-off by: Steve Wise <swise@ogc.us>

Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-14 14:23:40 -07:00
Kumar Sanghvi
01e7da6ba5 RDMA/cxgb4: Make sure flush CQ entries are collected on connection close
At the time when a peer closes the connection, iw_cxgb4 will not send
a cq event if ibqp.uobject exists.  In that case, its possible for a
user application to get blocked in ibv_get_cq_event().

To resolve this, call the cq's comp_handler to unblock any read from
ibv_get_cq_event().  This will trigger userspace to poll the cq and
collect flush status completions for any pending work requests.

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-14 14:23:04 -07:00
Kumar Sanghvi
d2fe99e86b RDMA/cxgb4: Add support for MPAv2 Enhanced RDMA Negotiation
This patch adds support for Enhanced RDMA Connection Establishment
(draft-ietf-storm-mpa-peer-connect-06), aka MPAv2.  Details of draft
can be obtained from:
<http://www.ietf.org/id/draft-ietf-storm-mpa-peer-connect-06.txt>

The patch updates the following functions for initiator perspective:
 - send_mpa_request
 - process_mpa_reply
 - post_terminate for TERM error codes
 - destroy_qp for TERM related change
 - adds layer/etype/ecode to c4iw_qp_attrs for sending with TERM
 - peer_abort for retrying connection attempt with MPA_v1 message
 - added c4iw_reconnect function

The patch updates the following functions for responder perspective:
 - process_mpa_request
 - send_mpa_reply
 - c4iw_accept_cr
 - passes ird/ord to upper layers

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-06 09:39:24 -07:00
Steve Wise
9efe10a1e1 RDMA/cxgb4: Fail RDMA initialization for unsupported cards
The iw_cxgb4 module crashes at init time if the T4 card does not
support RDMA.  So clean up the init logic to correctly deal with
non-RDMA cards.

 - If any RDMA resources are not available, then fail the initialization
   logging an info message.
 - Clean up properly on initialization failures.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-06 09:32:44 -07:00
Jeff Kirsher
f7917c009c chelsio: Move the Chelsio drivers
Moves the drivers for the Chelsio chipsets into
drivers/net/ethernet/chelsio/ and the necessary Kconfig and Makefile
changes.

CC: Divy Le Ray <divy@chelsio.com>
CC: Dimitris Michailidis <dm@chelsio.com>
CC: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-08-10 19:54:52 -07:00
Arun Sharma
60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Linus Torvalds
ece236ce2f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (26 commits)
  IB/qib: Defer HCA error events to tasklet
  mlx4_core: Bump the driver version to 1.0
  RDMA/cxgb4: Use printk_ratelimited() instead of printk_ratelimit()
  IB/mlx4: Support PMA counters for IBoE
  IB/mlx4: Use flow counters on IBoE ports
  IB/pma: Add include file for IBA performance counters definitions
  mlx4_core: Add network flow counters
  mlx4_core: Fix location of counter index in QP context struct
  mlx4_core: Read extended capabilities into the flags field
  mlx4_core: Extend capability flags to 64 bits
  IB/mlx4: Generate GID change events in IBoE code
  IB/core: Add GID change event
  RDMA/cma: Don't allow IPoIB port space for IBoE
  RDMA: Allow for NULL .modify_device() and .modify_port() methods
  IB/qib: Update active link width
  IB/qib: Fix potential deadlock with link down interrupt
  IB/qib: Add sysfs interface to read free contexts
  IB/mthca: Remove unnecessary read of PCI_CAP_ID_EXP
  IB/qib: Remove double define
  IB/qib: Remove unnecessary read of PCI_CAP_ID_EXP
  ...
2011-07-22 14:50:12 -07:00
Roland Dreier
4460207561 Merge branches 'cma', 'cxgb4', 'ipath', 'misc', 'mlx4', 'mthca', 'qib' and 'srp' into for-next 2011-07-22 11:56:11 -07:00
Manuel Zerpies
3cbe182a5b RDMA/cxgb4: Use printk_ratelimited() instead of printk_ratelimit()
Since printk_ratelimit() shouldn't be used anymore (see comment in
include/linux/printk.h), replace it with printk_ratelimited().

Signed-off-by: Manuel Zerpies <manuel.f.zerpies@ww.stud.uni-erlangen.de>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-07-18 21:18:39 -07:00
Bart Van Assche
10e1b54bbb RDMA: Allow for NULL .modify_device() and .modify_port() methods
These methods don't make sense for iWARP devices, so rather than
forcing them to implement stubs, just return -ENOSYS in the core if
the hardware driver doesn't set .modify_device and/or .modify_port.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-07-18 16:44:30 -07:00
David S. Miller
69cce1d140 net: Abstract dst->neighbour accesses behind helpers.
dst_{get,set}_neighbour()

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-17 23:11:35 -07:00
Steve Wise
8da7e7a552 RDMA/cxgb4: Couple of abort fixes
- fix a race where the driver could end up sending a close_con_req
  after an abort_rpl.  In c4iw_ep_disconnect(), send abort or close
  request with the ep mutex held.

- fix a hang where driver fails to wake up when a connection is reset
  during a normal close.  Wake up any waiters in the interrupt path,
  and correctly cleanup after rdma_fini() failures.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-06-17 11:54:56 -07:00
Steve Wise
301c2c3f03 RDMA/cxgb4: Don't truncate MR lengths
Remove left-over code from T3 that limited MR sizes to 32b.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-06-17 11:54:50 -07:00
Steve Wise
2ff7d09a1b RDMA/cxgb4: Don't exceed hw IQ depth limit for user CQs
Memory allocated for user CQs gets rounded up to the next page
boundary.  And after rounding, we recalculate the resulting IQ depth
and we need to make sure we don't exceed the HW limits.

This bug can result a much smaller CQ allocated than was expected if
the HW size field is exceeded, resulting in CQ overflow failures.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-06-17 11:52:45 -07:00
Linus Torvalds
4c171acc20 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/cma: Save PID of ID's owner
  RDMA/cma: Add support for netlink statistics export
  RDMA/cma: Pass QP type into rdma_create_id()
  RDMA: Update exported headers list
  RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h>
  RDMA/nes: Add a check for strict_strtoul()
  RDMA/cxgb3: Don't post zero-byte read if endpoint is going away
  RDMA/cxgb4: Use completion objects for event blocking
  IB/srp: Fix integer -> pointer cast warnings
  IB: Add devnode methods to cm_class and umad_class
  IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required
  IB/uverbs: Add devnode method to set path/mode
  RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node
  RDMA: Add netlink infrastructure
  RDMA: Add error handling to ib_core_init()
2011-05-26 12:13:57 -07:00
Steve Wise
c337374bf2 RDMA/cxgb4: Use completion objects for event blocking
There exists a race condition when using wait_queue_head_t objects
that are declared on the stack.  This was being done in a few places
where we are sending work requests to the FW and awaiting replies, but
we don't have an endpoint structure with an embedded c4iw_wr_wait
struct.  So the code was allocating it locally on the stack.  Bad
design.  The race is:

  1) thread on cpuX declares the wait_queue_head_t on the stack, then
     posts a firmware WR with that wait object ptr as the cookie to be
     returned in the WR reply.  This thread will proceed to block in
     wait_event_timeout() but before it does:

  2) An interrupt runs on cpuY with the WR reply.  fw6_msg() handles
     this and calls c4iw_wake_up().  c4iw_wake_up() sets the condition
     variable in the c4iw_wr_wait object to TRUE and will call
     wake_up(), but before it calls wake_up():

  3) The thread on cpuX calls c4iw_wait_for_reply(), which calls
     wait_event_timeout().  The wait_event_timeout() macro checks the
     condition variable and returns immediately since it is TRUE.  So
     this thread never blocks/sleeps. The function then returns
     effectively deallocating the c4iw_wr_wait object that was on the
     stack.

  4) So at this point cpuY has a pointer to the c4iw_wr_wait object
     that is no longer valid.  Further its pointing to a stack frame
     that might now be in use by some other context/thread.  So cpuY
     continues execution and calls wake_up() on a ptr to a wait object
     that as been effectively deallocated.

This race, when it hits, can cause a crash in wake_up(), which I've
seen under heavy stress. It can also corrupt the referenced stack
which can cause any number of failures.

The fix:

Use struct completion, which supports on-stack declarations.
Completions use a spinlock around setting the condition to true and
the wake up so that steps 2 and 4 above are atomic and step 3 can
never happen in-between.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
2011-05-24 09:47:38 -07:00
Linus Torvalds
06f4e926d2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits)
  macvlan: fix panic if lowerdev in a bond
  tg3: Add braces around 5906 workaround.
  tg3: Fix NETIF_F_LOOPBACK error
  macvlan: remove one synchronize_rcu() call
  networking: NET_CLS_ROUTE4 depends on INET
  irda: Fix error propagation in ircomm_lmp_connect_response()
  irda: Kill set but unused variable 'bytes' in irlan_check_command_param()
  irda: Kill set but unused variable 'clen' in ircomm_connect_indication()
  rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport()
  be2net: Kill set but unused variable 'req' in lancer_fw_download()
  irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication()
  atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined.
  rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer().
  rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler()
  rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection()
  rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window()
  pkt_sched: Kill set but unused variable 'protocol' in tc_classify()
  isdn: capi: Use pr_debug() instead of ifdefs.
  tg3: Update version to 3.119
  tg3: Apply rx_discards fix to 5719/5720
  ...

Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c
as per Davem.
2011-05-20 13:43:21 -07:00
Benjamin Herrenschmidt
880102e785 Merge remote branch 'origin/master' into merge
Manual merge of arch/powerpc/kernel/smp.c and add missing scheduler_ipi()
call to arch/powerpc/platforms/cell/interrupt.c

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-20 15:36:52 +10:00
Steve Wise
2f25e9a540 RDMA/cxgb4: EEH errors can hang the driver
A few more EEH fixes:

c4iw_wait_for_reply(): detect fatal EEH condition on timeout and
return an error.

The iw_cxgb4 driver was only calling ib_deregister_device() on an EEH
event followed by a ib_register_device() when the device was
reinitialized.  However, the RDMA core doesn't allow multiple
iterations of register/deregister by the provider. See
drivers/infiniband/core/sysfs.c: ib_device_unregister_sysfs() where
the kobject ref is held until the device is deallocated in
ib_deallocate_device().  Calling deregister adds this kobj reference,
and then a subsequent register call will generate a WARN_ON() from the
kobject subsystem because the kobject is being initialized but is
already initialized with the ref held.

So the provider must deregister and dealloc when resetting for an EEH
event, then alloc/register to re-initialize.  To do this, we cannot
use the device ptr as our ULD handle since it will change with each
reallocation.  This commit adds a ULD context struct which is used as
the ULD handle, and then contains the device pointer and other state
needed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:23 -07:00
Steve Wise
d9594d990a RDMA/cxgb4: Reset wait condition atomically
The driver was never really waiting for RDMA_WR/FINI completions
because the condition variable used to determine if the completion
happened was never reset, and this condition variable is reused for
both connection setup and teardown.  This causes various driver
crashes under heavy loads due to releasing resources too early.

The fix is to use atomic bits to correctly reset the condition
immediately after the completion is detected.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:22 -07:00
Roel Kluin
85d215b0f3 RDMA/cxgb4: Fix missing parentheses
Parens are missing: '|' has a higher presedence than '?'.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:22 -07:00
Steve Wise
bbe9a0a2bc RDMA/cxgb4: Initialization errors can cause crash
c4iw_uld_add() must return ERR_PTR() values instead of NULL on failure.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:22 -07:00
Steve Wise
30c95c2d49 RDMA/cxgb4: Don't change QP state outside EP lock
Concurrent ingress CLOSE and ULP ABORT operations causes a crash due
to a race condition where the close path releases the EP lock and then
tries to move the QP state to CLOSED.  This must be done inside the EP
lock to avoid the race.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:22 -07:00
David S. Miller
31e4543db2 ipv4: Make caller provide on-stack flow key to ip_route_output_ports().
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-03 20:25:42 -07:00
Nishanth Aravamudan
e297d9dd5c cxgb4: use pgprot_writecombine() on powerpc
Commit fe3cc0d99d ("powerpc: Add
pgprot_writecombine") in benh's tree exposes the pgprot_writecombine()
API to drivers on powerpc. cxgb4 has an open-coded version of the same,
so use the common API now that it's available.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Anton Blanchard <anton@samba.org>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:25 +10:00
Linus Torvalds
7a6362800c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits)
  bonding: enable netpoll without checking link status
  xfrm: Refcount destination entry on xfrm_lookup
  net: introduce rx_handler results and logic around that
  bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
  bonding: wrap slave state work
  net: get rid of multiple bond-related netdevice->priv_flags
  bonding: register slave pointer for rx_handler
  be2net: Bump up the version number
  be2net: Copyright notice change. Update to Emulex instead of ServerEngines
  e1000e: fix kconfig for crc32 dependency
  netfilter ebtables: fix xt_AUDIT to work with ebtables
  xen network backend driver
  bonding: Improve syslog message at device creation time
  bonding: Call netif_carrier_off after register_netdevice
  bonding: Incorrect TX queue offset
  net_sched: fix ip_tos2prio
  xfrm: fix __xfrm_route_forward()
  be2net: Fix UDP packet detected status in RX compl
  Phonet: fix aligned-mode pipe socket buffer header reserve
  netxen: support for GbE port settings
  ...

Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c
with the staging updates.
2011-03-16 16:29:25 -07:00
Steve Wise
db5d040d7b RDMA/cxgb4: Debugfs dump_qp() updates
- Show whether the SQ is in onchip memory or not.
- Dump both SQ and RQ QIDs.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:14 -07:00
Steve Wise
767fbe8151 RDMA/cxgb4: Dispatch FATAL event on EEH errors
This at least kicks the user mode applications that are watching for
device events.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:13 -07:00
Steve Wise
b48f3b9c10 RDMA/cxgb4: Use ULP_MODE_TCPDDP
Set the ULP mode for initial RDMA connection setup to the proper DDP
mode.  This avoids wasting some HW resources while in streaming mode.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:12 -07:00
Steve Wise
a9c7719800 RDMA/cxgb4: Enable on-chip SQ support by default
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:12 -07:00
Steve Wise
ffc3f7487f RDMA/cxgb4: Do CIDX_INC updates every 1/16 CQ depth CQE reaps
This avoids the CIDX_INC overflow issue with T4A2 when running
kernel RDMA applications.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:11 -07:00
Steve Wise
2942813739 RDMA/cxgb4: Remove db_drop_task
Unloading iw_cxgb4 can crash due to the unload code trying to use
db_drop_task, which is uninitialized.  So remove this dead code.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:10 -07:00
Steve Wise
b52fe09e33 RDMA/cxgb4: Turn on delayed ACK
Set the default to on.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-03-14 12:09:09 -07:00
David S. Miller
78fbfd8a65 ipv4: Create and use route lookup helpers.
The idea here is this minimizes the number of places one has to edit
in order to make changes to how flows are defined and used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:42 -08:00
David S. Miller
b23dd4fe42 ipv4: Make output route lookup return rtable directly.
Instead of on the stack.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-02 14:31:35 -08:00
David S. Miller
273447b352 ipv4: Kill can_sleep arg to ip_route_output_flow()
This boolean state is now available in the flow flags.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 14:27:04 -08:00
David S. Miller
420d44daa7 ipv4: Make final arg to ip_route_output_flow to be boolean "can_sleep"
Since that is what the current vague "flags" argument means.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-01 14:19:23 -08:00
Steve Wise
94788657c9 RDMA/cxgb4: Set the correct device physical function for iWARP connections
The PF passed to FW was 0, causing PCI failures in an SR-IOV environment.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-01-28 15:34:28 -08:00
Steve Wise
6a09a9d694 RDMA/cxgb4: Limit MAXBURST EQ context field to 256B
MAXBURST cannot exceed 256B for on-chip queues.  With a 512B MAXBURST,
we can lock up the chip.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-01-28 15:34:24 -08:00
Linus Torvalds
008d23e485 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  Documentation/trace/events.txt: Remove obsolete sched_signal_send.
  writeback: fix global_dirty_limits comment runtime -> real-time
  ppc: fix comment typo singal -> signal
  drivers: fix comment typo diable -> disable.
  m68k: fix comment typo diable -> disable.
  wireless: comment typo fix diable -> disable.
  media: comment typo fix diable -> disable.
  remove doc for obsolete dynamic-printk kernel-parameter
  remove extraneous 'is' from Documentation/iostats.txt
  Fix spelling milisec -> ms in snd_ps3 module parameter description
  Fix spelling mistakes in comments
  Revert conflicting V4L changes
  i7core_edac: fix typos in comments
  mm/rmap.c: fix comment
  sound, ca0106: Fix assignment to 'channel'.
  hrtimer: fix a typo in comment
  init/Kconfig: fix typo
  anon_inodes: fix wrong function name in comment
  fix comment typos concerning "consistent"
  poll: fix a typo in comment
  ...

Fix up trivial conflicts in:
 - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
 - fs/ext4/ext4.h

Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.
2011-01-13 10:05:56 -08:00
Steve Wise
db8b101671 RDMA/cxgb4: Don't re-init wait object in init/fini paths
Re-initializing the wait object in rdma_init()/rdma_fini() causes a
timing window which can lead to a deadlock during close.  Once this
deadlock hits, all RDMA activity over the T4 device will be stuck.

There's no need to re-init the wait object, so remove it.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2011-01-10 17:41:43 -08:00
Stephen Hemminger
c943109163 RDMA/cxgb3,cxgb4: Remove dead code
This removes unused code found by running 'make namespacecheck';
compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2011-01-10 17:41:43 -08:00
Jesper Juhl
e987fa357a infiniband: Only include mutex.h once in drivers/infiniband/hw/cxgb4/iw_cxgb4.h
Only include the header linux/mutex.h once inside
drivers/infiniband/hw/cxgb4/iw_cxgb4.h

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-11-15 14:35:19 +01:00
Linus Torvalds
9e5fca251f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (63 commits)
  IB/qib: clean up properly if pci_set_consistent_dma_mask() fails
  IB/qib: Allow driver to load if PCIe AER fails
  IB/qib: Fix uninitialized pointer if CONFIG_PCI_MSI not set
  IB/qib: Fix extra log level in qib_early_err()
  RDMA/cxgb4: Remove unnecessary KERN_<level> use
  RDMA/cxgb3: Remove unnecessary KERN_<level> use
  IB/core: Add link layer type information to sysfs
  IB/mlx4: Add VLAN support for IBoE
  IB/core: Add VLAN support for IBoE
  IB/mlx4: Add support for IBoE
  mlx4_en: Change multicast promiscuous mode to support IBoE
  mlx4_core: Update data structures and constants for IBoE
  mlx4_core: Allow protocol drivers to find corresponding interfaces
  IB/uverbs: Return link layer type to userspace for query port operation
  IB/srp: Sync buffer before posting send
  IB/srp: Use list_first_entry()
  IB/srp: Reduce number of BUSY conditions
  IB/srp: Eliminate two forward declarations
  IB/mlx4: Signal node desc changes to SM by using FW to generate trap 144
  IB: Replace EXTRA_CFLAGS with ccflags-y
  ...
2010-10-26 17:54:22 -07:00
Roland Dreier
116e9535fe Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'ehca', 'iboe', 'ipoib', 'misc', 'mlx4', 'nes', 'qib' and 'srp' into for-next 2010-10-26 16:09:11 -07:00
Joe Perches
aa1ad26089 RDMA/cxgb4: Remove unnecessary KERN_<level> use
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-26 13:45:59 -07:00
matt mooney
7454159d3c IB: Replace EXTRA_CFLAGS with ccflags-y
Signed-off-by: matt mooney <mfm@muteddisk.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-23 13:45:03 -07:00
Steve Wise
da411ba1da RDMA/cxgb4: Use cxgb4 service for packet gl to skb
Remove the local service t4_pktgl_to_skb() and use cxgb4_pktgl_to_skb()
exported by cxgb4.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-22 21:58:50 -07:00
Steve Wise
de5dd81b49 RDMA/cxgb4: Export T4 TCP MIB
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-22 21:57:26 -07:00
Justin P. Mattock
631dd1a885 Update broken web addresses in the kernel.
The patch below updates broken web addresses in the kernel

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Finn Thain <fthain@telegraphics.com.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Dimitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: Ben Pfaff <blp@cs.stanford.edu>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Reviewed-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-10-18 11:03:14 +02:00
Steve Wise
3160977a6e RDMA/cxgb4: Use simple_read_from_buffer() for debugfs handlers
We can replace our equivalent open-coded version.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-11 20:15:14 -07:00
Steve Wise
8bbac892fb RDMA/cxgb4: Add default_llseek to debugfs files
Incorporate BKL removal changes.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-11 20:14:00 -07:00
Steve Wise
40dbf6ee38 RDMA/cxgb4: Fastreg NSMR fixes
- Remove dsgl support - doesn't work in T4.
- Wrap the immediate PBL as needed when building it in the wr.
- Adjust max pbl depth allowed based on ulptx alignment requirements.
- Bump the slots per SQ to 5 to allow up to 128MB fast registers.
- Advertise fastreg support by default.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:53:50 -07:00
Steve Wise
410ade4c26 RDMA/cxgb4: Don't set completion flag for read requests
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:53:49 -07:00
Steve Wise
98ae68b7ee RDMA/cxgb4: Set the default TCP send window to 128KB
This helps with large IO throughput.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:53:48 -07:00
Steve Wise
2f5b48c3ad RDMA/cxgb4: Use a mutex for QP and EP state transitions
Move the connection setup/teardown paths to the workq thread removing
spin lock/irq disable requirements for these paths.  This allows calls
down to the LLD for EP and QP state transition actions to be atomic
with respect to processing CPL messages coming up from the HW.
Namely, calls to rdma_init() and rdma_fini() can now be called with
the mutex held avoiding many race conditions with the abort path.

The QP spinlock is still used but only to manipulate the qp state.  This
allows the fastpaths, poll, post_send, and pos_recv, to run in the
irq context.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:53:48 -07:00
Steve Wise
c6d7b26791 RDMA/cxgb4: Support on-chip SQs
T4 support on-chip SQs to reduce latency.  This patch adds support for
this in iw_cxgb4:

 - Manage ocqp memory like other adapter mem resources.
 - Allocate user mode SQs from ocqp mem if available.
 - Map ocqp mem to user process using write combining.
 - Map PCIE_MA_SYNC reg to user process.

Bump uverbs ABI.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:35 -07:00
Steve Wise
aadc4df308 RDMA/cxgb4: Centralize the wait logic
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:34 -07:00
Steve Wise
9e8d1fa342 RDMA/cxgb4: debugfs files for dumping active stags
Add "stags" debugfs file.  This is useful for examining the TPTE and
PBL entries in adapter memory.  It allows scripts to dump just the
active entries.

Also clean up the "qps" file handlers and shared common code.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:33 -07:00
Steve Wise
05fb962947 RDMA/cxgb4: Log HW lack-of-resource errors
This helps debug cases where HW resources are depleted.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:33 -07:00
Steve Wise
0e42c1f430 RDMA/cxgb4: Handle CPL_RDMA_TERMINATE messages
T4 FW sends up CPL_RDMA_TERMINATE to indicate a peer TERM.  This
triggers the QP moving to TERMINATE state.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:32 -07:00
Steve Wise
6ff0e343b3 RDMA/cxgb4: Ignore TERMINATE CQEs
T4 incorrectly inserts TERM CQEs into the CQ.  Silently ignore them.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:31 -07:00
Steve Wise
7459486133 RDMA/cxgb4: Ignore positive return values from cxgb4_*_send() functions
The cxgb4_*_send() functions return NET_XMIT_ values, which are
positive integers or negative errno values.  So don't treat positive
return values as an error.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:31 -07:00
Steve Wise
13fecb83b4 RDMA/cxgb4: Zero out ISGL padding
The HW design requires zeroing any pad in SGLs.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:30 -07:00
Steve Wise
af93fb5dcc RDMA/cxgb4: Don't use null ep ptr
In c4iw_modify_qp() error path, only use qhp->ep if ep is not already set.
Otherwise qhp->ep can be NULL and we crash.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 10:46:29 -07:00
Roland Dreier
c8e081a1bf RDMA/cxgb4: Fix warnings about casts to/from pointers of different sizes
Fix:

  drivers/infiniband/hw/cxgb4/qp.c: In function ‘create_qp’:
  drivers/infiniband/hw/cxgb4/qp.c:147: warning: cast from pointer to integer of different size
  drivers/infiniband/hw/cxgb4/qp.c: In function ‘rdma_fini’:
  drivers/infiniband/hw/cxgb4/qp.c:988: warning: cast from pointer to integer of different size
  drivers/infiniband/hw/cxgb4/qp.c: In function ‘rdma_init’:
  drivers/infiniband/hw/cxgb4/qp.c:1063: warning: cast from pointer to integer of different size
  drivers/infiniband/hw/cxgb4/mem.c: In function ‘write_adapter_mem’:
  drivers/infiniband/hw/cxgb4/mem.c:74: warning: cast from pointer to integer of different size
  drivers/infiniband/hw/cxgb4/cq.c: In function ‘destroy_cq’:
  drivers/infiniband/hw/cxgb4/cq.c:58: warning: cast from pointer to integer of different size
  drivers/infiniband/hw/cxgb4/cq.c: In function ‘create_cq’:
  drivers/infiniband/hw/cxgb4/cq.c:135: warning: cast from pointer to integer of different size
  drivers/infiniband/hw/cxgb4/cm.c: In function ‘fw6_msg’:
  drivers/infiniband/hw/cxgb4/cm.c:2326: warning: cast to pointer from integer of different size

by casting pointers to unsigned long instead of u64.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-27 17:51:04 -07:00
Steve Wise
93fb72e443 RDMA/cxgb4: Obtain RDMA QID ranges from LLD/FW
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-07 23:08:47 -07:00
Linus Torvalds
3cc08fc35d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (42 commits)
  IB/qib: Add missing <linux/slab.h> include
  IB/ehca: Drop unnecessary NULL test
  RDMA/nes: Fix confusing if statement indentation
  IB/ehca: Init irq tasklet before irq can happen
  RDMA/nes: Fix misindented code
  RDMA/nes: Fix showing wqm_quanta
  RDMA/nes: Get rid of "set but not used" variables
  RDMA/nes: Read firmware version from correct place
  IB/srp: Export req_lim via sysfs
  IB/srp: Make receive buffer handling more robust
  IB/srp: Use print_hex_dump()
  IB: Rename RAW_ETY to RAW_ETHERTYPE
  RDMA/nes: Fix two sparse warnings
  RDMA/cxgb3: Make needlessly global iwch_l2t_send() static
  IB/iser: Make needlessly global iser_alloc_rx_descriptors() static
  RDMA/cxgb4: Add timeouts when waiting for FW responses
  IB/qib: Fix race between qib_error_qp() and receive packet processing
  IB/qib: Limit the number of packets processed per interrupt
  IB/qib: Allow writes to the diag_counters to be able to clear them
  IB/qib: Set cfgctxts to number of CPUs by default
  ...
2010-08-07 17:08:02 -07:00
Linus Torvalds
3cfc2c42c1 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (48 commits)
  Documentation: update broken web addresses.
  fix comment typo "choosed" -> "chosen"
  hostap:hostap_hw.c Fix typo in comment
  Fix spelling contorller -> controller in comments
  Kconfig.debug: FAIL_IO_TIMEOUT: typo Faul -> Fault
  fs/Kconfig: Fix typo Userpace -> Userspace
  Removing dead MACH_U300_BS26
  drivers/infiniband: Remove unnecessary casts of private_data
  fs/ocfs2: Remove unnecessary casts of private_data
  libfc: use ARRAY_SIZE
  scsi: bfa: use ARRAY_SIZE
  drm: i915: use ARRAY_SIZE
  drm: drm_edid: use ARRAY_SIZE
  synclink: use ARRAY_SIZE
  block: cciss: use ARRAY_SIZE
  comment typo fixes: charater => character
  fix comment typos concerning "challenge"
  arm: plat-spear: fix typo in kerneldoc
  reiserfs: typo comment fix
  update email address
  ...
2010-08-04 15:31:02 -07:00
Linus Torvalds
6ba74014c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits)
  phy/marvell: add 88ec048 support
  igb: Program MDICNFG register prior to PHY init
  e1000e: correct MAC-PHY interconnect register offset for 82579
  hso: Add new product ID
  can: Add driver for esd CAN-USB/2 device
  l2tp: fix export of header file for userspace
  can-raw: Fix skb_orphan_try handling
  Revert "net: remove zap_completion_queue"
  net: cleanup inclusion
  phy/marvell: add 88e1121 interface mode support
  u32: negative offset fix
  net: Fix a typo from "dev" to "ndev"
  igb: Use irq_synchronize per vector when using MSI-X
  ixgbevf: fix null pointer dereference due to filter being set for VLAN 0
  e1000e: Fix irq_synchronize in MSI-X case
  e1000e: register pm_qos request on hardware activation
  ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
  net: Add getsockopt support for TCP thin-streams
  cxgb4: update driver version
  cxgb4: add new PCI IDs
  ...

Manually fix up conflicts in:
 - drivers/net/e1000e/netdev.c: due to pm_qos registration
   infrastructure changes
 - drivers/net/phy/marvell.c: conflict between adding 88ec048 support
   and cleaning up the IDs
 - drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req
   conflict (registration change vs marking it static)
2010-08-04 11:47:58 -07:00
Steve Wise
a5f4a07820 RDMA/cxgb4: Add timeouts when waiting for FW responses
Don't hang a host thread if the FW stops responding.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-04 09:54:42 -07:00
Jiri Kosina
d790d4d583 Merge branch 'master' into for-next 2010-08-04 15:14:38 +02:00
Steve Wise
ca5a22028d RDMA/cxgb4: Set/reset the EP timer inside EP lock
Endpoint timer manipulation needs to be done inside the lock.  Otherwise
we can get into a situation where a timer is stopped before it is started,
which hits the WARN_ON() in stop_ep_timer().

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-02 21:06:17 -07:00
Steve Wise
d4f1a5c6ef RDMA/cxgb4: Use correct control txq
There is only one control txq per tx channel.  So use the port number
as the queue index when sending.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-02 21:06:12 -07:00
Steve Wise
73d6fcad2a RDMA/cxgb4: Fix race in fini path
There exists a race condition where the app disconnects, which
initiates an orderly close (via rdma_fini()), concurrently with an
ingress abort condition, which initiates an abortive close operation.
Since rdma_fini() must be called without IRQs disabled, the fini can
be called after the QP has been transitioned to ERROR.  This is ok,
but we need to protect against qp->ep getting NULLed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-02 21:06:06 -07:00
Steve Wise
d37ac31ddc RDMA/cxgb4: Support variable sized work requests
T4 EQ entries are in multiples of 64 bytes.  Currently the RDMA SQ and
RQ use fixed sized entries composed of 4 EQ entries for the SQ and 2
EQ entries for the RQ.  For optimial latency with small IO, we need to
change this so the HW only needs to DMA the EQ entries actually used
by a given work request.

Implementation:

- add wq_pidx counter to track where we are in the EQ.  cidx/pidx are
  used for the sw sq/rq tracking and flow control.

- the variable part of work requests is the SGL.  Add new functions to
  build the SGL and/or immediate data directly in the EQ memory
  wrapping when needed.

- adjust the min burst size for the EQ contexts to 64B.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 11:16:20 -07:00
David Rientjes
d3c814e8b2 RDMA/cxgb4: Remove dependency on __GFP_NOFAIL
The alloc_skb() in various allocations are failable, so remove
__GFP_NOFAIL from their masks.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 10:55:05 -07:00
Steve Wise
ba6d39256b RDMA/cxgb4: Add module option to tweak delayed ack
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-21 10:53:52 -07:00
Roland Dreier
85963e4cbc RDMA/cxgb4: Remove unneeded NULL check
The rest of the code seems to assume that ep->com.cm_id can't be NULL,
so remove an unneeded test.

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-19 13:13:09 -07:00
Dan Carpenter
c1d7356c85 RDMA/cxgb4: Remove unneeded assignment
We don't need to assign rpl here, we do that later on.

Signed-off-by: Dan Carpenter <error27@gmail.com>

[ Indeed this assignment makes no sense, since skb is set to NULL a
  couple of lines before.  - Roland ]

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-19 13:09:40 -07:00
Steve Wise
2c5934bfc5 RDMA/cxgb4: Derive smac_idx from port viid
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:05:16 -07:00
Steve Wise
1973e8b8ed RDMA/cxgb4: Avoid false GTS CIDX_INC overflows
The T4 IQ hw design assumes CIDX_INC credits will be returned on a
regular basis and always before the CIDX counter crosses over the PIDX
counter.  For RDMA CQs, however, returning CIDX_INC credits is only
needed and desired when and if the CQ is armed for notification.  This
can lead to a GTS write returning credits that causes the HW to reject
the credit update because it causes CIDX to pass PIDX.  Once this
happens, the CIDX/PIDX counters get out of whack and an application
can miss a notification and get stuck blocked awaiting a notification.

To avoid this, we allocate the HW IQ 2x times the requested size.
This seems to avoid the false overflow failures.  If we see more
issues with this, then we'll have to add code in the poll path to
return credits periodically like when the amount reaches 1/2 the queue
depth).  I would like to avoid this as it adds a PCI write transaction
for applications that never arm the CQ (like most MPIs).

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:04:04 -07:00
Steve Wise
b21ef16a8b RDMA/cxgb4: Don't call abort_connection() for active connect failures
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:02:54 -07:00
FUJITA Tomonori
f38926aa1d RDMA/cxgb4: Use the DMA state API instead of the pci equivalents
This replace the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will be obsolete.

No functional change.

For further information about the background:

http://marc.info/?l=linux-netdev&m=127037540020276&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:01:42 -07:00
Jiri Kosina
f1bbbb6912 Merge branch 'master' into for-next 2010-06-16 18:08:13 +02:00
Uwe Kleine-König
732bee7af3 fix typos concerning "hierarchy"
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-06-16 18:03:14 +02:00
Changli Gao
d8d1f30b95 net-next: remove useless union keyword
remove useless union keyword in rtable, rt6_info and dn_route.

Since there is only one member in a union, the union keyword isn't useful.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-10 23:31:35 -07:00
Roland Dreier
acdc30b56a Merge branches 'cxgb4', 'misc', 'mlx4', 'nes' and 'qib' into for-next 2010-05-25 09:54:03 -07:00
Steve Wise
30a6a62fc3 RDMA/cxgb4: Only insert sq qid in lookup table
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:05 -07:00
Steve Wise
2f1fb507ee RDMA/cxgb4: Support IB_WR_READ_WITH_INV opcode
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:04 -07:00
Steve Wise
4ab1eb9c8d RDMA/cxgb4: Set fence flag for inv-local-stag work requests
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:04 -07:00
Steve Wise
f64b88433c RDMA/cxgb4: Update some HW limits
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:03 -07:00
Steve Wise
25737bd4ca RDMA/cxgb4: Don't limit fastreg page list depth
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:03 -07:00
Steve Wise
841dba9a5a RDMA/cxgb4: Return proper errors in fastreg mr/pbl allocation
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:02 -07:00
Steve Wise
7ec45b9234 RDMA/cxgb4: Fix overflow bug in CQ arm
- wrap cq->cqidx_inc based on cq size.
- optimize t4_arm_cq logic.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:01 -07:00
Steve Wise
84172dee05 RDMA/cxgb4: Optimize CQ overflow detection
1) save the timestamp flit in the cq when we consume a CQE.

2) always compare the saved flit with the previous entry flit when
   reading the next CQE entry.  If the flits don't compare, then we
   have overflowed.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:01 -07:00
Steve Wise
895cf5f3d6 RDMA/cxgb4: CQ size must be IQ size - 2
We need 1 extra entry for the status page and 1 to always have 1 free
entry to detect when the queue is full.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:08:00 -07:00
Steve Wise
1c01c53883 RDMA/cxgb4: Register RDMA provider based on LLD state_change events
The LLD now supports proper UP state change events, so move the RDMA
provider registration to UP path.

This fixes a crash when loading iw_cxgb4 _after_ the NFS/RDMA
transport is up and running.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:07:59 -07:00
Steve Wise
fd388ce677 RDMA/cxgb4: Detach from the LLD after unregistering RDMA device
In the RDMA core unregister path, kernel users will be calling down
into the T4 provider to release resources.  So we cannot detach from
the LLD until this process completes.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-24 21:07:59 -07:00
Ralph Campbell
9a6edb60ec IB/core: Allow device-specific per-port sysfs files
Add a new parameter to ib_register_device() so that low-level device
drivers can pass in a pointer to a callback function that will be
called for each port that is registered in sysfs.  This allows
low-level device drivers to create files in

    /sys/class/infiniband/<hca>/ports/<N>/

without having to poke through the internals of the RDMA sysfs handling.

There is no need for an unregister function since the kobject
reference will go to zero when ib_unregister_device() is called.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-21 10:34:44 -07:00
Roland Dreier
be4c9bad9d MAINTAINERS: Add cxgb4 and iw_cxgb4 entries
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-05 14:45:40 -07:00
Steve Wise
cfdda9d764 RDMA/cxgb4: Add driver for Chelsio T4 RNIC
Add an RDMA/iWARP driver for Chelsio T4 Ethernet adapters.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21 15:30:06 -07:00