Commit Graph

1311 Commits

Author SHA1 Message Date
Bart Van Assche
697a35d709 IB/srpt: Remove struct srpt_node_acl
Since struct srpt_node_acl is identical to struct se_node_acl,
remove the definition of the former structure. This patch does
not change any functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29 17:12:34 -05:00
Bart Van Assche
9d2aa2b4fd IB/srpt: Add parentheses around sizeof argument
Although sizeof is an operator and hence in many cases parentheses can
be left out, the recommended kernel coding style is to surround the
sizeof argument with parentheses. This patch does not change any
functionality. It has been generated by running the following shell
command:

sed -i 's/sizeof \([^ );,]*\)/sizeof(\1)/g' drivers/infiniband/ulp/srpt/*.[ch]

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29 17:12:34 -05:00
Bart Van Assche
51093254bf IB/srpt: Simplify srpt_handle_tsk_mgmt()
Let the target core check task existence instead of the SRP target
driver. Additionally, let the target core check the validity of the
task management request instead of the ib_srpt driver.

This patch fixes the following kernel crash:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
IP: [<ffffffffa0565f37>] srpt_handle_new_iu+0x6d7/0x790 [ib_srpt]
Oops: 0002 [#1] SMP
Call Trace:
 [<ffffffffa05660ce>] srpt_process_completion+0xde/0x570 [ib_srpt]
 [<ffffffffa056669f>] srpt_compl_thread+0x13f/0x160 [ib_srpt]
 [<ffffffff8109726f>] kthread+0xcf/0xe0
 [<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Fixes: 3e4f574857 ("ib_srpt: Convert TMR path to target_submit_tmr")
Tested-by: Alex Estrin <alex.estrin@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29 17:12:34 -05:00
Steve Wise
4c8ba94d17 IB/iser: Use ib_drain_sq()
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29 17:10:27 -05:00
Steve Wise
561392d42d IB/srp: Use ib_drain_rq()
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29 17:10:27 -05:00
Alex Estrin
08bc327629 IB/ipoib: fix for rare multicast join race condition
A narrow window for race condition still exist between
multicast join thread and *dev_flush workers.
A kernel crash caused by prolong erratic link state changes
was observed (most likely a faulty cabling):

[167275.656270] BUG: unable to handle kernel NULL pointer dereference at
0000000000000020
[167275.665973] IP: [<ffffffffa05f8f2e>] ipoib_mcast_join+0xae/0x1d0 [ib_ipoib]
[167275.674443] PGD 0
[167275.677373] Oops: 0000 [#1] SMP
...
[167275.977530] Call Trace:
[167275.982225]  [<ffffffffa05f92f0>] ? ipoib_mcast_free+0x200/0x200 [ib_ipoib]
[167275.992024]  [<ffffffffa05fa1b7>] ipoib_mcast_join_task+0x2a7/0x490
[ib_ipoib]
[167276.002149]  [<ffffffff8109d5fb>] process_one_work+0x17b/0x470
[167276.010754]  [<ffffffff8109e3cb>] worker_thread+0x11b/0x400
[167276.019088]  [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[167276.027737]  [<ffffffff810a5aef>] kthread+0xcf/0xe0
Here was a hit spot:
ipoib_mcast_join() {
..............
      rec.qkey      = priv->broadcast->mcmember.qkey;
                                       ^^^^^^^
.....
 }
Proposed patch should prevent multicast join task to continue
if link state change is detected.

Signed-off-by: Alex Estrin <alex.estrin@intel.com>

Changes from v4:
- as suggested by Doug Ledford, optimized spinlock usage,
i.e. ipoib_mcast_join() is called with lock held.
Changes from v3:
- sync with priv->lock before flag check.
Chages from v2:
- Move check for OPER_UP flag state to mcast_join() to
ensure no event worker is in progress.
- minor style fixes.
Changes from v1:
- No need to lock again if error detected.
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-12 14:53:22 -05:00
Carol L Soto
bb6a777369 IB/IPoIB: Do not set skb truesize since using one linearskb
We are seeing this warning: at net/core/skbuff.c:4174
and before commit a44878d100 ("IB/ipoib: Use one linear skb in RX flow")
skb truesize was not being set when ipoib was using just one skb.
Removing this line avoids the warning when running tcp tests like iperf.

Fixes: a44878d100 ("IB/ipoib: Use one linear skb in RX flow")
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-04 07:08:12 -05:00
Linus Torvalds
048ccca8c1 Initial roundup of 4.5 merge window patches
- Remove usage of ib_query_device and instead store attributes in
   ib_device struct
 - Move iopoll out of block and into lib, rename to irqpoll, and use
   in several places in the rdma stack as our new completion queue
   polling library mechanism.  Update the other block drivers that
   already used iopoll to use the new mechanism too.
 - Replace the per-entry GID table locks with a single GID table lock
 - IPoIB multicast cleanup
 - Cleanups to the IB MR facility
 - Add support for 64bit extended IB counters
 - Fix for netlink oops while parsing RDMA nl messages
 - RoCEv2 support for the core IB code
 - mlx4 RoCEv2 support
 - mlx5 RoCEv2 support
 - Cross Channel support for mlx5
 - Timestamp support for mlx5
 - Atomic support for mlx5
 - Raw QP support for mlx5
 - MAINTAINERS update for mlx4/mlx5
 - Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates
 - Add support for remote invalidate to the iSER driver (pushed through the
   RDMA tree due to dependencies, acknowledged by nab)
 - Update to NFSoRDMA (pushed through the RDMA tree due to dependencies,
   acknowledged by Bruce)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWoSygAAoJELgmozMOVy/dDjsP/2vbTda2MvQfkfkGEZBQdJSg
 095RN0gQgCJdg78lAl8yuaK8r4VN/7uefpDtFdudH1I/Pei7X0wxN9R1UzFNG4KR
 AD53lz92IVPs15328SbPR2kvNWISR9aBFQo3rlElq3Grqlp0EMn2Ou1vtu87rekF
 aMllxr8Nl0uZhP+eWusOsYpJUUtwirLgRnrAyfqo2UxZh/TMIroT0TCx1KXjVcAg
 dhDARiZAdu3OgSc6OsWqmH+DELEq6dFVA5F+DDBGAb8bFZqlJc7cuMHWInwNsNXT
 so4bnEQ835alTbsdYtqs5DUNS8heJTAJP4Uz0ehkTh/uNCcvnKeUTw1c2P/lXI1k
 7s33gMM+0FXj0swMBw0kKwAF2d9Hhus9UAN7NwjBuOyHcjGRd5q7SAnfWkvKx000
 s9jVW19slb2I38gB58nhjOh8s+vXUArgxnV1+kTia1+bJSR5swvVoWRicRXdF0vh
 TvLX/BjbSIU73g1TnnLNYoBTV3ybFKQ6bVdQW7fzSTDs54dsI1vvdHXi3bYZCpnL
 HVwQTZRfEzkvb0AdKbcvf8p/TlaAHem3ODqtO1eHvO4if1QJBSn+SptTEeJVYYdK
 n4B3l/dMoBH4JXJUmEHB9jwAvYOpv/YLAFIvdL7NFwbqGNsC3nfXFcmkVORB1W3B
 KEMcM2we4bz+uyKMjEAD
 =5oO7
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma updates from Doug Ledford:
 "Initial roundup of 4.5 merge window patches

   - Remove usage of ib_query_device and instead store attributes in
     ib_device struct

   - Move iopoll out of block and into lib, rename to irqpoll, and use
     in several places in the rdma stack as our new completion queue
     polling library mechanism.  Update the other block drivers that
     already used iopoll to use the new mechanism too.

   - Replace the per-entry GID table locks with a single GID table lock

   - IPoIB multicast cleanup

   - Cleanups to the IB MR facility

   - Add support for 64bit extended IB counters

   - Fix for netlink oops while parsing RDMA nl messages

   - RoCEv2 support for the core IB code

   - mlx4 RoCEv2 support

   - mlx5 RoCEv2 support

   - Cross Channel support for mlx5

   - Timestamp support for mlx5

   - Atomic support for mlx5

   - Raw QP support for mlx5

   - MAINTAINERS update for mlx4/mlx5

   - Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates

   - Add support for remote invalidate to the iSER driver (pushed
     through the RDMA tree due to dependencies, acknowledged by nab)

   - Update to NFSoRDMA (pushed through the RDMA tree due to
     dependencies, acknowledged by Bruce)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (169 commits)
  IB/mlx5: Unify CQ create flags check
  IB/mlx5: Expose Raw Packet QP to user space consumers
  {IB, net}/mlx5: Move the modify QP operation table to mlx5_ib
  IB/mlx5: Support setting Ethernet priority for Raw Packet QPs
  IB/mlx5: Add Raw Packet QP query functionality
  IB/mlx5: Add create and destroy functionality for Raw Packet QP
  IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types
  IB/mlx5: Allocate a Transport Domain for each ucontext
  net/mlx5_core: Warn on unsupported events of QP/RQ/SQ
  net/mlx5_core: Add RQ and SQ event handling
  net/mlx5_core: Export transport objects
  IB/mlx5: Expose CQE version to user-space
  IB/mlx5: Add CQE version 1 support to user QPs and SRQs
  IB/mlx5: Fix data validation in mlx5_ib_alloc_ucontext
  IB/sa: Fix netlink local service GFP crash
  IB/srpt: Remove redundant wc array
  IB/qib: Improve ipoib UD performance
  IB/mlx4: Advertise RoCE v2 support
  IB/mlx4: Create and use another QP1 for RoCEv2
  IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
  ...
2016-01-23 18:45:06 -08:00
Linus Torvalds
71e4634e00 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "The highlights this round include:

   - Introduce configfs support for unlocked configfs_depend_item()
     (krzysztof + andrezej)
   - Conversion of usb-gadget target driver to new function registration
     interface (andrzej + sebastian)
   - Enable qla2xxx FC target mode support for Extended Logins (himansu +
     giridhar)
   - Enable qla2xxx FC target mode support for Exchange Offload (himansu +
     giridhar)
   - Add qla2xxx FC target mode irq affinity notification + selective
     command queuing.  (quinn + himanshu)
   - Fix iscsi-target deadlock in se_node_acl configfs deletion (sagi +
     nab)
   - Convert se_node_acl configfs deletion + se_node_acl->queue_depth to
     proper se_session->sess_kref + target_get_session() usage.  (hch +
     sagi + nab)
   - Fix long-standing race between se_node_acl->acl_kref get and
     get_initiator_node_acl() lookup.  (hch + nab)
   - Fix target/user block-size handling, and make sure netlink reaches
     all network namespaces (sheng + andy)

  Note there is an outstanding bug-fix series for remote I_T nexus port
  TMR LUN_RESET has been posted and still being tested, and will likely
  become post -rc1 material at this point"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (56 commits)
  scsi: qla2xxxx: avoid type mismatch in comparison
  target/user: Make sure netlink would reach all network namespaces
  target: Obtain se_node_acl->acl_kref during get_initiator_node_acl
  target: Convert ACL change queue_depth se_session reference usage
  iscsi-target: Fix potential dead-lock during node acl delete
  ib_srpt: Convert acl lookup to modern get_initiator_node_acl usage
  tcm_fc: Convert acl lookup to modern get_initiator_node_acl usage
  tcm_fc: Wait for command completion before freeing a session
  target: Fix a memory leak in target_dev_lba_map_store()
  target: Support aborting tasks with a 64-bit tag
  usb/gadget: Remove set-but-not-used variables
  target: Remove an unused variable
  target: Fix indentation in target_core_configfs.c
  target/user: Allow user to set block size before enabling device
  iser-target: Fix non negative ERR_PTR isert_device_get usage
  target/fcoe: Add tag support to tcm_fc
  qla2xxx: Check for online flag instead of active reset when transmitting responses
  qla2xxx: Set all queues to 4k
  qla2xxx: Disable ZIO at start time.
  qla2xxx: Move atioq to a different lock to reduce lock contention
  ...
2016-01-20 17:20:53 -08:00
Sagi Grimberg
f9a6ed62c4 IB/srpt: Remove redundant wc array
No usage after the conversion to the new CQ API.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 16:40:31 -05:00
Bart Van Assche
19f57298f0 IB/srpt: Fix the RDMA completion handlers
Avoid that the following kernel crash is triggered when processing
an RDMA completion:

BUG: unable to handle kernel paging request at 0000000100000198
IP: [<ffffffff810a4ea2>] __lock_acquire+0xa2/0x560
Call Trace:
 [<ffffffff810a53c2>] lock_acquire+0x62/0x80
 [<ffffffff8151bd33>] _raw_spin_lock_irqsave+0x43/0x60
 [<ffffffffa04fd437>] srpt_rdma_read_done+0x57/0x120 [ib_srpt]
 [<ffffffffa0144dd3>] __ib_process_cq+0x43/0xc0 [ib_core]
 [<ffffffffa0145115>] ib_cq_poll_work+0x25/0x70 [ib_core]
 [<ffffffff8107184d>] process_one_work+0x1bd/0x460
 [<ffffffff81073148>] worker_thread+0x118/0x420
 [<ffffffff81078454>] kthread+0xe4/0x100
 [<ffffffff8151cbbf>] ret_from_fork+0x3f/0x70

Fixes: commit 59fae4deaa ("IB/srpt: chain RDMA READ/WRITE requests").
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:26:55 -05:00
Christoph Hellwig
ca281265c0 IB/mad: pass ib_mad_send_buf explicitly to the recv_handler
Stop abusing wr_id and just pass the parameter explicitly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 15:25:36 -05:00
Erez Shitrit
50be28de6f IB/IPoIB: Fix kernel panic on multicast flow
ipoib_mcast_restart_task calls ipoib_mcast_remove_list with the
parameter mcast->dev. That mcast is a temporary (used as an iterator)
variable that may be uninitialized.
There is no need to send the variable dev to the function, as each mcast
has its dev as a member in the mcast struct.

This causes the next panic:
RIP: 0010: ipoib_mcast_leave+0x6d/0xf0 [ib_ipoib]
RSP: 0018: EFLAGS: 00010246
RAX: f0201 RBX: 24e00 RCX: 00000
....
....
Stack:
Call Trace:
	ipoib_mcast_remove_list+0x3a/0x70 [ib_ipoib]
	ipoib_mcast_restart_task+0x3bb/0x520 [ib_ipoib]
	process_one_work+0x164/0x470
	worker_thread+0x11d/0x420
	...

Fixes: 5a0e81f6f4 ('IB/IPoIB: factor out common multicast list removal code')
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reported-by: Doron Tsur <doront@mellanox.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19 12:59:54 -05:00
Nicholas Bellinger
f246c94154 ib_srpt: Convert acl lookup to modern get_initiator_node_acl usage
This patch does a simple conversion of ib_srpt code to use
proper modern core_tpg_get_initiator_node_acl() lookup using
se_node_acl->acl_kref, and drops the legacy internal list
usage from srpt_lookup_acl().

This involves doing transport_init_session() earlier, and
making sure transport_free_session() is called during
a se_node_acl lookup failure to drop the last ->acl_kref.

Also, it adds a minor backwards-compat hack to avoid the
potential for user-space wrt node-acl WWPN formatting by
simply stripping off '0x' prefix from ch->sess_name, and
retrying once if core_tpg_get_initiator_node_acl() fails.

Finally, go ahead and drop port_acl_list port_acl_lock
since they are no longer used.

Cc: Vu Pham <vu@mellanox.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-01-12 23:44:32 -08:00
Nicholas Bellinger
373a4cd737 iser-target: Fix non negative ERR_PTR isert_device_get usage
As reported by Dan, isert_create_device_ib_res() failure within
isert_device_get() can potentially return a postive value,
resulting in ERR_PTR() triggering a NULL pointer dereference.

Caught by the static checker:

     drivers/infiniband/ulp/isert/ib_isert.c:423 isert_device_get()
     error: passing non negative 1 to ERR_PTR

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-01-07 13:57:51 -08:00
Jenny Derzhavetz
59caaed7a7 IB/iser: Support the remote invalidation exception
Declare that we support remote invalidation in case we are:
1. using fastreg method
2. always registering memory

Detect the invalidated rkey from the work completion info so we
won't invalidate it locally. The spec mandates that we must not rely
on the target remote invalidate our rkey so we must check it upon
a receive (scsi response) completion.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-26 19:27:10 -05:00
Sagi Grimberg
e26d2d21ff IB/iser: Change the increment rkey flow logic
When we enable remote invalidate support we won't want to perform
local invalidates at the same time we do today, but we still need
to get new rkeys.  So, decouple the rkey update from the local
invalidate and tie it to memory reg instead.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24 00:17:36 -05:00
Jenny Derzhavetz
422bd0acb0 IB/isert: Support the remote invalidation exception
We'll use remote invalidate, according to negotiation result
during connection establishment. If the initiator declared that
it supports the remote invalidate exception and the local HCA
supports IB_DEVICE_MEM_MGT_EXTENSIONS then the target will
use IB_WR_SEND_WITH_INV with the correct rkey for the response.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24 00:17:36 -05:00
Jenny Derzhavetz
13bce4821f IB/isert: Declare correct flags when accepting a connection
iser target does not support zero based virtual addresses and
send with invalidate, so it should declare that it doesn't.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24 00:17:35 -05:00
Sagi Grimberg
c64941533f IB/isert: Remove unused file iser_proto.h
We don't need iser_proto.h anymore, remove it and
move (non-protocol) declarations to ib_isert.h

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24 00:17:35 -05:00
Sagi Grimberg
d3cf81f9c8 IB/iser,isert: Create and use new shared header
The iser RDMA_CM negotiation protocol is shared by
the initiator and the target, so have a shared header
for the defines and structure. Move relevant items from
the initiator and target headers.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24 00:17:35 -05:00
Jenny Derzhavetz
1caa70d8a7 IB/iser: set intuitive values for mr_valid
This parameter is described as "is mr valid indicator".
In other words, it indicates whether memory registration
is valid or not. So intuitive values would be:
mr_valid=True, when memory registration is valid and
mr_valid=False otherwise.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24 00:17:34 -05:00
Jenny Derzhavetz
b5f04b00f7 IB/iser: Don't register memory for all immediate data writes
When all the task data is sent as immediate data, we are
allowed to use the local_dma_lkey as it is not sent to
the wire.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24 00:17:34 -05:00
Sagi Grimberg
bfe066e256 IB/iser: Reuse ib_sg_to_pages
We have in iser iser_sg_to_page_vec which has exactly
the same role as ib_sg_to_pages. Customize the page_vec
to hold a fake MR so we can reuse ib_sg_to_pages.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24 00:17:34 -05:00
Roi Dayan
08ff089b12 IB/iser: Fix module init not cleaning up on error flow
Destroy workqueue on transport register error, also
release kmem cache on workqueue allocation error.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24 00:17:33 -05:00
Julia Lawall
2392a4cdcb IB/iser: constify iser_reg_ops structure
The iser_reg_ops structures are never modified, so declare them as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-24 00:17:33 -05:00
Christoph Lameter
432c55fff4 IB/IPoIB: Move multicast specific code out of ipoib_main.c
Code cleanup to move multicast specific code that checks for
a sendonly join to ipoib_multicast.c. This allows the removal
of the export of __ipoib_mcast_find().

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23 14:28:52 -05:00
Christoph Lameter
5a0e81f6f4 IB/IPoIB: factor out common multicast list removal code
Code cleanup to remove multicast specific code from ipoib_main.c

The removal of a list of multicast groups occurs in three places.
Create a new function ipoib_mcast_remove_list(). Use this new
function in ipoib_main.c too.
That in turn allows the dropping of two functions that were
exported from ipoib_multicast.c for expiration of mc groups.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23 14:28:00 -05:00
Doug Ledford
882f3b3b91 Merge branches '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5
Signed-off-by: Doug Ledford <dledford@redhat.com>

Conflicts:
	drivers/infiniband/ulp/iser/iser_verbs.c
2015-12-22 17:03:15 -05:00
Or Gerlitz
4a061b287b IB/ulps: Avoid calling ib_query_device
Instead, use the cached copy of the attributes present on the device.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-22 14:39:00 -05:00
Doug Ledford
c6333f9f9f Merge branch 'rdma-cq.2' of git://git.infradead.org/users/hch/rdma into 4.5/rdma-cq
Signed-off-by: Doug Ledford <dledford@redhat.com>

Conflicts:
	drivers/infiniband/ulp/srp/ib_srp.c - Conflicts with changes in
	ib_srp.c introduced during 4.4-rc updates
2015-12-15 14:10:44 -05:00
Christoph Hellwig
cfeb91b375 IB/iser: Convert to CQ abstraction
Use the new CQ abstraction to simplify completions in the iSER
initiator.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-12-11 14:10:52 -08:00
Sagi Grimberg
7edc5a999d IB/iser: Use helper for container_of
Nicer this way.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-12-11 14:10:51 -08:00
Sagi Grimberg
0f512b34c6 IB/iser: Use a dedicated descriptor for login
We'll need it later with the new CQ abstraction. also switch
login bufs to void pointers.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-12-11 14:10:50 -08:00
Christoph Hellwig
1dc7b1f10d IB/srp: use the new CQ API
This also moves recv completion handling from hardirq context into
softirq context.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-12-11 14:10:49 -08:00
Christoph Hellwig
59fae4deaa IB/srpt: chain RDMA READ/WRITE requests
Remove struct rdma_iu and instead allocate the struct ib_rdma_wr array
early and fill out directly.  This allows us to chain the WRs, and thus
archives both less lock contention on the HCA workqueue as well as much
simpler error handling.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
2015-12-11 14:10:48 -08:00
Christoph Hellwig
14d3a3b249 IB: add a proper completion queue abstraction
This adds an abstraction that allows ULPs to simply pass a completion
object and completion callback with each submitted WR and let the RDMA
core handle the nitty gritty details of how to handle completion
interrupts and poll the CQ.

In detail there is a new ib_cqe structure which just contains the
completion callback, and which can be used to get at the containing
object using container_of.  It is pointed to by the WR and WC as an
alternative to the wr_id field, similar to how many ULPs already use
the field to store a pointer using casts.

A driver using the new completion callbacks allocates it's CQs using
the new ib_create_cq API, which in addition to the number of CQEs and
the completion vectors also takes a mode on how we poll for CQEs.
Three modes are available: direct for drivers that never take CQ
interrupts and just poll for them, softirq to poll from softirq context
using the to be renamed blk-iopoll infrastructure which takes care of
rearming and budgeting, or a workqueue for consumer who want to be
called from user context.

Thanks a lot to Sagi Grimberg who helped reviewing the API, wrote
the current version of the workqueue code because my two previous
attempts sucked too much and converted the iSER initiator to the new
API.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-12-11 14:10:43 -08:00
Sagi Grimberg
f1a47d37fb iser-target: Remove explicit mlx4 work-around
The driver now exposes sufficient limits so we can
avoid having mlx4 specific work-around.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08 13:01:11 -05:00
Bart Van Assche
57b0be9c0f IB/srp: Fix srp_map_sg_fr()
After dma_map_sg() has been called the return value of that function
must be used as the number of elements in the scatterlist instead of
scsi_sg_count().

Fixes: commit f7f7aab1a5 ("IB/srp: Convert to new registration API")
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: stable <stable@vger.kernel.org> # v4.4+
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07 17:20:11 -05:00
Bart Van Assche
a745f4f410 IB/srp: Fix indirect data buffer rkey endianness
Detected by sparse.

Fixes: commit 330179f2fa ("IB/srp: Register the indirect data buffer descriptor")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: stable <stable@vger.kernel.org> # v4.3+
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07 17:20:11 -05:00
Christoph Hellwig
fc925518aa IB/srp: Initialize dma_length in srp_map_idb
Without this sg_dma_len will return 0 on architectures tha have
the dma_length field.

Fixes: commit f7f7aab1a5 ("IB/srp: Convert to new registration API")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07 17:20:11 -05:00
Sagi Grimberg
09c0c0bea5 IB/srp: Fix possible send queue overflow
When using work request based memory registration (fast_reg)
we must reserve SQ entries for registration and invalidation
in addition to send operations. Each IO consumes 3 SQ entries
(registration, send, invalidation) so we need to allocate 3x
larger send-queue instead of 2x.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07 17:20:11 -05:00
Bart Van Assche
4d59ad2995 IB/srp: Fix a memory leak
If srp_connect_ch() returns a positive value then that is considered
by its caller as a connection failure but this does not result in a
scsi_host_put() call and additionally causes the srp_create_target()
function to return a positive value while it should return a negative
value. Avoid all this confusion and additionally fix a memory leak by
ensuring that srp_connect_ch() always returns a value that is <= 0.
This patch avoids that a rejected login triggers the following memory
leak:

unreferenced object 0xffff88021b24a220 (size 8):
  comm "srp_daemon", pid 56421, jiffies 4295006762 (age 4240.750s)
  hex dump (first 8 bytes):
    68 6f 73 74 35 38 00 a5                          host58..
  backtrace:
    [<ffffffff8151014a>] kmemleak_alloc+0x7a/0xc0
    [<ffffffff81165c1e>] __kmalloc_track_caller+0xfe/0x160
    [<ffffffff81260d2b>] kvasprintf+0x5b/0x90
    [<ffffffff81260e2d>] kvasprintf_const+0x8d/0xb0
    [<ffffffff81254b0c>] kobject_set_name_vargs+0x3c/0xa0
    [<ffffffff81337e3c>] dev_set_name+0x3c/0x40
    [<ffffffff81355757>] scsi_host_alloc+0x327/0x4b0
    [<ffffffffa03edc8e>] srp_create_target+0x4e/0x8a0 [ib_srp]
    [<ffffffff8133778b>] dev_attr_store+0x1b/0x20
    [<ffffffff811f27fa>] sysfs_kf_write+0x4a/0x60
    [<ffffffff811f1e8e>] kernfs_fop_write+0x14e/0x180
    [<ffffffff81176eef>] __vfs_write+0x2f/0xf0
    [<ffffffff811771e4>] vfs_write+0xa4/0x100
    [<ffffffff81177c64>] SyS_write+0x54/0xc0
    [<ffffffff8151b257>] entry_SYSCALL_64_fastpath+0x12/0x6f

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07 17:20:11 -05:00
Arnd Bergmann
2c63d1072a IB/iser: use sector_div instead of do_div
do_div is the wrong way to divide a sector_t, as it is less
efficient when sector_t is 32-bit wide. With the upcoming
do_div optimizations, the kernel starts warning about this:

drivers/infiniband/ulp/iser/iser_verbs.c:1296:4: note: in expansion of macro 'do_div'
include/asm-generic/div64.h:224:22: warning: passing argument 1 of '__div64_32' from incompatible pointer type

This changes the code to use sector_div instead, which always
produces optimal code.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07 16:42:33 -05:00
Linus Torvalds
d83763f4a6 SCSI misc on 20151113
Sorry for the delay in this patch which was mostly caused by getting the
 merger of the mpt2/mpt3sas driver, which was seen as an essential item of
 maintenance work to do before the drivers diverge too much.  Unfortunately,
 this caused a compile failure (detected by linux-next), which then had to be
 fixed up and incubated.  In addition to the mpt2/3sas rework, there are
 updates from pm80xx, lpfc, bnx2fc, hpsa, ipr, aacraid, megaraid_sas, storvsc
 and ufs plus an assortment of changes including some year 2038 issues, a fix
 for a remove before detach issue in some drivers and a couple of other minor
 issues.
 
 Signed-off-by: James Bottomley <JBottomley@Odin.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJWRnpiAAoJEDeqqVYsXL0MJd0IAIkNP1q/6ksMw/lam2UlbdxN
 zFsFOIhGM3xLnBFehbCx7JW3cmybA7WC5jhMjaa1OeJmYHLpwbTzvVwrLs2l/0y1
 /S+G0wwxROZfIKj/1EO3oKbSPCu9N3lStxOnmNXt5PSUEzAXrTqSNTtnMf2ZGh7j
 bQxTWEJe+66GckgGw4ozTXJHWXqM/Zs/FsYjn2h/WzFhFv8utr7zRSzHMVjcqpLG
 TK/Lt03hiTGqiignKrV9W6JzdGTWf2LGIsj/njgR0dxv59cNH8PrHIjZyknSBxgn
 lblinsCjxDHWnP4BSfTi9MQG1lEiZiWO3Y6TKkKJTgxZ9M0Eitspc+cLOiJ1mpg=
 =HvQf
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull final round of SCSI updates from James Bottomley:
 "Sorry for the delay in this patch which was mostly caused by getting
  the merger of the mpt2/mpt3sas driver, which was seen as an essential
  item of maintenance work to do before the drivers diverge too much.
  Unfortunately, this caused a compile failure (detected by linux-next),
  which then had to be fixed up and incubated.

  In addition to the mpt2/3sas rework, there are updates from pm80xx,
  lpfc, bnx2fc, hpsa, ipr, aacraid, megaraid_sas, storvsc and ufs plus
  an assortment of changes including some year 2038 issues, a fix for a
  remove before detach issue in some drivers and a couple of other minor
  issues"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (141 commits)
  mpt3sas: fix inline markers on non inline function declarations
  sd: Clear PS bit before Mode Select.
  ibmvscsi: set max_lun to 32
  ibmvscsi: display default value for max_id, max_lun and max_channel.
  mptfusion: don't allow negative bytes in kbuf_alloc_2_sgl()
  scsi: pmcraid: replace struct timeval with ktime_get_real_seconds()
  mvumi: 64bit value for seconds_since1970
  be2iscsi: Fix bogus WARN_ON length check
  scsi_scan: don't dump trace when scsi_prep_async_scan() is called twice
  mpt3sas: Bump mpt3sas driver version to 09.102.00.00
  mpt3sas: Single driver module which supports both SAS 2.0 & SAS 3.0 HBAs
  mpt2sas, mpt3sas: Update the driver versions
  mpt3sas: setpci reset kernel oops fix
  mpt3sas: Added OEM Gen2 PnP ID branding names
  mpt3sas: Refcount fw_events and fix unsafe list usage
  mpt3sas: Refcount sas_device objects and fix unsafe list usage
  mpt3sas: sysfs attribute to report Backup Rail Monitor Status
  mpt3sas: Ported WarpDrive product SSS6200 support
  mpt3sas: fix for driver fails EEH, recovery from injected pci bus error
  mpt3sas: Manage MSI-X vectors according to HBA device type
  ...
2015-11-13 20:35:54 -08:00
Linus Torvalds
9aa3d651a9 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "This series contains HCH's changes to absorb configfs attribute
  ->show() + ->store() function pointer usage from it's original
  tree-wide consumers, into common configfs code.

  It includes usb-gadget, target w/ drivers, netconsole and ocfs2
  changes to realize the improved simplicity, that now renders the
  original include/target/configfs_macros.h CPP magic for fabric drivers
  and others, unnecessary and obsolete.

  And with common code in place, new configfs attributes can be added
  easier than ever before.

  Note, there are further improvements in-flight from other folks for
  v4.5 code in configfs land, plus number of target fixes for post -rc1
  code"

In the meantime, a new user of the now-removed old configfs API came in
through the char/misc tree in commit 7bd1d4093c ("stm class: Introduce
an abstraction for System Trace Module devices").

This merge resolution comes from Alexander Shishkin, who updated his stm
class tracing abstraction to account for the removal of the old
show_attribute and store_attribute methods in commit 517982229f
("configfs: remove old API") from this pull.  As Alexander says about
that patch:

 "There's no need to keep an extra wrapper structure per item and the
  awkward show_attribute/store_attribute item ops are no longer needed.

  This patch converts policy code to the new api, all the while making
  the code quite a bit smaller and easier on the eyes.

  Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>"

That patch was folded into the merge so that the tree should be fully
bisectable.

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (23 commits)
  configfs: remove old API
  ocfs2/cluster: use per-attribute show and store methods
  ocfs2/cluster: move locking into attribute store methods
  netconsole: use per-attribute show and store methods
  target: use per-attribute show and store methods
  spear13xx_pcie_gadget: use per-attribute show and store methods
  dlm: use per-attribute show and store methods
  usb-gadget/f_serial: use per-attribute show and store methods
  usb-gadget/f_phonet: use per-attribute show and store methods
  usb-gadget/f_obex: use per-attribute show and store methods
  usb-gadget/f_uac2: use per-attribute show and store methods
  usb-gadget/f_uac1: use per-attribute show and store methods
  usb-gadget/f_mass_storage: use per-attribute show and store methods
  usb-gadget/f_sourcesink: use per-attribute show and store methods
  usb-gadget/f_printer: use per-attribute show and store methods
  usb-gadget/f_midi: use per-attribute show and store methods
  usb-gadget/f_loopback: use per-attribute show and store methods
  usb-gadget/ether: use per-attribute show and store methods
  usb-gadget/f_acm: use per-attribute show and store methods
  usb-gadget/f_hid: use per-attribute show and store methods
  ...
2015-11-13 20:04:17 -08:00
Christoph Hellwig
64d513ac31 scsi: use host wide tags by default
This patch changes the !blk-mq path to the same defaults as the blk-mq
I/O path by always enabling block tagging, and always using host wide
tags.  We've had blk-mq available for a few releases so bugs with
this mode should have been ironed out, and this ensures we get better
coverage of over tagging setup over different configs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-11-09 17:11:57 -08:00
Sagi Grimberg
9a21be531c IB/srp: Dont allocate a page vector when using fast_reg
The new fast registration API does not reuqire a page vector
so we can't avoid allocating it.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 22:27:19 -04:00
Sagi Grimberg
51c2b8e2ac IB/srp: Remove srp_finish_mapping
No callers left, remove it.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 22:27:19 -04:00
Sagi Grimberg
f7f7aab1a5 IB/srp: Convert to new registration API
Instead of constructing a page list, call ib_map_mr_sg
and post a new ib_reg_wr. srp_map_finish_fr now returns
the number of sg elements registered.

Remove srp_finish_mapping since no one is calling it.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 22:27:19 -04:00
Sagi Grimberg
26630e8a09 IB/srp: Split srp_map_sg
This is a preparation patch for the new registration API
conversion. It splits srp_map_sg per registration strategy
(srp_map_sg[fmr|fr|dma]. On its own it adds some code duplication,
but it makes the API switch easier to comprehend.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 22:27:18 -04:00
Sagi Grimberg
16c2d702f2 iser-target: Port to new memory registration API
Remove fastreg page list allocation as the page vector
is now private to the provider. Instead of constructing
the page list and fast_req work request, call ib_map_mr_sg
and construct ib_reg_wr.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 22:27:18 -04:00
Sagi Grimberg
3940588500 IB/iser: Port to new fast registration API
Remove fastreg page list allocation as the page vector
is now private to the provider. Instead of constructing
the page list and fast_req work request, call ib_map_mr_sg
and construct ib_reg_wr.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 22:27:18 -04:00
Doug Ledford
63e8790d39 Merge branch 'wr-cleanup' into k.o/for-4.4 2015-10-28 22:23:34 -04:00
Doug Ledford
eb14ab3ba1 Merge branch 'wr-cleanup' of git://git.infradead.org/users/hch/rdma into wr-cleanup
Signed-off-by: Doug Ledford <dledford@redhat.com>

Conflicts:
	drivers/infiniband/ulp/isert/ib_isert.c - Commit 4366b19ca5
	(iser-target: Change the recv buffers posting logic) changed the
	logic in isert_put_datain() and had to be hand merged
2015-10-28 22:21:09 -04:00
Guy Shapiro
fa20105e09 IB/cma: Add support for network namespaces
Add support for network namespaces in the ib_cma module. This is
accomplished by:

1. Adding network namespace parameter for rdma_create_id. This parameter is
   used to populate the network namespace field in rdma_id_private.
   rdma_create_id keeps a reference on the network namespace.
2. Using the network namespace from the rdma_id instead of init_net inside
   of ib_cma, when listening on an ID and when looking for an ID for an
   incoming request.
3. Decrementing the reference count for the appropriate network namespace
   when calling rdma_destroy_id.

In order to preserve the current behavior init_net is passed when calling
from other modules.

Signed-off-by: Guy Shapiro <guysh@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Yotam Kenneth <yotamke@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 12:32:48 -04:00
Sagi Grimberg
630c3183ce IB/iser: Enable SG clustering
iser is perfectly capable supporting SG clustering as it translates
the SG list to a page vector. Enabling SG clustering can dramatically
reduce the number of SG elements, which doesn't make much of a difference
at this point, but with arbitrary SG list support, reducing the
number of SG elements can benefit greatly as as it would reduce
the length of the HW descriptors array.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 12:26:06 -04:00
Sagi Grimberg
dd0107a089 IB/iser: set block queue_virt_boundary
The block layer can reliably guarantee that SG lists won't
contain gaps (page unaligned) if a driver set the queue
virt_boundary.

With this setting the block layer will:
- refuse merges if bios are not aligned to the virtual boundary
- split bios/requests that are not aligned to the virtual boundary
- or, bounce buffer SG_IOs that are not aligned to the virtual boundary

Since iser is working in 4K page size, set the virt_boundary to
4K pages. With this setting, we can now safely remove the bounce
buffering logic in iser.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28 12:26:06 -04:00
Bart Van Assche
6c760b3dd5 iser-target: Remove an unused variable
Detected this by compiling with W=1.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-22 18:37:47 -04:00
Bart Van Assche
78fc3fc4cc IB/iser: Remove an unused variable
Detected this by compiling with W=1.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-22 18:37:14 -04:00
Matan Barak
55ee3ab2e4 IB/core: Add netdev and gid attributes paramteres to cache
Adding an ability to query the IB cache by a netdev and get the
attributes of a GID. These parameters are necessary in order to
successfully resolve the required GID (when the netdevice is known)
and get the Ethernet L2 attributes from a GID.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-21 23:48:17 -04:00
Geliang Tang
68a5e60436 IB/iser: fix a comment typo
Just fix a typo in the code comment.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-21 16:41:54 -04:00
Doug Ledford
fc81a06965 Merge branch 'k.o/for-4.3-v1' into k.o/for-4.4
Pick up the late fixes from the 4.3 cycle so we have them in our
next branch.
2015-10-21 16:40:21 -04:00
Christoph Hellwig
2eafd72939 target: use per-attribute show and store methods
This also allows to remove the target-specific old configfs macros, and
gets rid of the target_core_fabric_configfs.h header which only had one
function declaration left that could be moved to a better place.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Acked-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-10-13 22:17:49 -07:00
Christoph Lameter
0b5c9279e5 IB/ipoib: For sendonly join free the multicast group on leave
When we leave the multicast group on expiration of a neighbor we
do not free the mcast structure. This results in a memory leak
that causes ib_dealloc_pd to fail and print a WARN_ON message
and backtrace.

Fixes: bd99b2e05c (IB/ipoib: Expire sendonly multicast joins)
Signed-off-by: Christoph Lameter <cl@linux.com>
Tested-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-13 16:43:59 -04:00
Christoph Hellwig
e622f2f4ad IB: split struct ib_send_wr
This patch split up struct ib_send_wr so that all non-trivial verbs
use their own structure which embedds struct ib_send_wr.  This dramaticly
shrinks the size of a WR for most common operations:

sizeof(struct ib_send_wr) (old):	96

sizeof(struct ib_send_wr):		48
sizeof(struct ib_rdma_wr):		64
sizeof(struct ib_atomic_wr):		96
sizeof(struct ib_ud_wr):		88
sizeof(struct ib_fast_reg_wr):		88
sizeof(struct ib_bind_mw_wr):		96
sizeof(struct ib_sig_handover_wr):	80

And with Sagi's pending MR rework the fast registration WR will also be
down to a reasonable size:

sizeof(struct ib_fastreg_wr):		64

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [srp, srpt]
Reviewed-by: Chuck Lever <chuck.lever@oracle.com> [sunrpc]
Tested-by: Haggai Eran <haggaie@mellanox.com>
Tested-by: Sagi Grimberg <sagig@mellanox.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
2015-10-08 11:09:10 +01:00
Doug Ledford
2866196f29 IB/ipoib: increase the max mcast backlog queue
When performing sendonly joins, we queue the packets that trigger
the join until the join completes.  This may take on the order of
hundreds of milliseconds.  It is easy to have many more than three
packets come in during that time.  Expand the maximum queue depth
in order to try and prevent dropped packets during the time it
takes to join the multicast group.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-25 22:30:24 -04:00
Doug Ledford
c3852ab0e6 IB/ipoib: Make sendonly multicast joins create the mcast group
Since IPoIB should, as much as possible, emulate how multicast
sends work on Ethernet for regular TCP/IP apps, there should be
no requirement to subscribe to a multicast group before your
sends are properly sent.  However, due to the difference in how
multicast is handled on InfiniBand, we must join the appropriate
multicast group before we can send to it.  Previously we tried
not to trigger the auto-create feature of the subnet manager when
doing this because we didn't have tracking of these sendonly
groups and the auto-creation might never get undone.  The previous
patch added timing to these sendonly joins and allows us to
leave them after a reasonable idle expiration time.  So supply
all of the information needed to auto-create group.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-25 14:46:58 -04:00
Christoph Lameter
bd99b2e05c IB/ipoib: Expire sendonly multicast joins
On neighbor expiration, check to see if the neighbor was actually a
sendonly multicast join, and if so, leave the multicast group as we
expire the neighbor.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-25 14:43:19 -04:00
Sagi Grimberg
3cffd93017 IB/iser: Add module parameter for always register memory
This module parameter forces memory registration even for
a continuous memory region. It is true by default as sending
an all-physical rkey with remote permissions might be insecure.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-25 10:46:51 -04:00
Jenny Derzhavetz
9fd60088ff iser-target: Skip data copy if all the command data comes as immediate
Given that supporting zcopy immediate data for all IOs requires
iser driver to use its own buffer allocations, we settle with
avoiding data copy for IOs with data length of up to 8K (which
is more latency sensitive anyway).

This trims IO write latency by up to 3us and increase IOPs
by up to 40% by saving CPU time doing sg_copy_from_buffer
(8K IO size is the obvious winner here).

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-09-15 15:47:31 -07:00
Jenny Derzhavetz
4366b19ca5 iser-target: Change the recv buffers posting logic
iser target batches post recv operations to avoid
the overhead of acquiring the recv queue lock and
posting a HW doorbell for each command.

We change it to be per command in order to support
zcopy immediate data for IOs that fits in the 8K
transfer boundary (in the next patch).

(Fix minor patch fuzz due to ib_mr removal - nab)

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-09-15 15:47:29 -07:00
Jenny Derzhavetz
bd3792205a iser-target: Fix pending connections handling in target stack shutdown sequnce
Instead of handing a connection to the iscsi stack
for processing right after accepting (rdma_accept) we only hand
the connection to the iscsi core after we reached to a connected
state (ESTABLISHED CM event). This will prevent two error scenrios:

1. race between rdma connection teardown and iscsi login sequence
   reported by Nic in: (ce9a9fc20a "iser-target: Fix REJECT CM event
   use-after-free OOPs")

2. target stack shutdown sequence race with constant login attempts by
   multiple initiators.

We address this by maintaining two queues at the isert_np level:
- accepted: connections that were accepted but have not reached
  connected state (might get rejected, unreachable or error).
- pending: connections in connected state, but have yet to handed
  to the iscsi core for login processing. iser connections are promoted
  to the pending queue only from the accepted queue.

This way the iscsi core now will only handle functional iser connections
and once we shutdown the target stack, we look for any stales that
got left behind so we can safely release them.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-09-15 15:47:27 -07:00
Jenny Derzhavetz
ed8cb0a437 iser-target: Remove np_ prefix from isert_np members
These are always referenced from np-> so no need
for the prefix.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-09-15 15:47:25 -07:00
Jenny Derzhavetz
f27dfa1f0e iser-target: Remove unused variables
Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-09-15 15:47:23 -07:00
Jenny Derzhavetz
3e03c4b01d iser-target: Put the reference on commands waiting for unsol data
The iscsi target core teardown sequence calls wait_conn for
all active commands to finish gracefully by:
- move the queue-pair to error state
- drain all the completions
- wait for the core to finish handling all session commands

However, when tearing down a session while there are sequenced
commands that are still waiting for unsolicited data outs, we can
block forever as these are missing an extra reference put.

We basically need the equivalent of iscsit_free_queue_reqs_for_conn()
which is called after wait_conn has returned. Address this by an
explicit walk on conn_cmd_list and put the extra reference.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-09-15 15:47:21 -07:00
Jenny Derzhavetz
a4c15cd957 iser-target: remove command with state ISTATE_REMOVE
As documented in iscsit_sequence_cmd:
/*
 * Existing callers for iscsit_sequence_cmd() will silently
 * ignore commands with CMDSN_LOWER_THAN_EXP, so force this
 * return for CMDSN_MAXCMDSN_OVERRUN as well..
 */

We need to silently finish a command when it's in ISTATE_REMOVE.
This fixes an teardown hang we were seeing where a mis-behaved
initiator (triggered by allocation error injections) sent us a
cmdsn which was lower than expected.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-09-15 15:47:19 -07:00
Linus Torvalds
05c78081d2 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "Here are the outstanding target-pending updates for v4.3-rc1.

  Mostly bug-fixes and minor changes this round.  The fallout from the
  big v4.2-rc1 RCU conversion have (thus far) been minimal.

  The highlights this round include:

   - Move sense handling routines into scsi_common code (Sagi)

   - Return ABORTED_COMMAND sense key for PI errors (Sagi)

   - Add tpg_enabled_sendtargets attribute for disabled iscsi-target
     discovery (David)

   - Shrink target struct se_cmd by rearranging fields (Roland)

   - Drop iSCSI use of mutex around max_cmd_sn increment (Roland)

   - Replace iSCSI __kernel_sockaddr_storage with sockaddr_storage (Andy +
     Chris)

   - Honor fabric max_data_sg_nents I/O transfer limit (Arun + Himanshu +
     nab)

   - Fix EXTENDED_COPY >= v4.1 regression OOPsen (Alex + nab)"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (37 commits)
  target: use stringify.h instead of own definition
  target/user: Fix UFLAG_UNKNOWN_OP handling
  target: Remove no-op conditional
  target/user: Remove unused variable
  target: Fix max_cmd_sn increment w/o cmdsn mutex regressions
  target: Attach EXTENDED_COPY local I/O descriptors to xcopy_pt_sess
  target/qla2xxx: Honor max_data_sg_nents I/O transfer limit
  target/iscsi: Replace __kernel_sockaddr_storage with sockaddr_storage
  target/iscsi: Replace conn->login_ip with login_sockaddr
  target/iscsi: Keep local_ip as the actual sockaddr
  target/iscsi: Fix np_ip bracket issue by removing np_ip
  target: Drop iSCSI use of mutex around max_cmd_sn increment
  qla2xxx: Update tcm_qla2xxx module description to 24xx+
  iscsi-target: Add tpg_enabled_sendtargets for disabled discovery
  drivers: target: Drop unlikely before IS_ERR(_OR_NULL)
  target: check DPO/FUA usage for COMPARE AND WRITE
  target: Shrink struct se_cmd by rearranging fields
  target: Remove cmd->se_ordered_id (unused except debug log lines)
  target: add support for START_STOP_UNIT SCSI opcode
  target: improve unsupported opcode message
  ...
2015-09-11 19:00:42 -07:00
Jason Gunthorpe
d1178cbcdc IB/ipoib: Suppress warning for send only join failures
We expect send only joins to fail, it just means there are no listeners
for the group. The correct thing to do is silently drop the packet
at source.

Eg avahi will full join 224.0.0.251 which causes a send only IGMP packet
to 224.0.0.22, and then a warning level kmessage like this:

 ib0: sendonly multicast join failed for ff12:401b:ffff:0000:0000:0000:0000:0016, status -22

If there is no IP router listening to IGMP.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-03 17:11:05 -04:00
Doug Ledford
c3acdc06a9 IB/ipoib: Clean up send-only multicast joins
Even though we don't expect the group to be created by the SM we
sill need to provide all the parameters to force the SM to validate
they are correct.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-03 17:05:58 -04:00
Sagi Grimberg
7fbc67df2c IB/srp: Fix possible protection fault
srp_destroy_qp is designed to indicate we are safe to continue with
freeing the channel resources by modifying the qp error state,
posting a dummy wr on the queue-pair and waiting for it to flush.
This also holds for the channel registration pool as we are unmapping
the memory region when handling a scsi response. Destroying the
channel registration pool before we make sure we processed all the
inflight IO might introduce a use-after-free of the registration pool.

This use-after-free is demonstrated in the stack trace below where
srp is trying to unmap a used FMR after the fmr_pool was already destroyed.

general protection fault: 0000 [#1] SMP
RIP: 0010:[<ffffffff8151121b>]  [<ffffffff8151121b>] _raw_spin_lock_irqsave+0x1b/0x50
Call Trace:
 [<ffffffffa055d88a>] ib_fmr_pool_unmap+0x1a/0xb0 [ib_core]
 [<ffffffffa06c00ed>] srp_unmap_data.isra.28+0x17d/0x250 [ib_srp]
 [<ffffffffa06c01eb>] srp_free_req+0x2b/0x60 [ib_srp]
 [<ffffffffa06c0c94>] srp_recv_completion+0x174/0x580 [ib_srp]
 [<ffffffffa04580fe>] mlx4_eq_int+0x4de/0xe50 [mlx4_core]
 [<ffffffffa0458b00>] mlx4_msi_x_interrupt+0x10/0x20 [mlx4_core]
 [<ffffffff810abc45>] handle_irq_event_percpu+0x35/0x1b0
 [<ffffffff810abdf2>] handle_irq_event+0x32/0x50
 [<ffffffff810ae5cf>] handle_edge_irq+0x6f/0x120
 [<ffffffff8100455a>] handle_irq+0x1a/0x30
 [<ffffffff8151b475>] do_IRQ+0x45/0xb0
 [<ffffffff8151162d>] common_interrupt+0x6d/0x6d
 [<ffffffff813e4d2f>] cpuidle_enter_state+0x4f/0xc0
 [<ffffffff813e4e6c>] cpuidle_idle_call+0xcc/0x210
 [<ffffffff8100b9ea>] arch_cpu_idle+0xa/0x30
 [<ffffffff810ab1e1>] cpu_startup_entry+0xe1/0x270
 [<ffffffff81030b3a>] start_secondary+0x21a/0x2c0

Reported-by: Eliott Kespi <eliottk@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-09-03 15:59:48 -04:00
Jason Gunthorpe
7dd78647a2 IB/core: Make ib_dealloc_pd return void
The majority of callers never check the return value, and even if they
did, they can't do anything about a failure.

All possible failure cases represent a bug in the caller, so just
WARN_ON inside the function instead.

This fixes a few random errors:
 net/rd/iw.c infinite loops while it fails. (racing with EBUSY?)

This also lays the ground work to get rid of error return from the
drivers. Most drivers do not error, the few that do are broken since
it cannot be handled.

Since uverbs can legitimately make use of EBUSY, open code the
check.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:39 -04:00
Bart Van Assche
03f6fb93fd IB/srp: Create an insecure all physical rkey only if needed
The SRP initiator only needs this if the insecure register_always=N
performance optimization is enabled, or if FRWR/FMR is not supported
in the driver.

Do not create an all physical MR unless it is needed to support
either of those modes. Default register_always to true so the out of
the box configuration does not create an insecure all physical MR.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
[bvanassche: reworked and rebased this patch]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:39 -04:00
Bart Van Assche
330179f2fa IB/srp: Register the indirect data buffer descriptor
Instead of always using the global rkey for the indirect data
buffer descriptor, register that descriptor with the HCA if
the kernel module parameter register_always has been set to Y.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:38 -04:00
Bart Van Assche
002f15674c IB/srp: Introduce srp_device.use_fmr
Introduce the variable srp_device.use_fmr. Leave out the dev->has_fr /
dev->has_fmr and ch->fr_pool / ch->fmr_pool checks since these are
redundant. This patch does not change any functionality but makes the
source code easier to read.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:38 -04:00
Bart Van Assche
3ae95da883 IB/srp: Remove use_mr argument from srp_map_sg_entry()
Move the srp_map_desc() call from inside srp_map_sg_entry() to
srp_map_sg() such that the use_mr argument can be removed from
srp_map_sg_entry().

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:38 -04:00
Bart Van Assche
0e0d3a4800 IB/srp: Remove the memory registration backtracking code
Mapping a discontiguous sg-list requires multiple memory regions
and hence can exhaust the memory region pool. The SRP initiator
already handles this by temporarily reducing the queue depth. This
means that it is safe to remove the memory registration backtracking
code. This patch has been tested with direct I/O sizes up to 256 MB.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:37 -04:00
Bart Van Assche
f731ed6293 IB/srp: Add memory descriptor array pointer range checking
Although most paths through which a request is submitted check
block layer parameters like the max_segments limit, these are
not checked when an SG_IO or direct I/O request is submitted.
Hence add a range check for the memory descriptor array pointer.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:37 -04:00
Bart Van Assche
7e85c91970 IB/srp: Use multiple registrations for large memory regions
Instead of using the global rkey for large memory regions, use
multiple registrations. See also the while (dma_len) loop further
down in srp_map_sg_entry().

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:37 -04:00
Bart Van Assche
186fbc6689 IB/srp: Re-enable FMR for non-page aligned buffers
During a discussion in 2011 nobody recalled why FMR was not used for
non-page aligned buffers (see also
http://thread.gmane.org/gmane.linux.drivers.rdma/7149). Re-enable FMR
for such buffers. For the reason why the srp_map_fmr() function needs
to be modified, see also patch "IB/srp: rework mapping engine to use
multiple FMR entries" (commit ID 8f26c9ff9cd0; January 2011).

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:36 -04:00
Jason Gunthorpe
5a783956c2 ib_srpt: Remove ib_get_dma_mr calls
The pd now has a local_dma_lkey member which completely replaces
ib_get_dma_mr, use it instead.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:35 -04:00
Jason Gunthorpe
e6bf5f48d2 IB/srp: Use pd->local_dma_lkey
Replace all leys with  pd->local_dma_lkey. This driver does not support
iWarp, so this is safe.

The insecure use of ib_get_dma_mr is thus isolated to an rkey, and will
have to be fixed separately.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:35 -04:00
Jason Gunthorpe
34efc7dfbd iser-target: Remove ib_get_dma_mr calls
The pd now has a local_dma_lkey member which completely replaces
ib_get_dma_mr, use it instead.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:35 -04:00
Jason Gunthorpe
256b7ad273 IB/iser: Use pd->local_dma_lkey
Replace all leys with  pd->local_dma_lkey. This driver does not support
iWarp, so this is safe.

The insecure use of ib_get_dma_mr is thus isolated to an rkey, and this
looks trivially fixed by forcing the use of registration in a future
patch.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:34 -04:00
Jason Gunthorpe
77b1f99660 IB/ipoib: Remove ib_get_dma_mr calls
The pd now has a local_dma_lkey member which completely replaces
ib_get_dma_mr, use it instead.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:34 -04:00
Sagi Grimberg
7332bed085 IB/iser: Chain all iser transaction send work requests
Chaning of send work requests benefits performance by
reducing the send queue lock contention (acquired in
ib_post_send) and saves us HW doorbells which is posted
only once.

Currently, in normal IO flows iser does not chain the CDB send
work request with the registration work request. Also in PI
flows, signature work requests are not chained as well.

Lets chain those and post only once.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:33 -04:00
Sagi Grimberg
1b16c9894b IB/iser: Add debug prints to the various memory registration methods
Easier to debug when we have the registration details.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:32 -04:00
Sagi Grimberg
df749cdc45 IB/iser: Support up to 8MB data transfer in a single command
iser support up to 512KB data transfer in a single scsi command.
This means that larger IOs will split to different request. While
iser can easily saturate FDR/EDR wires, some arrays are fine tuned
for 1MB (or larger) IO sizes, hence add an option to support larger
transfers (up to 8MB) if the device allows it.

Given that a few target implementations don't support data transfers
of more than 512KB by default and the fact that larger IO sizes require
more resources, we introduce a module parameter to determine the
maximum number of 512B sectors in a single scsi command.
Users that are interested in larger transfers can change this value given
that the target supports larger transfers.

At the moment, iser works in 4K pages granularity, In a later stage
we will get it to work with system page size instead.

IO operations that consists of N pages will need a page vector
of size N+1 in case the first SG element contains an offset. Given
that some devices allocates memory regions in powers of 2, this
means that allocating a region with N+1 pages, will result in
region resources allocation of the next power of 2. Since we don't
want that to happen, in case we are in the limit of IO size supported
and the first SG element has an offset, we align the SG list using a
bounce buffer (which is OK given that this is not likely to happen a lot).

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:32 -04:00
Sagi Grimberg
f8db651da2 IB/iser: Pass registration pool a size parameter
Hard coded for now. This will allow to allocate different
sized MRs depending on the IO size needed (and device
capabilities).

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:32 -04:00
Sagi Grimberg
32467c420b IB/iser: Unify fast memory registration flows
iser_reg_rdma_mem_[fastreg|fmr] share a lot of code, and
logically do the same thing other than the buffer registration
method itself (iser_fast_reg_mr vs. iser_fast_reg_fmr).
The DIF logic is not implemented in the FMR flow as there is no
existing device that supports FMRs and Signature feature.

This patch unifies the flow in a single routine iser_reg_rdma_mem
and just split to fmr/frwr for the buffer registration itself.

Also, for symmetry reasons, unify iser_unreg_rdma_mem (which will
call the relevant device specific unreg routine).

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Adir Lev <adirl@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 18:12:31 -04:00