Commit Graph

722378 Commits

Author SHA1 Message Date
Tatyana Nikolova
ce9ce74145 i40iw: Validate correct IRD/ORD connection parameters
Casting to u16 before validating IRD/ORD connection
parameters could cause recording wrong IRD/ORD values
in the cm_node. Validate the IRD/ORD parameters as
they are passed by the application before recording
them.

Fixes: f27b4746f3 ("i40iw: add connection management code")
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 13:39:48 -07:00
Shiraz Saleem
756f77d216 i40iw: Ignore LLP_DOUBT_REACHABILITY AE
The LLP_DOUBT_REACHABILITY Asynchronous Event (AE) is an early
warning of a connection issue. It is followed by LLP_TOO_MANY_RETRIES
AE, if the retransmit threshold is reached and recovery is not possible
for the connection.

Currently we terminate the connection on receiving the
LLP_DOUBT_REACHABILITY AE. Ignore this AE and
terminate the connection only on LLP_TOO_MANY_RETRIES AE.

This improves the user experience on cable disconnect/reconnect
scenario while running iWARP traffic. On cable disconnect,
the QP traffic is paused and the user has a larger and more
reasonable timeout within which if the cable is reconnected,
traffic can continue.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 13:39:21 -07:00
Shiraz Saleem
df8b13a1b2 i40iw: Fix sequence number for the first partial FPDU
Partial FPDU processing is broken as the sequence number
for the first partial FPDU is wrong due to incorrect
Q2 buffer offset. The offset should be 64 rather than 16.

Fixes: 786c6adb3a ("i40iw: add puda code")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 13:39:20 -07:00
Shiraz Saleem
3020f252c3 i40iw: Selectively teardown QPs on IP addr change event
On IP address change event, all connected QPs are torn down
irrespective of whether IP address is involved in a connection.

Only teardown connections those source or destination address
matches the netdev interface IP address being changed, and if
they are on the same VLAN as the netdev.

Fixes: e5e74b61b1 ("i40iw: Add IP addr handling on netdev events")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 13:39:07 -07:00
Shiraz Saleem
0c5d515546 i40iw: Add notifier for network device events
Register a netdevice notifier for netdev UP/DOWN
notification events and report the appropriate ib event.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 13:38:05 -07:00
Shiraz Saleem
fe99afd1fe i40iw: Correct Q1/XF object count equation
Lower Inbound RDMA Read Queue (Q1) object count by a factor of 2
as it is incorrectly doubled. Also, round up Q1 and Transmit FIFO (XF)
object count to power of 2 to satisfy hardware requirement.

Fixes: 86dbcd0f12 ("i40iw: add file to handle cqp calls")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 13:38:05 -07:00
Shiraz Saleem
8758768ad8 i40iw: Use utility function roundup_pow_of_two()
Consolidate all power of 2 round calculations to
use kernel utility function roundup_pow_of_two().

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 13:37:51 -07:00
Shiraz Saleem
f32b766cf7 i40iw: Set MAX_IRD_SIZE to 64
Increase I40IW_MAX_IRD_SIZE to 64 which is the device limit.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 13:33:30 -07:00
Dennis Dalessandro
896b1ec847 rdma: Update maintainer contact for Intel RDMA drivers
Ensure both Mike and I are listed as maintainer contacts for Intel's qib,
hfi1, and rdmavt drivers.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 13:33:30 -07:00
Venkata Sandeep Dhanalakota
af808ece5c IB/SA: Check dlid before SA agent queries for ClassPortInfo
SA queries SM for class port info when there is a LID_CHANGE event.

When a base lid is configured before fm is started ie when smlid is
not yet assigned, SA handles the LID_CHANGE event and tries query SM
with lid 0. This will cause an hang.

[ 1106.958820] INFO: task kworker/2:0:23 blocked for more than 120 seconds.
[ 1106.965082] Tainted: G O 4.12.0+ #1
[ 1106.969602] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
 this message.
[ 1106.977227] kworker/2:0 D 0 23 2 0x00000000
[ 1106.977250] Workqueue: infiniband update_ib_cpi [ib_core]
[ 1106.977261] Call Trace:
[ 1106.977273] __schedule+0x28e/0x860
[ 1106.977285] schedule+0x36/0x80
[ 1106.977298] schedule_timeout+0x1a3/0x2e0
[ 1106.977310] ? radix_tree_iter_tag_clear+0x1b/0x20
[ 1106.977322] ? idr_alloc+0x64/0x90
[ 1106.977334] wait_for_completion+0xe3/0x140
[ 1106.977347] ? wake_up_q+0x80/0x80
[ 1106.977369] update_ib_cpi+0x163/0x210 [ib_core]
[ 1106.977381] process_one_work+0x147/0x370
[ 1106.977394] worker_thread+0x4a/0x390
[ 1106.977406] kthread+0x109/0x140
[ 1106.977418] ? process_one_work+0x370/0x370
[ 1106.977430] ? kthread_park+0x60/0x60
[ 1106.977443] ret_from_fork+0x22/0x30

Always ensure a proper smlid is assigned before querying SM for cpi.

Fixes: ee1c60b1bf ("IB/SA: Modify SA to implicitly cache Class Port info")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 13:33:30 -07:00
Shiraz Saleem
adb1c0d15e nes: Change accelerated flag to bool
The accelerated flag only utilizes two values: 0 and 1.
Modify accelerated flag in struct nes_cm_node to bool.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 13:33:30 -07:00
Michael J. Ruhl
4c009af473 IB/hfi: Only read capability registers if the capability exists
During driver init, various registers are saved to allow restoration
after an FLR or gen3 bump.  Some of these registers are not available
in some circumstances (i.e. Virtual machines).

This bug makes the driver unusable when the PCI device is passed into
a VM, it fails during probe.

Delete unnecessary register read/write, and only access register if
the capability exists.

Cc: <stable@vger.kernel.org> # 4.14.x
Fixes: a618b7e40a ("IB/hfi1: Move saving PCI values to a separate function")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 10:42:08 -07:00
Pravin Shedge
72c7fe90ee drivers: infiniband: remove duplicate includes
These duplicate includes have been found with scripts/checkincludes.pl but
they have been removed manually to avoid removing false positives.

Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 09:39:35 -07:00
Yixian Liu
a5073d6054 RDMA/hns: Add eq support of hip08
This patch adds eq support for hip08. The eq table can
be multi-hop addressed.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Reviewed-by: Lijun Ou <oulijun@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 09:21:45 -07:00
Yixian Liu
b16f818847 RDMA/hns: Refactor eq code for hip06
Considering the compatibility of supporting hip08's eq
process and possible changes of data structure, this patch
refactors the eq code structure of hip06.

We move all the eq process code for hip06 from hns_roce_eq.c
into hns_roce_hw_v1.c, and also for hns_roce_eq.h. With
these changes, it will be convenient to add the eq support
for later hardware version.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Reviewed-by: Lijun Ou <oulijun@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-22 09:21:38 -07:00
Alex Vesker
1f80bd6a6c IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush
The locking order of vlan_rwsem (LOCK A) and then rtnl (LOCK B),
contradicts other flows such as ipoib_open possibly causing a deadlock.
To prevent this deadlock heavy flush is called with RTNL locked and
only then tries to acquire vlan_rwsem.
This deadlock is possible only when there are child interfaces.

[  140.941758] ======================================================
[  140.946276] WARNING: possible circular locking dependency detected
[  140.950950] 4.15.0-rc1+ #9 Tainted: G           O
[  140.954797] ------------------------------------------------------
[  140.959424] kworker/u32:1/146 is trying to acquire lock:
[  140.963450]  (rtnl_mutex){+.+.}, at: [<ffffffffc083516a>] __ipoib_ib_dev_flush+0x2da/0x4e0 [ib_ipoib]
[  140.970006]
but task is already holding lock:
[  140.975141]  (&priv->vlan_rwsem){++++}, at: [<ffffffffc0834ee1>] __ipoib_ib_dev_flush+0x51/0x4e0 [ib_ipoib]
[  140.982105]
which lock already depends on the new lock.
[  140.990023]
the existing dependency chain (in reverse order) is:
[  140.998650]
-> #1 (&priv->vlan_rwsem){++++}:
[  141.005276]        down_read+0x4d/0xb0
[  141.009560]        ipoib_open+0xad/0x120 [ib_ipoib]
[  141.014400]        __dev_open+0xcb/0x140
[  141.017919]        __dev_change_flags+0x1a4/0x1e0
[  141.022133]        dev_change_flags+0x23/0x60
[  141.025695]        devinet_ioctl+0x704/0x7d0
[  141.029156]        sock_do_ioctl+0x20/0x50
[  141.032526]        sock_ioctl+0x221/0x300
[  141.036079]        do_vfs_ioctl+0xa6/0x6d0
[  141.039656]        SyS_ioctl+0x74/0x80
[  141.042811]        entry_SYSCALL_64_fastpath+0x1f/0x96
[  141.046891]
-> #0 (rtnl_mutex){+.+.}:
[  141.051701]        lock_acquire+0xd4/0x220
[  141.055212]        __mutex_lock+0x88/0x970
[  141.058631]        __ipoib_ib_dev_flush+0x2da/0x4e0 [ib_ipoib]
[  141.063160]        __ipoib_ib_dev_flush+0x71/0x4e0 [ib_ipoib]
[  141.067648]        process_one_work+0x1f5/0x610
[  141.071429]        worker_thread+0x4a/0x3f0
[  141.074890]        kthread+0x141/0x180
[  141.078085]        ret_from_fork+0x24/0x30
[  141.081559]

other info that might help us debug this:
[  141.088967]  Possible unsafe locking scenario:
[  141.094280]        CPU0                    CPU1
[  141.097953]        ----                    ----
[  141.101640]   lock(&priv->vlan_rwsem);
[  141.104771]                                lock(rtnl_mutex);
[  141.109207]                                lock(&priv->vlan_rwsem);
[  141.114032]   lock(rtnl_mutex);
[  141.116800]
 *** DEADLOCK ***

Fixes: b4b678b06f ("IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21 16:06:07 -07:00
Majd Dibbiny
71a0ff65a2 IB/mlx5: Fix congestion counters in LAG mode
Congestion counters are counted and queried per physical function.
When working in LAG mode, CNP packets can be sent or received on both
of the functions, thus congestion counters should be aggregated from
the two physical functions.

Fixes: e1f24a79f4 ("IB/mlx5: Support congestion related counters")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21 16:06:07 -07:00
Bryan Tan
e3524b269e RDMA/vmw_pvrdma: Avoid use after free due to QP/CQ/SRQ destroy
The use of wait queues in vmw_pvrdma for handling concurrent
access to a resource leaves a race condition which can cause a use
after free bug.

Fix this by using the pattern from other drivers, complete() protected by
dec_and_test to ensure complete() is called only once.

Fixes: 29c8d9eba5 ("IB: Add vmw_pvrdma driver")
Signed-off-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21 16:06:07 -07:00
Bryan Tan
30a366a9da RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warning
refcount_dec generates a warning when the operation
causes the refcount to hit zero. Avoid this by using
refcount_dec_and_test.

Fixes: 8b10ba783c ("RDMA/vmw_pvrdma: Add shared receive queue support")
Reviewed-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Bryan Tan <bryantan@vmware.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21 16:06:07 -07:00
Bryan Tan
17748056ce RDMA/vmw_pvrdma: Call ib_umem_release on destroy QP path
The QP cleanup did not previously call ib_umem_release,
resulting in a user-triggerable kernel resource leak.

Fixes: 29c8d9eba5 ("IB: Add vmw_pvrdma driver")
Reviewed-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21 16:06:06 -07:00
Steve Wise
d145873345 iw_cxgb4: when flushing, complete all wrs in a chain
If a wr chain was posted and needed to be flushed, only the first
wr in the chain was completed with FLUSHED status.  The rest were
never completed.  This caused isert to hang on shutdown due to the
missing completions which left iscsi IO commands referenced, stalling
the shutdown.

Fixes: 4fe7c2962e ("iw_cxgb4: refactor sq/rq drain logic")

Cc: stable@vger.kernel.org
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21 16:06:06 -07:00
Steve Wise
96a236ed28 iw_cxgb4: reflect the original WR opcode in drain cqes
The flush/drain logic was not retaining the original wr opcode in
its completion.  This can cause problems if the application uses
the completion opcode to make decisions.

Use bit 10 of the CQE header word to indicate the CQE is a special
drain completion, and save the original WR opcode in the cqe header
opcode field.

Fixes: 4fe7c2962e ("iw_cxgb4: refactor sq/rq drain logic")

Cc: stable@vger.kernel.org
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21 16:06:06 -07:00
Steve Wise
f55688c454 iw_cxgb4: Only validate the MSN for successful completions
If the RECV CQE is in error, ignore the MSN check.  This was causing
recvs that were flushed into the sw cq to be completed with the wrong
status (BAD_MSN instead of FLUSHED).

Cc: stable@vger.kernel.org
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-21 16:06:06 -07:00
Anton Vasilyev
7448208691 RDMA/ocrdma: Fix permissions for OCRDMA_RESET_STATS
Debugfs file reset_stats is created with S_IRUSR permissions,
but ocrdma_dbgfs_ops_read() doesn't support OCRDMA_RESET_STATS,
whereas ocrdma_dbgfs_ops_write() supports only OCRDMA_RESET_STATS.

The patch fixes misstype with permissions.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-20 19:56:24 -07:00
Bharat Potnuri
66f53e6f54 iser-target: avoid reinitializing rdma contexts for isert commands
isert commands that failed during isert_rdma_rw_ctx_post() are queued to
Queue-Full(QF) queue and are scheduled to be reposted during queue-full
queue processing. During this reposting, the rdma contexts are initialised
again in isert_rdma_rw_ctx_post(), which is leaking significant memory.

unreferenced object 0xffff8830201d9640 (size 64):
  comm "kworker/0:2", pid 195, jiffies 4295374851 (age 4528.436s)
  hex dump (first 32 bytes):
    00 60 8b cb 2e 00 00 00 00 10 00 00 00 00 00 00  .`..............
    00 90 e3 cb 2e 00 00 00 00 10 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8170711e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff811f8ba5>] __kmalloc+0x125/0x2b0
    [<ffffffffa046b24f>] rdma_rw_ctx_init+0x15f/0x6f0 [ib_core]
    [<ffffffffa07ab644>] isert_rdma_rw_ctx_post+0xc4/0x3c0 [ib_isert]
    [<ffffffffa07ad972>] isert_put_datain+0x112/0x1c0 [ib_isert]
    [<ffffffffa07dddce>] lio_queue_data_in+0x2e/0x30 [iscsi_target_mod]
    [<ffffffffa076c322>] target_qf_do_work+0x2b2/0x4b0 [target_core_mod]
    [<ffffffff81080c3b>] process_one_work+0x1db/0x5d0
    [<ffffffff8108107d>] worker_thread+0x4d/0x3e0
    [<ffffffff81088667>] kthread+0x117/0x150
    [<ffffffff81713fa7>] ret_from_fork+0x27/0x40
    [<ffffffffffffffff>] 0xffffffffffffffff

Here is patch to use the older rdma contexts while reposting
the isert commands intead of reinitialising them.

Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 16:12:18 -07:00
Parav Pandit
16c72e4028 IB/cm: Refactor to avoid setting path record software only fields
When path ah_attr initialization from path record
fails, ib_cm_send_rej() uses av.ah_attr fields to send out reject
message. In such cases initialization of path record software fields
is not needed. Code is simplified for same.

Additionally in current code in cm_req_handler, when ib_get_cached_gid
fails for a given sgid_index of the GID of the GRH of the incoming CM MAD,
error code 12 is sent. This error code refers to primary GID in incoming
CM REQ and not for the GID in in MAD packet.
Therefore code is refactored to send code 5 (unsupported request) for such
error.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:12 -07:00
Parav Pandit
f6bdb14267 IB/{core, umad, cm}: Rename ib_init_ah_from_wc to ib_init_ah_attr_from_wc
Currently ib_init_ah_from_wc initializes address handle attributes and
not the address handle object itself.
To avoid confusion between ah_attr vs ah, ib_init_ah_from_wc is
renamed to ib_init_ah_attr_from_wc to reflect that its initialzes
ah_attr.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:11 -07:00
Parav Pandit
4ad6a0245e IB/{core, cm, cma, ipoib}: Rename ib_init_ah_from_path to ib_init_ah_attr_from_path
Since ib_init_ah_from_path initializes the address handle attribute, it is
renamed to reflect so.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:11 -07:00
Parav Pandit
33f93e1ebc IB/cm: Fix sleeping while spin lock is held
In case of LAP are used for RoCE, it can lead to a problem of sleeping a
context while spin lock is held in below flow.

cm_lap_handler
	->spin_lock
	-> <..switch_case..>
	-> cm_init_av_for_response
		-> ib_init_ah_from_wc
			-> rdma_addr_find_l2_eth_by_grh
				wait_for_completion()

Therefore ah attribute initialization is done for incoming lap requests
outside of the lock context.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:11 -07:00
Parav Pandit
5cf3968afc IB/cm: Handle address handle attribute init error
cm_init_av_by_path depends on ib_init_ah_from_path to initialize ah
attribute and ib_init_ah_from_path() can fail, such error should not
be ignored.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:11 -07:00
Parav Pandit
0c4386ec77 IB/{cm, umad}: Handle av init error
cm_init_av_for_response depends on ib_init_ah_from_wc() whose return
status is ignored.
ib_init_ah_from_wc() can fail and its return status should be handled as
done in this patch.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:10 -07:00
Parav Pandit
dbb12562f7 IB/{core, ipoib}: Simplify ib_find_gid to search only for IB link layer
Currently there are no users of ib_find_gid for RoCE transport. It is
only used by IPoIB.
Therefore its simplified to ignore RoCE ports and GID type check which
was previously done for every port.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:10 -07:00
Parav Pandit
5092d17a39 RDMA/core: Avoid copying ifindex twice
rdma_copy_addr copies the ifndex to bound_dev_if.
Therefore avoid copying it again after rdma_copy_addr call is completed.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:10 -07:00
Parav Pandit
575c7e583e RDMA/{core, cma}: Simplify rdma_translate_ip
Since no caller needs vlan, rdma_translate_ip is simplified to avoid
vlan pointer.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:09 -07:00
Parav Pandit
699a83f1eb IB/core: Removed unused function
rdma_addr_find_smac_by_sgid() is exported symbol not used by any kernel
module. Therefore its removed.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:09 -07:00
Parav Pandit
86937fcd6e RDMA/core: Avoid redundant memcpy in rdma_addr_find_l2_eth_by_grh
rdma_resolve_ip already copies 'addr' to its dev_addr argument.
Remove the duplicate memcpy and since it was the only user, remove the
'addr' member from resolve_cb_context.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:09 -07:00
Parav Pandit
1c43d5d308 IB/core: Avoid exporting module internal ib_find_gid_by_filter()
ib_find_gid_by_filter() is used only by ib_core, therefore avoid
exporting it.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:09 -07:00
Parav Pandit
fe78acf95f IB/rxe: Avoid passing unused index pointer which is optional
While searching for GID, returned index is not used, so avoid passing
pointer during invocation.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:08 -07:00
Parav Pandit
151ed9d700 IB/core: Refactor to avoid unnecessary check on GID lookup miss
Currently on every gid entry comparison miss found variable is checked;
which is not needed as those two comparison fail already indicate that
GID is not found yet.
So refactor to avoid such check and copy the GID index when found.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:08 -07:00
Parav Pandit
b0dd0d3353 IB/core: Avoid unnecessary type cast
Type cast from void to struct find_gid_index_context is not needed.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:08 -07:00
Parav Pandit
981b5a2384 RDMA/cma: Introduce and use helper functions to init work
Introduce and user helper functions to initialize work for address
resolved and route resolved event that avoid code duplication at few
places.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:07 -07:00
Parav Pandit
c423880537 RDMA/cma: Avoid setting path record type twice
Avoid setting path record type twice for RoCE.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:07 -07:00
Parav Pandit
4367ec7fe2 RDMA/cma: Simplify netdev check
Current code checks for NULL ndev twice where 2nd check is always
invalid given the fact that during route resolving stage, device address
must be bound to netdevice interface.

This patch simplifies such check.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:07 -07:00
Parav Pandit
5c181bda77 RDMA/cma: Set default GID type as RoCE when resolving RoCE route
As the function name suggests cma_resolve_iboe_route() resolves RoCE
route. However, its default GID type is IB_GID_TYPE_IB and not
IB_GID_TYPE_ROCE, even though both are mapped to the same enum value.
Change default GID type to IB_GID_TYPE_ROCE.

cma_iboe_set_mgid() is updated to reflect the RoCEv2 GID check.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:06 -07:00
Artemy Kovalyov
edf1a84fe3 IB/umem: Fix use of npages/nmap fields
In ib_umem structure npages holds original number of sg entries, while
nmap is number of DMA blocks returned by dma_map_sg.

Fixes: c5d76f130b ('IB/core: Add umem function to read data from user-space')
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:06 -07:00
Daniel Jurgens
119bf81793 IB/cm: Add debug prints to ib_cm
Add debug prints to the error paths in the connection manager control
flows, to help debug connection management problems.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:06 -07:00
Matan Barak
8b00914654 IB/core: Fix memory leak in cm_req_handler error flows
In cm_req_handler error flows, sometimes cm_id_priv->timewait_info
isn't free'd.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:05 -07:00
Parav Pandit
7baaa49af3 RDMA/cma: Use correct size when writing netlink stats
The code was using the src size when formatting the dst. They are almost
certainly the same value but it reads wrong.

Fixes: ce117ffac2 ("RDMA/cma: Export AF_IB statistics")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:05 -07:00
Erez Shitrit
98aebc550b IB/ipoib: Update pathrec field if not valid record
In case that the PathRecord is not valid (SM changed its network prefix)
ipoib will continue issue PathQuery requests with the same parameters
that are in its database, which are no longer valid anymore.

Now the driver in that case will re-initialize the record from a valid
place (the priv structure keeps the updated values), and a valid request
will be issued.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:05 -07:00
Erez Shitrit
439000892e IB/ipoib: Avoid memory leak if the SA returns a different DGID
The ipoib path database is organized around DGIDs from the LLADDR, but the
SA is free to return a different GID when asked for path. This causes a
bug because the SA's modified DGID is copied into the database key, even
though it is no longer the correct lookup key, causing a memory leak and
other malfunctions.

Ensure the database key does not change after the SA query completes.

Demonstration of the bug is as  follows
ipoib wants to send to GID fe80:0000:0000:0000:0002:c903:00ef:5ee2, it
creates new record in the DB with that gid as a key, and issues a new
request to the SM.
Now, the SM from some reason returns path-record with other SGID (for
example, 2001:0000:0000:0000:0002:c903:00ef:5ee2 that contains the local
subnet prefix) now ipoib will overwrite the current entry with the new
one, and if new request to the original GID arrives ipoib  will not find
it in the DB (was overwritten) and will create new record that in its
turn will also be overwritten by the response from the SM, and so on
till the driver eats all the device memory.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:05 -07:00