Commit Graph

3889 Commits

Author SHA1 Message Date
Maor Gottlieb
c73b7911de IB/mlx5: Assign SRQ type earlier
Move the SRQ type assignment to be before actually using it
in create_srq_user() and in create_srq_kernel() functions.

Fixes: af1ba291c5 ('{net, IB}/mlx5: Refactor internal SRQ API')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:39:46 -05:00
Jack Morgenstein
c482af646d IB/mlx4: Fix out-of-range array index in destroy qp flow
For non-special QPs, the port value becomes non-zero only at the
RESET-to-INIT transition. If the QP has not undergone that transition,
its port number value is still zero.

If such a QP is destroyed before being moved out of the RESET state,
subtracting one from the qp port number results in a negative value.
Using that negative value as an index into the qp1_proxy array
results in an out-of-bounds array reference.

Fix this by testing that the QP type is one that uses qp1_proxy before
using the port number. For special QPs of all types, the port number is
specified at QP creation time.

Fixes: 9433c18891 ("IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:39:46 -05:00
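
The guard described in c482af646d can be sketched as follows. This is a minimal illustration, not the mlx4 code: the types, the NUM_PORTS value, and the function name are hypothetical, and the extra range check is an addition of this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the mlx4 definitions. */
#define NUM_PORTS 2

enum qp_kind { QP_KIND_REGULAR, QP_KIND_PROXY_GSI };

struct demo_qp {
    enum qp_kind kind;
    int port;               /* 1-based; stays 0 until RESET-to-INIT */
};

struct demo_dev {
    struct demo_qp *qp1_proxy[NUM_PORTS];
};

/*
 * Clear the proxy slot on destroy. The QP type is tested first, so
 * "port - 1" is only evaluated for QPs whose port was set at creation.
 * Returns 1 if a slot was cleared, 0 otherwise.
 */
static int demo_destroy_qp(struct demo_dev *dev, struct demo_qp *qp)
{
    if (qp->kind != QP_KIND_PROXY_GSI)
        return 0;
    if (qp->port < 1 || qp->port > NUM_PORTS)
        return 0;           /* defensive bound check, added by this sketch */
    if (dev->qp1_proxy[qp->port - 1] != qp)
        return 0;
    dev->qp1_proxy[qp->port - 1] = NULL;
    return 1;
}
```

A regular QP destroyed while still in RESET (port 0) returns early and never indexes the array.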
Moni Shoua
41c450fd8d IB/mlx5: Make create/destroy_ah available to userspace
Advertise that create_ah and destroy_ah verbs are accessible from
uverbs interface.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:39:19 -05:00
Moni Shoua
5097e71f3e IB/mlx5: Use kernel driver to help userspace create ah
Resolving a MAC address for a given IP address in userspace is inefficient.
This patch lets the mlx5 user driver use the kernel driver to resolve the MAC
and get the answer in the private section of the response.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:38:49 -05:00
Moni Shoua
477864c8fc IB/core: Let create_ah return extended response to user
Add struct ib_udata to the signature of create_ah callback that is
implemented by IB device drivers. This allows HW drivers to return extra
data to the userspace library.
This patch prepares the ground for mlx5 driver to resolve destination
mac address for a given GID and return it to userspace.
This patch was previously submitted by Knut Omang as a part of the
patch set to support Oracle's Infiniband HCA (SIF).

Signed-off-by: Knut Omang <knut.omang@oracle.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:38:27 -05:00
Moni Shoua
6ad279c5a2 IB/mlx5: Report that device has udata response in create_ah
To make the mlx5 user driver aware of whether the kernel driver returns
the dmac in the user data response, add a new flag that is returned to
user-space through alloc_ucontext.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:37:19 -05:00
Moses Reuben
2d1e697e9b IB/mlx5: Add support to match inner packet fields
Add support for matching tunneled packet fields, i.e. matching the header
of the inner packet. Such a spec is identified by the bitwise OR of the
original header type and the IB_FLOW_SPEC_INNER type.

The combination of IB_FLOW_SPEC_INNER | IB_FLOW_SPEC_VXLAN_TUNNEL does not
need to be checked, because the IB core already performs this check.

Signed-off-by: Moses Reuben <mosesr@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:34:24 -05:00
Moses Reuben
ffb30d8f10 IB/mlx5: Support Vxlan tunneling specification
Add support for receiving specific VXLAN packets in ConnectX-4.

Signed-off-by: Moses Reuben <mosesr@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:34:23 -05:00
Bodong Wang
1cbe6fc86c IB/mlx5: Add support for CQE compressing
CQE compressing reduces PCI overhead by coalescing and compressing
multiple CQEs into a single merged CQE. Successful compressing
improves message rate especially for small packet traffic.

CQE compressing is supported for all 64B CQE formats (with certain
limitations) generated by RQ/Responder or by SQ/Requestor.

Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:34:20 -05:00
Bodong Wang
7e43a2a5ba IB/mlx5: Report mlx5 CQE compression caps during query
The capabilities include:
- Max number of compressed and aggregated CQEs in a single session,
  while zero means unsupported.
- For Responder, there are two formats of mini CQE: mini CQE with Rx
  hash and mini CQE with checksum. They are mutually exclusive.

Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:34:03 -05:00
Bodong Wang
191ded4a4d IB/mlx5: Report mlx5 multi packet WQE caps during query
Whether the hardware supports multi-packet WQE is exposed to user space
through query_device by uhw.

Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:33:25 -05:00
Eran Ben Elisha
bf08e884bf IB/mlx4: Check if GRH is available before using it
Before reading GRH attributes, make sure the AH contains a GRH,
and in addition, initialize the GID type.

Fixes: dbf727de74 ('IB/core: Use GID table in AH creation and dmac resolution')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:32:51 -05:00
Eran Ben Elisha
1f22e454df IB/mlx4: When no DMFS for IPoIB, don't allow NET_IF QPs
According to the firmware spec, FLOW_STEERING_IB_UC_QP_RANGE command is
supported only if dmfs_ipoib bit is set.

If it isn't set, we want to ensure that allocating NET_IF QPs fails. We do so
by filling out the allocation bitmap. This way, the NET_IF QP allocation
function won't find any free QP and will fail.

Fixes: c1c9850112 ('IB/mlx4: Add support for steerable IB UD QPs')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-13 13:29:46 -05:00
Henry Orosco
d6f7bbcc2e i40iw: Reorganize structures to align with HW capabilities
Some resources are incorrectly organized and at odds with
HW capabilities. Specifically, ILQ, IEQ, QPs, MSS, QOS
and statistics belong in a VSI.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-12 17:20:29 -05:00
Mustafa Ismail
0cc0d851cc i40iw: Fix incorrect check for error
In i40iw_ieq_handle_partial() the check for !status is incorrect.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-12 17:20:29 -05:00
Mustafa Ismail
6b0805c256 i40iw: Assign MSS only when it is a new MTU
Currently we are changing the MSS regardless of whether
there is a change or not in MTU. Fix to make the
assignment of MSS dependent on an MTU change.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-12 17:20:28 -05:00
Shiraz Saleem
d627b50631 i40iw: Fix race condition in terminate timer's handler
Add a QP reference when terminate timer is started to ensure
the destroy QP doesn't race ahead to free the QP while it is being
referenced in the terminate timer's handler.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-12 17:20:28 -05:00
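
The reference-counting idea in d627b50631 can be sketched as follows. The structure and function names are hypothetical, and a plain int stands in for the atomic counter a real driver would use.

```c
/* Hypothetical QP with a reference count; not the i40iw structure. */
struct demo_qp_ref {
    int refcount;   /* plain int for brevity; a real driver would use an atomic */
    int freed;      /* set when the last reference is dropped */
};

static void demo_qp_get(struct demo_qp_ref *qp)
{
    qp->refcount++;
}

static void demo_qp_put(struct demo_qp_ref *qp)
{
    if (--qp->refcount == 0)
        qp->freed = 1;      /* stands in for freeing the QP */
}

/* Starting the timer takes a reference that the pending timer owns. */
static void demo_start_terminate_timer(struct demo_qp_ref *qp)
{
    demo_qp_get(qp);
}

/* The handler may use qp safely, then drops the timer's reference. */
static void demo_terminate_timer_handler(struct demo_qp_ref *qp)
{
    /* ... work that dereferences qp would go here ... */
    demo_qp_put(qp);
}
```

With this ordering, a destroy that drops the creator's reference while the timer is pending leaves the count at one, so the QP outlives the handler.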
Mustafa Ismail
fd90d4d4c2 i40iw: Fix memory leak in CQP destroy when in reset
On a device close, the control QP (CQP) is destroyed by calling
cqp_destroy which destroys the CQP and frees its SD buffer memory.
However, if the reset flag is true, cqp_destroy is never called and
leads to a memory leak on SD buffer memory. Fix this by always calling
cqp_destroy, on device close, regardless of reset. The exception to this
is when CQP create fails. In this case, the SD buffer memory is already
freed on an error check and there is no need to call cqp_destroy.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-12 17:20:27 -05:00
Shiraz Saleem
1cda28bb5b i40iw: Fix QP flush to not hang on empty queues or failure
When flushing a QP that has no pending work requests, signal completion
to unblock i40iw_drain_sq and i40iw_drain_rq, which are waiting on
completion of iwqp->sq_drained and iwqp->rq_drained respectively.
Also, signal completion if the QP flush fails, to prevent the drain SQ or RQ
from being blocked indefinitely.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-12 17:20:27 -05:00
Mustafa Ismail
f4a87ca12a i40iw: Fix double free of QP
A QP can be double freed if i40iw_cm_disconn() is
called while it is currently being freed by
i40iw_rem_ref(). The fix in i40iw_cm_disconn() will
first check if the QP is already freed before
making another request for the QP to be freed.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-12 17:20:26 -05:00
Shiraz Saleem
91c42b72f8 i40iw: Use correct src address in memcpy to rdma stats counters
hw_stats is a pointer to i40_iw_dev_stats struct in i40iw_get_hw_stats().
Use hw_stats and not &hw_stats in the memcpy to copy the i40iw device stats
data into rdma_hw_stats counters.

Fixes: b40f4757da ("IB/core: Make device counter infrastructure dynamic")

Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-12 17:19:02 -05:00
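
The pointer mix-up fixed in 91c42b72f8 can be illustrated with a small sketch. The struct, the counter count, and the function name are hypothetical; only the memcpy() source argument mirrors the commit.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_NUM_COUNTERS 4   /* hypothetical number of counters */

struct demo_dev_stats {
    uint64_t counters[DEMO_NUM_COUNTERS];
};

/*
 * hw_stats is already a pointer to the source struct, so it is passed to
 * memcpy() directly. Passing &hw_stats would copy the bytes of the pointer
 * variable itself (and whatever follows it in memory), not the counters.
 */
static void demo_copy_stats(uint64_t *dst, const struct demo_dev_stats *hw_stats)
{
    memcpy(dst, hw_stats, sizeof(*hw_stats));
}
```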
Thomas Huth
5e58917122 i40iw: Remove macros I40IW_STAG_KEY_FROM_STAG and I40IW_STAG_INDEX_FROM_STAG
The macros I40IW_STAG_KEY_FROM_STAG and I40IW_STAG_INDEX_FROM_STAG are
apparently bad - they are using the logical "&&" operation which
does not make sense here. It should have been a bitwise "&" instead.
Since the macros seem to be completely unused, let's simply remove
them so that nobody accidentally uses them in the future. And while
we're at it, also remove the unused macro I40IW_CREATE_STAG.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-12 17:13:02 -05:00
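
The difference between the logical and bitwise operators that 5e58917122 points out can be shown with a short sketch. The mask values and macro names below are hypothetical and do not reproduce the removed i40iw macros.

```c
#include <stdint.h>

/* Hypothetical layout: the low 8 bits are the key, the rest is the index. */
#define DEMO_STAG_KEY_MASK    0x000000ffu
#define DEMO_STAG_INDEX_MASK  0xffffff00u
#define DEMO_STAG_INDEX_SHIFT 8

/* Buggy form: logical AND collapses the result to 0 or 1. */
#define DEMO_KEY_LOGICAL(stag)  ((stag) && DEMO_STAG_KEY_MASK)

/* Intended form: bitwise AND extracts the masked field. */
#define DEMO_KEY_BITWISE(stag)  ((stag) & DEMO_STAG_KEY_MASK)
#define DEMO_INDEX_BITWISE(stag) \
    (((stag) & DEMO_STAG_INDEX_MASK) >> DEMO_STAG_INDEX_SHIFT)
```

For a value such as 0x1234, the logical form yields 1, while the bitwise forms yield key 0x34 and index 0x12.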
Bart Van Assche
66431b0e86 IB/hfi1: Define platform_config_table_limits once
Defining static data structures in a header file is wrong because
this causes the data structure to be instantiated once in every .c
file it is included in. Hence move the definition of a static
array from a header file into the only .c file in which it is used.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Dean Luick <dean.luick@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
Bhumika Goyal
0fc859a657 IB/hfi1: constify mmu_notifier_ops structure
Declare the structure mmu_notifier_ops as const as it is only stored in
the ops field of a mmu_notifier structure. The ops field is of type
const struct mmu_notifier_ops *, so mmu_notifier_ops structures having
this property can be declared as const.
Done using coccinelle:
@r1 disable optional_qualifier @
identifier i;
position p;
@@
static struct mmu_notifier_ops i@p = {...};

@ok1@
identifier r1.i;
position p;
struct mmu_rb_handler handler;
@@
handler.mn.ops=&i@p

@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
static
+const
struct mmu_notifier_ops i={...};

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct mmu_notifier_ops i;

File size before:
   text	   data	    bss	    dec	    hex	filename
   3566	     72	     16	   3654	    e46	drivers/infiniband/hw/hfi1/mmu_rb.o

File size after:
   text	   data	    bss	    dec	    hex	filename
   3658	      0	     16	   3674	    e5a	drivers/infiniband/hw/hfi1/mmu_rb.o

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
Mike Marciniszyn
5dc806052a IB/rdmavt, IB/hfi1, IB/qib: Add inlines for mtu division
Add rvt_div_round_up_mtu() and rvt_div_mtu() routines to
do the computation based on the pmtu and the log_pmtu.

Change divides in qib, hfi1 to use the new inlines.

Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
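
A sketch of what inlines like those in 5dc806052a might look like, assuming the path MTU is a power of two so that a shift by log_pmtu can replace the division. The struct and the demo_ names are hypothetical stand-ins, not the rdmavt definitions.

```c
#include <stdint.h>

/* Hypothetical QP fragment holding the path MTU and its base-2 logarithm. */
struct demo_qp_mtu {
    uint32_t pmtu;      /* path MTU in bytes, assumed to be a power of two */
    uint8_t  log_pmtu;  /* log2(pmtu) */
};

/* Number of MTU-sized packets needed to carry len bytes, rounded up. */
static inline uint32_t demo_div_round_up_mtu(const struct demo_qp_mtu *qp,
                                             uint32_t len)
{
    return (len + qp->pmtu - 1) >> qp->log_pmtu;
}

/* Number of whole MTU-sized packets contained in len bytes. */
static inline uint32_t demo_div_mtu(const struct demo_qp_mtu *qp, uint32_t len)
{
    return len >> qp->log_pmtu;
}
```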
Mike Marciniszyn
c64607aa8a IB/hfi1,IB/qib: use rvt swqe mr deref helper
Convert to use new swqe put routine.

Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
Harish Chegondi
9d8145a604 IB/hfi1: Avoid credit return allocation for cpu-less NUMA nodes
Do not allocate credit return base and DMA memory for
NUMA nodes without CPUs.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
Mike Marciniszyn
0771da5a6e IB/hfi1,IB/qib: Use new send completion helper
Convert cq completion returns in both rdmavt drivers
to use the new helper.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
Sebastian Sanchez
238b1862b4 IB/qib: Use standard refcount wrapper for QPs
Use the standard driver wrapper for QP reference counters.
This makes the code more maintainable.

Fixes: Commit 4d6f85c3fa ("IB/rdmavt, IB/qib, IB/hfi1: Use new QP put get routines")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
Sebastian Sanchez
b44980f879 IB/hfi1: Replace qp->refcount release code with standard driver wrapper
Some parts of the code don't use the standard release
wrapper rvt_put_qp() for decrementing and testing
the refcount to then try to use a resource.
Replace this code with the standard driver wrapper.

Fixes: Commit 4d6f85c3fa ("IB/rdmavt, IB/qib, IB/hfi1: Use new QP put get routines")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
Dean Luick
0080167467 IB/hfi1: Preserve external device completed bit
The driver should not change the external device request
completed bit when not actually doing an external device
request.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
Sebastian Sanchez
9b86071c5e IB/hfi1: Remove critical section gap in sc_buffer_alloc()
In sc_buffer_alloc(), the sc->alloc_lock is released
before calling sc_release_update(), and it is reacquired
after the function call. This causes CPU lock trading.
Fix it by not dropping the lock before calling
sc_release_update().

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
Mitko Haralanov
b777f154a0 IB/hfi1: Remove usage of qp->s_cur_sge
The s_cur_sge field in the qp structure holds a pointer to the
SGE of the currently processed WQE. It assumes the protection
of the RVT_S_BUSY flag to prevent the changing of this field
while the send engine is using it. This scheme works as long
as there is only one instance of the send engine running at a
time.

Scaling of the send engine to multiple cores would break this
assumption as there could be multiple instances of the send engine
running on different CPUs. This opens a window where the QP's
RVT_S_BUSY flag is not set but the send engine is still running.

To prevent accidental changing of the s_cur_sge pointer, the QP's
dependence on it is removed. The SGE pointer is now stored in the
verbs_txreq, which is a per-packet data structure. This ensures
that each individual packet has its own pointer, which is set up
while the RVT_S_BUSY flag is set.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
Dean Luick
5213006ade IB/hfi1: Add special setting for low power AOC
Low power QSFP AOC cables require a different SerDes
Tx PLL bandwidth setting than the default.  The
8051 firmware does not know the details, so the driver
needs to tell the firmware through a special setting.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
Tadeusz Struk
6e40b59cfa IB/hfi1: Remove definition of unused hfi1_affinity struct
The struct hfi1_affinity is not used anymore.
We use the struct hfi1_affinity_node and hfi1_affinity_node_list
instead.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:29:42 -05:00
Don Hiatt
e922ae06e9 IB/hfi1: Remove dependence on qp->s_cur_size
The qp->s_cur_size field assumes that the S_BUSY bit protects
the field from modification after the slock is dropped. Scaling the
send engine to multiple cores would break that assumption.

Correct the issue by carrying the payload size in the txreq structure.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:25:13 -05:00
Jianxin Xiong
b7481944b0 IB/hfi1: Show statistics counters under IB stats interface
Previously tools like hfi1stats had to access these counters through
debugfs, which often caused permission issues for non-root users. It is
not always acceptable to change the debugfs mounting permission due
to security concerns. When exposed under the IB stats interface, the
counters are universally readable by default.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:25:13 -05:00
Jakub Pawlak
e730139b34 IB/hfi1: Disable header suppression for short packets
For received packets with a payload of 8 DWs or less,
RxDmaDataFifoRdUncErr is not reported. Instead, RHF.EccErr is set
if the header is not suppressed. When such a packet is detected
on the send side, the header suppression mechanism is disabled
by clearing the SH bit in the packet header.

Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:25:13 -05:00
Dean Luick
1b9e774933 IB/hfi1: Export 8051 memory and LCB registers via debugfs
Both the 8051 memory and LCB register access require multiple
steps and coordination with the driver.  This cannot be safely
done with resource0 alone.

The 8051 memory is exported read-only.  LCB is exported read/write.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:25:13 -05:00
Sebastian Sanchez
53e91d264b IB/hfi1: Use non-atomic __test_and_clear_bit in hot path
qp->r_aflags is already protected by qp->r_lock, therefore,
test_and_clear_bit() doesn't need to be atomic. Profile
shows this function call is costly.

Change the test_and_clear_bit() call to use the non-atomic
variant.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:25:13 -05:00
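
The non-atomic variant mentioned in 53e91d264b behaves like a plain read-modify-write. The following user-space sketch uses a hypothetical demo_ function and is not the kernel's __test_and_clear_bit(); it is only safe when the caller already serializes access, for example by holding a lock.

```c
#include <limits.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Non-atomic test-and-clear: an ordinary load, mask, and store with no
 * locked instruction or barrier. Returns 1 if the bit was set, else 0.
 */
static int demo_test_and_clear_bit(unsigned int nr, unsigned long *addr)
{
    unsigned long *word = addr + nr / DEMO_BITS_PER_LONG;
    unsigned long mask = 1UL << (nr % DEMO_BITS_PER_LONG);
    int was_set = (*word & mask) != 0;

    *word &= ~mask;
    return was_set;
}
```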
Dean Luick
d7cf4ccf6f IB/hfi1: Fix dc8051 multiple qword memory reads
When reading multiple dc8051 data memory locations
at once, the read enabled field must be toggled
at every address change.  Do that by writing only
the address first, then writing the enable.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:25:13 -05:00
Dean Luick
62aeddbf28 IB/hfi1: Read new EPROM format
Add the ability to read the new EPROM format.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-11 15:25:13 -05:00
Wei Yongjun
15f7e3c21b iw_cxgb4: Fix error return code in c4iw_rdev_open()
Fix to return error code -ENOMEM from the __get_free_page() error
handling case instead of 0, as done elsewhere in this function.

Fixes: 05eb23893c ("cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-08 10:03:36 -05:00
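
The class of bug fixed in 15f7e3c21b arises when an earlier successful step leaves the error variable at 0 and a later failure path forgets to set it. A minimal sketch, with hypothetical names and malloc() standing in for __get_free_page():

```c
#include <errno.h>
#include <stdlib.h>

struct demo_rdev {
    void *status_page;
    int queues_ready;
};

static int demo_setup_queues(struct demo_rdev *rdev)
{
    rdev->queues_ready = 1;
    return 0;
}

static void demo_teardown_queues(struct demo_rdev *rdev)
{
    rdev->queues_ready = 0;
}

/* fail_alloc simulates an allocation failure so both paths can be tried. */
static int demo_rdev_open(struct demo_rdev *rdev, int fail_alloc)
{
    int err;

    err = demo_setup_queues(rdev);
    if (err)
        return err;

    /* err is 0 at this point, so the failure path must set it. */
    rdev->status_page = fail_alloc ? NULL : malloc(4096);
    if (!rdev->status_page) {
        err = -ENOMEM;
        goto teardown;
    }
    return 0;

teardown:
    demo_teardown_queues(rdev);
    return err;
}
```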
Henry Orosco
78300cf815 i40iw: Add request for reset on CQP timeout
When CQP times out, send a request to LAN driver for reset.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:44 -05:00
Henry Orosco
1ef936b229 i40iw: Code cleanup, remove check of PBLE pages
Remove check for zero 'pages' of unallocated pbles calculated in
add_pble_pool(); as it can never be true.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:44 -05:00
Shiraz Saleem
bf69f494c3 i40iw: Correctly fail loopback connection if no listener
Fail the connect and return the proper error code if a client
is started with local IP address and there is no corresponding
loopback listener.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:43 -05:00
Shiraz Saleem
fd4e906b2e i40iw: Fill in IRD value when on connect request
IRD is not populated on connect request and application is
getting 0 for the value. Fill in the correct value on
connect request.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:43 -05:00
Shiraz Saleem
7eb2bde7f3 i40iw: Set TOS field in IP header
Set the TOS field in the IP header with the value passed in
from the application. If there is a mismatch between the remote
client's TOS and the listener's, set the listener's TOS to the higher
of the two values.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:42 -05:00
Shiraz Saleem
e0b010da87 i40iw: Add NULL check for ibqp event handler
Add NULL check for ibqp event handler before calling it to report
QP events, as it might not be initialized.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:42 -05:00
Mustafa Ismail
a05e15135b i40iw: Replace list_for_each_entry macro with safe version
Use list_for_each_entry_safe macro for the IPv6 addr list
as IPv6 addresses can be deleted while going through the
list.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:41 -05:00
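
The reason a "safe" iterator is needed in a05e15135b is that the loop must not read the next pointer from an entry that has just been deleted. A user-space sketch of the same idea on a hypothetical singly linked list (not the kernel list API):

```c
#include <stdlib.h>

struct demo_addr {
    int id;
    struct demo_addr *next;
};

/*
 * Remove every node whose id matches. The successor is saved in "tmp"
 * before the current node can be freed, so the walk never reads freed
 * memory. Returns the new list head.
 */
static struct demo_addr *demo_remove_addr(struct demo_addr *head, int id)
{
    struct demo_addr **link = &head;
    struct demo_addr *cur, *tmp;

    for (cur = head; cur; cur = tmp) {
        tmp = cur->next;        /* saved before cur may be freed */
        if (cur->id == id) {
            *link = tmp;        /* unlink cur from its predecessor */
            free(cur);
        } else {
            link = &cur->next;
        }
    }
    return head;
}
```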
Mustafa Ismail
e5e74b61b1 i40iw: Add IP addr handling on netdev events
Disable listeners and disconnect all connected QPs on
a netdev interface down event. On an interface up event,
the listeners are re-enabled.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:41 -05:00
Mustafa Ismail
d59659340c i40iw: Add missing cleanup on device close
On i40iw device close, disconnect all connected QPs by moving
them to error state; and block further QPs, PDs and CQs from
being created. Additionally, make sure all resources have been
freed before deallocating the ibdev as part of the device close.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:40 -05:00
Henry Orosco
f26c7c8339 i40iw: Add 2MB page support
Add support to allow each independent memory region to
be configured for 2MB page size in addition to 4KB
page size.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:40 -05:00
Henry Orosco
b6a529da69 i40iw: Utilize physically mapped memory regions
Add support to use physically mapped WQ's and MR's if determined
that the OS registered user-memory for the region is physically
contiguous. This feature will eliminate the need for unnecessarily
setting up and using PBL's when not required.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:39 -05:00
Henry Orosco
d4165e3abd i40iw: Fix incorrect assignment of SQ head
The SQ head is incorrectly incremented when the number
of WQEs required is greater than the number available.
The fix is to use the I40IW_RING_MOV_HEAD_BY_COUNT
macro. This checks for the SQ full condition first and
only if SQ has room for the request, then we move the
head appropriately.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:39 -05:00
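
The check-then-move behavior described in d4165e3abd can be sketched as a ring helper that refuses to advance the head when the request does not fit. The struct, the function names, and the convention of keeping one slot empty are assumptions of this sketch, not the i40iw macro itself.

```c
#include <stdint.h>

struct demo_ring {
    uint32_t head;
    uint32_t tail;
    uint32_t size;   /* number of slots; one slot is always kept empty */
};

/* Slots currently in use. */
static uint32_t demo_ring_used(const struct demo_ring *r)
{
    return (r->head + r->size - r->tail) % r->size;
}

/*
 * Advance head by count only if the ring has room for the whole request.
 * Returns 0 on success, or -1 (ring full) without touching head.
 */
static int demo_ring_move_head_by_count(struct demo_ring *r, uint32_t count)
{
    if (demo_ring_used(r) + count >= r->size)
        return -1;
    r->head = (r->head + count) % r->size;
    return 0;
}
```

Because the full condition is tested before the head moves, a request that needs more WQEs than are available leaves the ring state unchanged.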
Henry Orosco
78e945aace i40iw: Remove variable flush_code and check to set qp->sq_flush
The flush_code variable in i40iw_bld_terminate_hdr() is obsolete and
the check to set qp->sq_flush is unreachable. Currently flush code is
populated in setup_term_hdr() and both SQ and RQ are flushed always
as part of the tear down flow.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:38 -05:00
Henry Orosco
dfd9c43b3c i40iw: Remove check on return from device_init_pestat()
Remove unnecessary check for return code from
device_init_pestat() and change func to void.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:38 -05:00
Henry Orosco
5ebcb0ff54 i40iw: Use runtime check for IS_ENABLED(CONFIG_IPV6)
To be consistent, use the runtime check instead of
conditional compile.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:37 -05:00
Henry Orosco
e67791858e i40iw: Use actual page size
In i40iw_post_send, use the actual page size instead of
encoded page size. This is to be consistent with the
rest of the file.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:37 -05:00
Henry Orosco
1ad19f739f i40iw: Remove NULL check for cm_node->iwdev
It is not necessary to check cm_node->iwdev in
i40iw_rem_ref_cm_node() as it can never be NULL after
a successful call out of i40iw_make_cm_node().

Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:36 -05:00
Henry Orosco
799749979d i40iw: Remove checks for more than 48 bytes inline data
Remove dead code, which isn't executed because we
return error if the data size is greater than 48 bytes.

Inline data size greater than 48 bytes isn't supported
and the maximum WQE size is 64 bytes.

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:36 -05:00
Henry Orosco
85a87c90ee i40iw: Query device accounts for internal rsrc
Some resources are consumed internally and not available to the user.
After hw is initialized, figure out how many resources are consumed
and subtract those numbers from the initial max device capability in
i40iw_query_device().

Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:35 -05:00
Henry Orosco
e7f9774af5 i40iw: Optimize inline data copy
Use memcpy for inline data copy in sends
and writes instead of byte by byte copy.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:35 -05:00
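
As a rough illustration of the inline-copy change in e7f9774af5, the sketch below contrasts a byte-by-byte loop with a single memcpy() call. It is not the i40iw code: the function names and the MAX_INLINE_DATA limit are hypothetical, with 48 borrowed from the inline-data limit mentioned in 799749979d.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limit for this sketch (48 bytes of inline data). */
#define MAX_INLINE_DATA 48

/* Old style: copy one byte per loop iteration. */
static void copy_inline_bytewise(uint8_t *dest, const uint8_t *src, size_t len)
{
	assert(len <= MAX_INLINE_DATA);
	for (size_t i = 0; i < len; i++)
		dest[i] = src[i];
}

/* New style: one memcpy() call, which the C library can optimize. */
static void copy_inline_memcpy(uint8_t *dest, const uint8_t *src, size_t len)
{
	assert(len <= MAX_INLINE_DATA);
	memcpy(dest, src, len);
}
```

Both helpers produce the same bytes in the destination; only the copy mechanism differs.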
Henry Orosco
c38d7e0d08 i40iw: Fix for LAN handler removal
If i40iw_open() fails for any reason, the LAN handler
is not being removed. Modify i40iw_deinit_device()
to always remove the handler.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:35 -05:00
Henry Orosco
01d0b36798 i40iw: Correct values for max_recv_sge, max_send_sge
When creating QPs, ensure init_attr->cap.max_recv_sge
is clipped to MAX_FRAG_COUNT.

Expose MAX_FRAG_COUNT for max_recv_sge and max_send_sge in
i40iw_query_qp().

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Reviewed-By: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:34 -05:00
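
The clipping described in 01d0b36798 can be sketched as follows. This is a minimal sketch, not the driver code: struct fake_qp_cap, both helper names, and the value 3 for MAX_FRAG_COUNT are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value; the real constant lives in the driver headers. */
#define MAX_FRAG_COUNT 3u

/* Stand-in for the capability fields of a QP init attribute. */
struct fake_qp_cap {
	uint32_t max_send_sge;
	uint32_t max_recv_sge;
};

/* On QP creation: never accept more receive SGEs than the hardware supports. */
static void clip_recv_sge(struct fake_qp_cap *cap)
{
	assert(cap != NULL);
	if (cap->max_recv_sge > MAX_FRAG_COUNT)
		cap->max_recv_sge = MAX_FRAG_COUNT;
}

/* On QP query: report the same limit for both directions. */
static void query_sge_caps(struct fake_qp_cap *out)
{
	assert(out != NULL);
	out->max_send_sge = MAX_FRAG_COUNT;
	out->max_recv_sge = MAX_FRAG_COUNT;
}
```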
Henry Orosco
e69c509361 i40iw: Use vector when creating CQs
Assign each CEQ vector to a different CPU when possible, then
when creating a CQ, use the vector for the CEQ id. This
allows completion work to be distributed over multiple cores.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:34 -05:00
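
One way to picture the vector handling in e69c509361 is sketched below. The helper names are hypothetical, and two details are assumptions rather than facts from the commit: that vectors are spread over CPUs with a simple modulo, and that an out-of-range completion vector falls back to CEQ 0.

```c
#include <assert.h>

/* Spread CEQ vectors over the available CPUs (assumed: simple modulo). */
static unsigned int cpu_for_ceq_vector(unsigned int vector, unsigned int num_cpus)
{
	assert(num_cpus > 0);
	return vector % num_cpus;
}

/* Use the requested completion vector as the CEQ id when it is valid.
 * Assumed fallback: CEQ 0 for out-of-range vectors. */
static unsigned int ceq_id_for_cq(unsigned int comp_vector, unsigned int num_ceqs)
{
	assert(num_ceqs > 0);
	return comp_vector < num_ceqs ? comp_vector : 0;
}
```

With CQs created on different vectors, their completions land on different CEQs and therefore on different CPUs.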
Henry Orosco
68583ca2a1 i40iw: Convert page_size to encoded value
The passed-in page_size was used as the encoded value when writing
the WQE, and the passed-in value was usually 4096. This happened to
work because bit 0 was 0, which implies 4KB pages,
but it would not work for other page sizes.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-05 16:09:28 -05:00
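
The problem fixed in 68583ca2a1 can be illustrated with a small sketch. The encodings below (a one-bit field, 0 for 4KB and 1 for 2MB) and all names are assumptions for illustration only; they are not taken from the i40iw headers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed encodings for this sketch: 0 = 4KB pages, 1 = 2MB pages. */
enum page_size_code { PAGE_CODE_4K = 0, PAGE_CODE_2M = 1 };

/* What using the raw size as the encoded value amounts to:
 * only the low bit of the size reaches a one-bit field. */
static unsigned int raw_size_low_bit(unsigned long page_size)
{
	return (unsigned int)(page_size & 1UL);
}

/* Explicit conversion; returns 0 on success, -1 for an unsupported size. */
static int encode_page_size(unsigned long page_size, enum page_size_code *code)
{
	assert(code != NULL);
	switch (page_size) {
	case 4096UL:
		*code = PAGE_CODE_4K;
		return 0;
	case 2UL * 1024 * 1024:
		*code = PAGE_CODE_2M;
		return 0;
	default:
		return -1;
	}
}
```

Under these assumptions, a raw 2MB size has a low bit of 0 and would be read as 4KB, which is why an explicit conversion is needed.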
Henry Orosco
7cba2cc13e i40iw: Set MAX IRD, MAX ORD size to max supported value
Set the MAX_IRD and MAX_ORD size negotiated to the maximum
supported values.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 15:24:53 -05:00
Henry Orosco
7581e96ca4 i40iw: Remove workaround for pre-production errata
Pre-production silicon incorrectly truncates 4 bytes of the MPA
packet in UDP loopback case. Remove the workaround as it is no
longer necessary.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 15:24:52 -05:00
Henry Orosco
d62d563424 i40iw: Enable message packing
Remove the parameter to disable message packing and
always enable it.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 15:24:52 -05:00
Henry Orosco
0fc2dc5889 i40iw: Add Quality of Service support
Add support for QoS on QPs. Upon device initialization,
a map is created from user priority to queue set
handles. On QP creation, use ToS to look up the queue
set handle for use with the QP.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 15:24:51 -05:00
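
The lookup described in 0fc2dc5889 might be sketched like this. The structure and function names are hypothetical, and deriving the user priority from the top three bits of the ToS byte is an assumption of this sketch, not necessarily how the driver does it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_USER_PRIORITIES 8

/* Built once at device initialization: user priority -> queue set handle. */
struct fake_qos_map {
	uint16_t qs_handle[NUM_USER_PRIORITIES];
};

/* Assumed mapping: the three most significant ToS bits are the priority. */
static unsigned int tos_to_user_pri(uint8_t tos)
{
	return (unsigned int)(tos >> 5);
}

/* On QP creation: pick the queue set handle that matches the QP's ToS. */
static uint16_t qs_handle_for_tos(const struct fake_qos_map *map, uint8_t tos)
{
	assert(map != NULL);
	return map->qs_handle[tos_to_user_pri(tos)];
}
```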
Leon Romanovsky
4d4099584c IB/hns: Move HNS RoCE user vendor structures
This patch moves HNS vendor's specific structures to
common UAPI folder which will be visible to all consumers.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:23:14 -05:00
Lijun Ou
3b5184be89 IB/hns: Fix the IB device name
This patch fixes the IB device name so that it
matches libhns.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Shaobo Xu
afb6b092d6 IB/hns: Fix the bug when free cq
If the resources of a CQ are freed while the user case is executing, the
hardware cannot be notified in the hip06 SoC. The hardware will then stall
when it writes to the CQ buffer that has been released.

In order to solve this problem, the RoCE driver checks the CQE counter and
ensures that the outstanding CQEs have been written. Then the CQ buffer
can be released.

Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Wei Hu (Xavier)
19a408efa0 IB/hns: Delete the redundant memset operation
Delete the redundant memset operation because the memory allocated
by ib_alloc_device has already been zeroed.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Wei Hu (Xavier)
9daed0affa IB/hns: Fix the bug of setting port mtu
In hns_roce driver, we need not call iboe_get_mtu to reduce
IB headers from effective IBoE MTU because hr_dev->caps.max_mtu
has already been reduced.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Shaobo Xu
bfcc681bd0 IB/hns: Fix the bug when free mr
If the resources of an MR are freed while the user case is executing, the
hardware cannot be notified in the hip06 SoC. The hardware will then stall
when it reads the payload through the PA that has been released.

In order to solve this problem, the RoCE driver creates 8 reserved loopback
QPs to ensure zero WQEs when freeing an MR. When the MAC address is reset, in
order to avoid loopback failure, we need to release the reserved loopback
QPs and recreate them.

Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Wei Hu (Xavier)
d838c481e0 IB/hns: Fix the bug when destroy qp
If the send queue is still working when the QP is moved to the reset state
by modify qp in the destroy qp function, the hardware will stall and stop
working in the hip06 SoC. In the current code, the RoCE driver checks the
hardware send pointer and the hardware processing pointer to ensure that
the hardware has processed all the dbs of this QP. But when the condition
of the wire is poor, the check may take too long.

In order to solve this problem, the RoCE driver creates a workqueue in the
probe function. If there is a timeout when checking the status of the QP,
the driver initializes a work entry and pushes it into the workqueue. The
work function will finish the check and release the related resources later.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Dongdong Huang(Donald) <hdd.huang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Salil
e84e40be8e IB/hns: Fix for Checkpatch.pl comment style errors
This patch corrects the comment style errors caught by
the checkpatch.pl script.

Signed-off-by: Salil Mehta  <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Shaobo Xu
8254746978 IB/hns: Implement the add_gid/del_gid and optimize the GIDs management
IB core has implemented the calculation of GIDs and the management
of GID tables, and it is now responsible to supply query function
for GIDs. So the calculation of GIDs and the management of GID
tables in the RoCE driver is redundant.

The patch is to implement the add_gid/del_gid to set the GIDs in
the RoCE driver, remove the redundant calculation and management of
GIDs in the notifier call of the net device and the inet, and
update the query_gid.

Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Salil Mehta  <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Wei Hu (Xavier)
5e6ff78a22 IB/hns: Change qpn allocation to round-robin mode.
When using CM to establish connections, a QP number that was just
freed will be rejected by the IB core. To fix this problem, we
change QPN allocation to round-robin mode. We add a round-robin
mode for allocating resources using a bitmap. We use round-robin mode
for QP numbers and non-round-robin mode for other resources like
CQ numbers, PD numbers, etc.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Salil Mehta  <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
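
The difference between the two allocation modes in 5e6ff78a22 can be shown with a toy allocator. This is a sketch under stated assumptions, not the hns bitmap code: struct id_pool, POOL_SIZE and the helper names are hypothetical, and a bool array stands in for the bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 8

struct id_pool {
	bool used[POOL_SIZE];
	size_t next;      /* where a round-robin search starts */
	bool round_robin; /* true: QPN-style, false: lowest free id first */
};

/* Returns a free id, or -1 when the pool is exhausted. */
static int pool_alloc(struct id_pool *p)
{
	size_t start = p->round_robin ? p->next : 0;

	for (size_t i = 0; i < POOL_SIZE; i++) {
		size_t id = (start + i) % POOL_SIZE;

		if (!p->used[id]) {
			p->used[id] = true;
			p->next = (id + 1) % POOL_SIZE;
			return (int)id;
		}
	}
	return -1;
}

static void pool_free(struct id_pool *p, int id)
{
	assert(id >= 0 && id < POOL_SIZE);
	p->used[id] = false;
}
```

In round-robin mode a freed id is not handed out again until the search wraps around, so a just-freed QP number is not reused immediately. In the other mode the lowest free id is reused at once.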
Wei Hu (Xavier)
dd783a212c IB/hns: Modify query info named port_num when querying RC QP
This patch modified the output query info qp_attr->port_num
to fix bug in hip06.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Salil Mehta  <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Wei Hu (Xavier)
6b877c32bc IB/hns: Modify the macro for the timeout when cmd process
This patch modified the macro for the timeout when cmd is
processing as follows:
Before modification:
 enum {
	HNS_ROCE_CMD_TIME_CLASS_A       = 10000,
	HNS_ROCE_CMD_TIME_CLASS_B       = 10000,
	HNS_ROCE_CMD_TIME_CLASS_C       = 10000,
 };
After modification:
 #define HNS_ROCE_CMD_TIMEOUT_MSECS	10000

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Salil Mehta  <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Lijun Ou
1dec243ac0 IB/hns: Fix the bug for qp state in hns_roce_v1_m_qp()
In the old code, the QP state value from the QPC was assigned to
attr->qp_state. That value may be wrong when attr_mask &
IB_QP_STATE is zero.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Salil Mehta  <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Lijun Ou
80596c6717 IB/hns: Modify the condition of notifying hardware loopback
This patch modified the condition of notifying hardware loopback.

In hip06, the RoCE Engine has several ports, and each QP is related
to one port. The hardware only supports loopback within the same port,
not across different ports.

So, if a QP is related to port N and either the dmac in the QP context
equals the smac of the local port N or the loop_idc is 1, we should
set the loopback bit in the QP context to notify the hardware.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Salil Mehta  <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
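
The loopback condition from 80596c6717 can be written as a small predicate. This is only a sketch: qp_needs_loopback and MAC_LEN are hypothetical names, and the caller is assumed to pass the smac of the port the QP is related to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* True when the loopback bit should be set in the QP context:
 * either loop_idc is set, or the destination MAC equals the source
 * MAC of the port this QP is related to. */
static bool qp_needs_loopback(const uint8_t *dmac, const uint8_t *port_smac,
			      bool loop_idc)
{
	assert(dmac != NULL && port_smac != NULL);
	return loop_idc || memcmp(dmac, port_smac, MAC_LEN) == 0;
}
```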
Lijun Ou
543bfe6c3c IB/hns: add self loopback for CM
This patch mainly adds self loopback support for CM.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Peter Chen <luck.chen@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Salil Mehta  <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Wei Hu (Xavier)
8d497eb0f3 IB/hns: Optimize the logic of allocating memory using APIs
This patch modifies the logic of allocating memory using APIs in
the hns RoCE driver. We use kcalloc instead of kmalloc_array and
bitmap_zero. When kcalloc fails, we call vzalloc to allocate the
memory.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Ping Zhang <zhangping5@huawei.com>
Signed-off-by: Salil Mehta  <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Wei Hu (Xavier)
8f3e9f3ea0 IB/hns: Add code for refreshing CQ CI using TPTR
This patch added the code for refreshing CQ CI using TPTR in hip06
SoC.

We send a doorbell to the hardware to refresh the CQ CI when the user
succeeds in polling a CQE. But this fails if the doorbell has
been blocked. So the hardware reads a special buffer called TPTR
to get the latest CI value when the CQ is almost full.

This patch supports the special CI buffer as follows:
a) Allocate the memory for TPTR in the hns_roce_tptr_init function and
   free it in the hns_roce_tptr_free function; these two functions are
   called in the probe function and in the remove function.
b) Add the code for computing the offset (every CQ needs 2 bytes) and
   write the DMA address to every CQ context to notify the hardware in
   the function named hns_roce_v1_write_cqc.
c) Add code for mapping the TPTR buffer to user space in the function
   named hns_roce_mmap. The mapping distinguishes TPTR and UAR of user
   mode by vm_pgoff (0: UAR, 1: TPTR, others: invalid) in hip06.
d) Add the code for refreshing the CQ CI using TPTR in the function
   named hns_roce_v1_poll_cq.
e) Add some variable definitions to the related structure.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Dongdong Huang(Donald) <hdd.huang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Salil Mehta  <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
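
Steps b) and c) of 8f3e9f3ea0 can be sketched as follows. The names are hypothetical, and indexing the TPTR buffer by CQ number is an assumption of this sketch; the commit only states that every CQ needs 2 bytes.

```c
#include <assert.h>
#include <stdint.h>

#define TPTR_ENTRY_SIZE 2u /* every CQ needs 2 bytes */

/* DMA address of one CQ's slot in the TPTR buffer
 * (assumed: slots are laid out in CQ-number order). */
static uint64_t tptr_slot_addr(uint64_t tptr_base, uint32_t cqn)
{
	return tptr_base + (uint64_t)cqn * TPTR_ENTRY_SIZE;
}

enum mmap_target { MMAP_UAR, MMAP_TPTR, MMAP_INVALID };

/* Page offset 0 selects the UAR, 1 selects TPTR, anything else is invalid. */
static enum mmap_target classify_pgoff(unsigned long pgoff)
{
	switch (pgoff) {
	case 0:
		return MMAP_UAR;
	case 1:
		return MMAP_TPTR;
	default:
		return MMAP_INVALID;
	}
}
```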
Lijun Ou
9eefa953f4 IB/hns: Add the interface for querying QP1
The old code only provided an interface for querying non-special
QPs. This patch adds an interface for querying QP1.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Salil Mehta  <salil.mehta@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 14:20:42 -05:00
Leon Romanovsky
f73a1dbc45 infiniband: remove WARN that is not kernel bug
On Mon, Nov 21, 2016 at 09:52:53AM -0700, Jason Gunthorpe wrote:
> On Mon, Nov 21, 2016 at 02:14:08PM +0200, Leon Romanovsky wrote:
> > >
> > > In ib_ucm_write function there is a wrong prefix:
> > >
> > > + pr_err_once("ucm_write: process %d (%s) tried to do something hinky\n",
> >
> > I did it intentionally to have the same errors for all flows.
>
> Lets actually use a good message too please?
>
>  pr_err_once("ucm_write: process %d (%s) changed security contexts after opening FD, this is not allowed.\n",
>
> Jason

>From 70f95b2d35aea42e5b97e7d27ab2f4e8effcbe67 Mon Sep 17 00:00:00 2001
From: Leon Romanovsky <leonro@mellanox.com>
Date: Mon, 21 Nov 2016 13:30:59 +0200
Subject: [PATCH rdma-next V2] IB/{core, qib}: Remove WARN that is not kernel bug

WARNINGs mean kernel bugs; in this case, they are placed
to mark programming errors and/or malicious attempts.

BUG/WARNs that are not kernel bugs hinder automated testing efforts.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 13:17:07 -05:00
Leon Romanovsky
740c330ee6 IB/ocrdma: Remove and fix debug prints after allocation failure
The prints after [k|v][m|z|c]alloc() functions are not needed,
because on failure the allocator prints its own error
messages anyway.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 13:12:52 -05:00
Leon Romanovsky
02d93f8e6b IB/usninc: Remove and fix debug prints after allocation failure
This patch removes unneeded prints after allocation failure
and moves one debug print into the appropriate place.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 13:12:52 -05:00
Leon Romanovsky
870b285245 IB/mthca: Remove debug prints after allocation failure
The prints after [k|v][m|z|c]alloc() functions are not needed,
because on failure the allocator prints its own error
messages anyway.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 13:12:52 -05:00
Leon Romanovsky
2e65835a1b IB/nes: Remove debug prints after allocation failure
The prints after [k|v][m|z|c]alloc() functions are not needed,
because on failure the allocator prints its own error
messages anyway.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 13:12:52 -05:00
Leon Romanovsky
c40a83b978 IB/qib: Remove debug prints after allocation failure
The prints after [k|v][m|z|c]alloc() functions are not needed,
because on failure the allocator prints its own error
messages anyway.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 13:12:52 -05:00
Leon Romanovsky
315b41480b IB/i40iw: Remove debug prints after allocation failure
The prints after [k|v][m|z|c]alloc() functions are not needed,
because on failure the allocator prints its own error
messages anyway.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 13:12:52 -05:00
Leon Romanovsky
9a88f96f21 IB/cxgb4: Remove debug prints after allocation failure
The prints after [k|v][m|z|c]alloc() functions are not needed,
because on failure the allocator prints its own error
messages anyway.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 13:12:52 -05:00
Leon Romanovsky
51ad2bae21 IB/cxgb3: Remove debug prints after allocation failure
The prints after [k|v][m|z|c]alloc() functions are not needed,
because on failure the allocator prints its own error
messages anyway.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 13:12:52 -05:00
Leon Romanovsky
5ce9f115bd IB/hfi1: Remove debug prints after allocation failure
The prints after [k|v][m|z|c]alloc() functions are not needed,
because on failure the allocator prints its own error
messages anyway.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 13:12:52 -05:00
Leon Romanovsky
0886d8f0b7 IB/mlx5: Remove debug prints after allocation failure
The prints after [k|v][m|z|c]alloc() functions are not needed,
because on failure the allocator prints its own error
messages anyway.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 13:12:52 -05:00
Leon Romanovsky
15d4626e49 IB/mlx4: Remove debug prints after allocation failure
The prints after [k|v][m|z|c]alloc() functions are not needed,
because on failure the allocator prints its own error
messages anyway.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-03 13:12:52 -05:00
David S. Miller
f9aa9dc7d2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
All conflicts were simple overlapping changes except perhaps
for the Thunder driver.

That driver has a change_mtu method explicitly for sending
a message to the hardware.  If that fails it returns an
error.

Normally a driver doesn't need an ndo_change_mtu method because those
are usually just range changes, which are now handled generically.
But since this extra operation is needed in the Thunder driver, it has
to stay.

However, if the message send fails we have to restore the original
MTU before the change because the entire call chain expects that if
an error is thrown by ndo_change_mtu then the MTU did not change.
Therefore code is added to nicvf_change_mtu to remember the original
MTU, and to restore it upon nicvf_update_hw_max_frs() failure.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-22 13:27:16 -05:00
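
The restore-on-failure pattern described for nicvf_change_mtu in f9aa9dc7d2 can be sketched in plain C. This is not the Thunder driver code: struct fake_netdev, change_mtu and the two hardware stubs are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

struct fake_netdev {
	int mtu;
};

typedef int (*hw_update_fn)(int new_mtu);

/* Stubs standing in for the message sent to the hardware. */
static int hw_update_ok(int new_mtu)   { (void)new_mtu; return 0; }
static int hw_update_fail(int new_mtu) { (void)new_mtu; return -1; }

/* Apply the new MTU, but put the original back if the hardware update
 * fails, so callers can rely on "error means the MTU did not change". */
static int change_mtu(struct fake_netdev *dev, int new_mtu, hw_update_fn update_hw)
{
	int orig_mtu;

	assert(dev != NULL && update_hw != NULL);
	orig_mtu = dev->mtu;
	dev->mtu = new_mtu;
	if (update_hw(new_mtu) != 0) {
		dev->mtu = orig_mtu;
		return -1;
	}
	return 0;
}
```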
Hariprasad Shenai
ab677ff4ad cxgb4: Allocate Tx queues dynamically
Allocate resources dynamically for upper layer drivers (ULDs) like
cxgbit, iw_cxgb4, cxgb4i and chcr. The resources allocated include Tx
queues, which are allocated when a ULD registers with the cxgb4 driver
and freed when it unregisters. The Tx queues that are shared by ULDs are
allocated by the first registering driver and freed by the last
unregistering driver.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 14:04:29 -05:00
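
The "first registers allocates, last unregisters frees" rule in ab677ff4ad is essentially reference counting. The sketch below shows the idea with hypothetical names; a flag stands in for the real Tx queue allocation, and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct shared_txq {
	unsigned int users; /* number of registered ULDs sharing the queues */
	bool allocated;     /* stands in for the actual Tx queue resources */
};

/* The first ULD to register allocates the shared queues. */
static void uld_register(struct shared_txq *q)
{
	assert(q != NULL);
	if (q->users++ == 0)
		q->allocated = true;
}

/* The last ULD to unregister frees them. */
static void uld_unregister(struct shared_txq *q)
{
	assert(q != NULL && q->users > 0);
	if (--q->users == 0)
		q->allocated = false;
}
```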
Linus Torvalds
57400d3052 First round of -rc fixes
- Misc Intel hfi1 fixes
 - Misc Mellanox mlx4, mlx5, and rxe fixes
 - A couple cxgb4 fixes

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford.
 "First round of -rc fixes.

  Due to various issues, I've been away and couldn't send a pull request
  for about three weeks. There were a number of -rc patches that built
  up in the meantime (some were there already from the early -rc
  stages). Obviously, there were way too many to send now, so I tried to
  pare the list down to the more important patches for the -rc cycle.

  Most of the code has had plenty of soak time at the various vendors'
  testing setups, so I doubt there will be another -rc pull request this
  cycle. I also tried to limit the patches to those with smaller
  footprints, so even though a shortlog is longer than I would like, the
  actual diffstat is mostly very small with the exception of just three
  files that had more changes, and a couple files with pure removals.

  Summary:
   - Misc Intel hfi1 fixes
   - Misc Mellanox mlx4, mlx5, and rxe fixes
   - A couple cxgb4 fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (34 commits)
  iw_cxgb4: invalidate the mr when posting a read_w_inv wr
  iw_cxgb4: set *bad_wr for post_send/post_recv errors
  IB/rxe: Update qp state for user query
  IB/rxe: Clear queue buffer when modifying QP to reset
  IB/rxe: Fix handling of erroneous WR
  IB/rxe: Fix kernel panic in UDP tunnel with GRO and RX checksum
  IB/mlx4: Fix create CQ error flow
  IB/mlx4: Check gid_index return value
  IB/mlx5: Fix NULL pointer dereference on debug print
  IB/mlx5: Fix fatal error dispatching
  IB/mlx5: Resolve soft lock on massive reg MRs
  IB/mlx5: Use cache line size to select CQE stride
  IB/mlx5: Validate requested RQT size
  IB/mlx5: Fix memory leak in query device
  IB/core: Avoid unsigned int overflow in sg_alloc_table
  IB/core: Add missing check for addr_resolve callback return value
  IB/core: Set routable RoCE gid type for ipv4/ipv6 networks
  IB/cm: Mark stale CM id's whenever the mad agent was unregistered
  IB/uverbs: Fix leak of XRC target QPs
  IB/hfi1: Remove incorrect IS_ERR check
  ...
2016-11-17 13:53:02 -08:00
Steve Wise
5c6b2aaf93 iw_cxgb4: invalidate the mr when posting a read_w_inv wr
Also, rearrange things a bit to have a common c4iw_invalidate_mr()
function used everywhere that we need to invalidate.

Fixes: 49b53a93a6 ("iw_cxgb4: add fast-path for small REG_MR operations")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:10:36 -05:00
Steve Wise
4ff522ea47 iw_cxgb4: set *bad_wr for post_send/post_recv errors
There are a few cases in c4iw_post_send() and c4iw_post_receive()
where *bad_wr is not set when an error is returned.  This can
cause a crash if the application tries to use bad_wr.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:10:36 -05:00
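
The contract that 4ff522ea47 restores is that every error return also reports the failing work request through *bad_wr. A minimal sketch, with a hypothetical struct fake_wr and an invented validity check (too many SGEs) standing in for the real error cases:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SGE 4 /* hypothetical limit for this sketch */

struct fake_wr {
	struct fake_wr *next;
	int num_sge;
};

/* Walk the chain; on the first invalid request, report it via *bad_wr
 * before returning the error, so the caller never reads a stale pointer. */
static int post_send(struct fake_wr *wr, struct fake_wr **bad_wr)
{
	assert(bad_wr != NULL);
	for (; wr != NULL; wr = wr->next) {
		if (wr->num_sge < 0 || wr->num_sge > MAX_SGE) {
			*bad_wr = wr;
			return -1;
		}
		/* a real driver would build and queue the WQE here */
	}
	return 0;
}
```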
Doug Ledford
6fa1f2f0aa Merge branches 'hfi1' and 'mlx' into k.o/for-4.9-rc 2016-11-16 20:05:10 -05:00
Saeed Mahameed
6fa2620820 IB/mlx4: Fix port query for 56Gb Ethernet links
Report the correct speed in the port attributes when using a 56Gbps
ethernet link.  Without this change the field is incorrectly set to 10.

Fixes: a9c766bb75 ('IB/mlx4: Fix info returned when querying IBoE ports')
Fixes: 2e96691c31 ('IB: Use central enum for speed instead of hard-coded values')
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:04:48 -05:00
Maor Gottlieb
731e0415b4 IB/mlx4: Put non zero value in max_ah device attribute
Use INT_MAX since this is the max value the attribute can hold, though
hardware capability is unlimited.

Fixes: 225c7b1fee ('IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:04:48 -05:00
Jack Morgenstein
befcabcd53 IB/mlx4: Handle well-known-gid in mad_demux processing
If OpenSM runs over a ConnectX-3, and there are ConnectX-4 or Connect-IB
VFs active on the network, the OpenSM will receive QP1 packets containing
a GRH where the destination GID is the "Well-Known GID" -- which is not a
GID in the HCA Port's GID Table.

This GID must be tested-for separately -- and packets which contain
this destination GID should be routed to slave 0 (the PF).

Fixes: 37bfc7c1e8 ('IB/mlx4: SR-IOV multiplex and demultiplex MADs')
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:04:48 -05:00
Moni Shoua
850d8fd765 IB/mlx4: Handle IPv4 header when demultiplexing MAD
When a MAD arrives at the hypervisor, we need to identify, by destination
GID, which slave it should be sent to. When the L3 protocol is IPv4, the
GRH is replaced by an IPv4 header. This patch detects when an IPv4 header
needs to be parsed instead of a GRH.

Fixes: b6ffaeffae ('mlx4: In RoCE allow guests to have multiple GIDS')
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:04:48 -05:00
Maor Gottlieb
af4295c117 IB/mlx4: Set traffic class in AH
Set the traffic class within sl_tclass_flowlabel when creating an IBoE AH.
Without this, the TOS value will be empty when running VLAN tagged
traffic, because the TOS value is taken from the traffic class in the
address handle attributes.

Fixes: 9106c41069 ('IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:04:48 -05:00
Majd Dibbiny
762f899ae7 IB/mlx5: Limit mkey page size to 2GB
The maximum page size in the mkey context is 2GB.

Until now, we didn't enforce this requirement in the code,
so if we got a page size larger than 2GB, we
passed zeros in log_page_shift instead of the actual value
and the registration failed.

This patch limits the driver to use compound pages of 2GB for mkeys.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:04:48 -05:00
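
The limit from 762f899ae7 amounts to clamping the page shift at 31, since 2GB is 2^31 bytes. The helper below is a hypothetical sketch of that clamp, not the mlx5 code.

```c
#include <assert.h>

/* 2^31 bytes = 2GB, the maximum page size in the mkey context. */
#define MAX_MKEY_PAGE_SHIFT 31u

/* Clamp a requested log2 page size to the mkey limit. */
static unsigned int mkey_page_shift(unsigned int requested_shift)
{
	assert(requested_shift < 64);
	return requested_shift > MAX_MKEY_PAGE_SHIFT ? MAX_MKEY_PAGE_SHIFT
						     : requested_shift;
}
```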
Eli Cohen
288c01b746 IB/mlx5: Fix reported max SGE calculation
Add the 512 bytes limit of RDMA READ and the size of remote
address to the max SGE calculation.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:04:48 -05:00
Eli Cohen
acbda52388 IB/mlx5: Wait for all async command completions to complete
Wait before continuing unload till all pending mkey async creation requests
are done.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:04:48 -05:00
Maor Gottlieb
86695a6582 IB/mlx5: Put non zero value in max_ah
We put INT_MAX since this is the max value that can be held.
Though there is no hardware limitation, this is practically
a large enough number so we can use it.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:04:48 -05:00
Maor Gottlieb
578e72647b IB/mlx5: Fix atomic cap in indirect UMR
Remove from the driver the limitation imposed by firmware check
to not allow change of atomic permissions for indirect UMRs.
In order to avoid failures on old firmware, we only ask for change
of atomic permissions if atomic operations are supported.

Fixes: 968e78dd96 ('IB/mlx5: Enhance UMR support to allow partial page table update')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:04:48 -05:00
Max Gurtovoy
2d2215888d IB/mlx5: Replace numerical constant with predefined MACRO
Use the predefined macro signifying inline UMR instead
of the numerical constant.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:04:48 -05:00
Matan Barak
593ff73bcf IB/mlx4: Fix create CQ error flow
Currently, if ib_copy_to_udata fails, the CQ
won't be deleted from the radix tree and the HW (HW2SW).

Fixes: 225c7b1fee ('IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:03:44 -05:00
Daniel Jurgens
37995116fe IB/mlx4: Check gid_index return value
Check the returned GID index value and return an error if it is invalid.

Fixes: 5070cd2239 ('IB/mlx4: Replace mechanism for RoCE GID management')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:03:44 -05:00
Eli Cohen
a1ab8402d1 IB/mlx5: Fix NULL pointer dereference on debug print
For XRC QPs, CQs may not exist. Check before attempting to dereference them.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:03:44 -05:00
Eli Cohen
dbaaff2a2c IB/mlx5: Fix fatal error dispatching
When an internal error condition is detected, make sure to set the
device inactive after dispatching the event so ULPs can get a
notification of this event.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:03:44 -05:00
Moshe Lazer
6bc1a656ab IB/mlx5: Resolve soft lock on massive reg MRs
When reg_mr is called for large MRs (e.g. 4GB) from multiple processes
and the MR caches can't supply the required number of MRs, the slow path
of MR allocation may be used. In this case we need to serialize the
slow path between the processes to avoid a soft lock.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:03:44 -05:00
Daniel Jurgens
16b0e0695a IB/mlx5: Use cache line size to select CQE stride
When creating kernel CQs, use a 128B CQE stride if the
cache line size is 128B, and 64B otherwise. This prevents
multiple CQEs from residing in a 128B cache line,
which can cause retries when there are concurrent
reads and writes in one cache line.

Tested with IPoIB on PPC64, saw ~5% throughput
improvement.
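
A minimal user-space sketch of the selection rule described above. The
function name and the choice to pass the cache line size as a parameter
are illustrative assumptions, not the driver's code.

```c
#include <stddef.h>

/* Hypothetical helper: pick the CQE stride in bytes from the cache line size. */
static inline size_t sketch_cqe_stride(size_t cache_line_bytes)
{
	/* Use a 128B stride only on 128B cache lines, so each CQE owns a line. */
	return cache_line_bytes == 128 ? 128 : 64;
}
```

Any cache line size other than 128B falls back to the 64B stride, which
matches the "64B otherwise" wording above.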

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:03:44 -05:00
Maor Gottlieb
efd7f40082 IB/mlx5: Validate requested RQT size
Validate that the requested size of RQT is supported by firmware.

Fixes: c5f9092936 ('IB/mlx5: Add Receive Work Queue Indirection table operations')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:03:44 -05:00
Majd Dibbiny
90be7c8ab7 IB/mlx5: Fix memory leak in query device
We need to free dev->port when we fail to enable RoCE or
initialize node data.

Fixes: 0837e86a7a ('IB/mlx5: Add per port counters')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:03:44 -05:00
Sebastian Sanchez
8af8d2970e IB/hfi1: Optimize pio_buf and send_context structs
Both pio_buf and send_context structs have oversized
fields and have cachelines that can be optimized.

Reduce oversized fields for both structs.
Make sure pio_buf struct fits within a cacheline.
Move read-only fields to their own cacheline in
send_context struct.

All of this will avoid cacheline trading as the ring
progresses and pio buffers/send contexts are used.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:37:27 -05:00
Sebastian Sanchez
2474d775d9 IB/hfi1: Get rid of divide in pio buffer allocator
The div instruction shows up as costly in profiles.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:37:27 -05:00
Easwar Hariharan
fe4d924396 IB/hfi1: Add active channel and backplane support for integrated devices
Use scratch registers within the HFI1 device to recover signal
integrity information that is then used to tune the channel. While
there, update error messages to better convey the result of falling
back to a backup file.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:37:27 -05:00
Sebastian Sanchez
6e768f0682 IB/hfi1: Optimize devdata cachelines
Profiling shows hot path struct members that need
to be in a minimum set of cachelines.

Group these struct members in the same cacheline:
	sc2vl_lock
	sc2vl
	rhf_rcv_function_map
	rcv_limit
	rhf_offset

Group these struct members in the same cacheline:
	process_pio_send
	process_dma_send
	pport
	rcd
	int_counter
	flags
	num_pports
	first_user_ctxt

Fill holes in struct hfi1_devdata revealed by pahole.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:37:27 -05:00
Jakub Pawlak
a6cd5f08e0 IB/hfi1: Unify access to GUID entries
This patch consolidates the node GUIDs and the port GUID handling
and unifies access to these items. The knowledge of hfi1 GUIDs'
design and their location are kept in accessors to centralize access.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:25:59 -05:00
Mike Marciniszyn
99c7abfb62 IB/hfi1: Optimize pio cachelines
Move buffers_allocated pcpu pointer to allocator line.

Move hw_free pointer to releaser line.

Fill other holes revealed by pahole.

Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:25:59 -05:00
Mike Marciniszyn
63df8e09e1 IB/hfi1: Inline sdma_txclean() for verbs pio
Short circuit sdma_txclean() by adding an __sdma_txclean()
that is only called when the tx has sdma mappings.

Convert internal calls to __sdma_txclean().

This removes a call from the critical path.
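
The short circuit can be pictured roughly as follows. This is a sketch
under stated assumptions: the struct, its fields and the sketch_* names
are hypothetical stand-ins, not the hfi1 definitions.

```c
/* Hypothetical stand-in for a tx request; only what the sketch needs. */
struct sketch_txreq {
	int num_desc;   /* number of DMA mappings still held */
	int slow_calls; /* counts out-of-line cleanups, for illustration */
};

/* Out-of-line slow path: release the mappings. */
void sketch_txclean_slow(struct sketch_txreq *tx)
{
	tx->num_desc = 0;
	tx->slow_calls++;
}

/* Inline fast path: skip the function call when nothing is mapped. */
static inline void sketch_txclean(struct sketch_txreq *tx)
{
	if (tx->num_desc)
		sketch_txclean_slow(tx);
}
```

A request with no mappings then costs only the inline test, with no call.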

Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:25:59 -05:00
Mike Marciniszyn
4e045572e2 IB/hfi1: Add unique txwait_lock for txreq events
Profiling suggests that the read_seqbegin() in
the txreq put logic is colliding with other uses
of the iowait lock.

The packet-at-a-time use of this lock dictates a unique
lock to avoid reader/writer collisions when the number
of vTxWait events is low.

In order to support a unique lock the iowait struct embedded
in the QP is extended to remember the lock that protects the queue
head.

The QP destroy removes that QP from any wait list.  It doesn't
need to know the head because of the linked list API, but it does
need to know the lock required to protect the head.

This also opens up the wait logic to having unique per-resource locks,
which is left for future refinement.

Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:25:59 -05:00
Dennis Dalessandro
2b16056f84 IB/hfi1: Remove incorrect IS_ERR check
Remove IS_ERR check from caching code as the function being called does
not actually return error pointers.
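
To illustrate why such a check is incorrect, here is a user-space sketch.
sketch_is_err() is an assumed analogue of the kernel's IS_ERR() (true
only for the top 4095 addresses), and sketch_lookup() is a hypothetical
function that reports a miss with NULL.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Analogue of IS_ERR(): true only for pointers encoding -1..-4095. */
static inline int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Hypothetical lookup: returns NULL on a miss, never an error pointer. */
static inline int *sketch_lookup(int *table, size_t len, int key)
{
	for (size_t i = 0; i < len; i++)
		if (table[i] == key)
			return &table[i];
	return NULL;
}
```

Because NULL is not in the error-pointer range, an IS_ERR-style test on
such a return value can never fire; a plain NULL check is what applies.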

Fixes: f19bd643db: "IB/hfi1: Prevent NULL pointer deferences in caching code"
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:18:57 -05:00
Jianxin Xiong
09a7908b1b IB/hfi1: Prevent hardware counter names from being cut off
Increase the size of the buffer that is used to construct per-VL
and per-SDMA counter names.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:18:57 -05:00
Dasaratharaman Chandramouli
f2d8a0b367 IB/hfi1: Fix ECN processing in prescan_rxq
When processing ECN via the prescan_rxq path, some fields in the packet
structure are passed uninitialized. This can potentially
cause NULL pointer exceptions during ECN handling.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:16:46 -05:00
Jakub Pawlak
505efe3e46 IB/hfi1: Fix status error code for unsupported packets
Set the status code BAD_L2 when an unsupported type of packet
is received and dropped.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:16:45 -05:00
Krzysztof Blaszkowski
11501ab9df IB/hfi1: Relocate rcvhdrcnt module parameter check.
Validate the rcvhdrcnt module parameter in a single function at module
load time. This allows proper error reporting.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Krzysztof Blaszkowski <krzysztof.blaszkowski@intel.com>
Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:16:45 -05:00
Ira Weiny
458ed666fe IB/hfi1: Fix rnr_timer addition
The new s_rnr_timeout was not properly being set and the code was
incorrectly setting a different timer.

Found by code inspection.

Cc: <stable@vger.kernel.org> # 4.7.x
Fixes: 08279d5c94 ("staging/rdma/hfi1: use new RNR timer")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:16:44 -05:00
Easwar Hariharan
f0f98f74c9 IB/hfi1: Delete unused lock
The lock is an unused vestige from qib. Remove it.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:16:44 -05:00
Easwar Hariharan
26ea2544dd IB/hfi1: Clean up unused argument
hfi1_pcie_ddinit takes the PCI device id as an argument but never
uses it. Clean it up.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:16:43 -05:00
Dennis Dalessandro
eacc830f95 IB/hfi1: Remove leftover snoop references
A few snoop related variables were missed in the snoop/capture removal
to get out of staging. Go back and clean those up too.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:16:43 -05:00
Jianxin Xiong
4dfe7cceb2 IB/hfi1: Fix a potential memory leak in hfi1_create_ctxts()
In the function hfi1_create_ctxts the array "dd->rcd" is allocated and
then populated with allocated resources in a loop. Previously, if an
error happened during the loop, only the resource allocated in the current
iteration would be freed. The array itself would then be freed, leaving
the resources that were allocated in previous iterations and referenced
by the array elements in limbo.

This patch makes sure all allocated resources are freed before freeing
the array "dd->rcd". Also, the resource allocation now takes into account
the NUMA node the device is attached to.
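
The unwinding pattern described above can be sketched in user space as
follows. All sketch_* names are hypothetical, the failure at index 2 is
simulated, and NUMA placement is not modelled.

```c
#include <stdlib.h>

static int sketch_live; /* live per-context allocations, for illustration */

/* Hypothetical per-context allocator; simulates a failure at index 2. */
static void *sketch_alloc_one(size_t idx)
{
	void *p;

	if (idx == 2)
		return NULL;
	p = malloc(16);
	if (p)
		sketch_live++;
	return p;
}

static void sketch_free_one(void *p)
{
	free(p);
	sketch_live--;
}

/* Allocate n contexts; on failure free all earlier ones, then the array. */
void **sketch_create_ctxts(size_t n)
{
	void **rcd = calloc(n, sizeof(*rcd));
	size_t i;

	if (!rcd)
		return NULL;
	for (i = 0; i < n; i++) {
		rcd[i] = sketch_alloc_one(i);
		if (!rcd[i])
			goto unwind;
	}
	return rcd;

unwind:
	/* Entries 0..i-1 were allocated in previous iterations. */
	while (i--)
		sketch_free_one(rcd[i]);
	free(rcd);
	return NULL;
}
```

Freeing only the array at the unwind label would leave the earlier
entries unreachable, which is the leak the message describes.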

Reviewed-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:16:42 -05:00
Krzysztof Blaszkowski
83fb4af680 IB/hfi1: Return ENODEV for unsupported PCI device ids.
Clean up device type checking.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Krzysztof Blaszkowski <krzysztof.blaszkowski@intel.com>
Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:16:42 -05:00
Tadeusz Struk
acd7c8fe14 IB/hfi1: Fix an Oops on pci device force remove
This patch fixes an Oops on device unbind, when the device is used
by a PSM user process. PSM processes access device resources which
are freed on device removal. Similar protection exists in uverbs
in ib_core for Verbs clients, but PSM doesn't use ib_uverbs hence
a separate protection is required for PSM clients.

Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:16:41 -05:00
Jakub Pawlak
d9ac4555fb IB/hfi1: Fix integrity check flags default values
Prevent setting up integrity check flags when module is loaded
with NO_INTEGRITY capability.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:16:41 -05:00
Tadeusz Struk
39eb2795f1 IB/hfi1: Remove redundant sysfs irq affinity entry
The IRQ affinity entry is not needed after the irq notifier patch has been
added to the hfi1 driver.
The irq affinity settings for SDMA engine should be set using the standard
/proc/irq/<N>/ interface.

Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-15 16:16:40 -05:00
Hadar Hen Zion
66958ed906 net/mlx5: Support encap id when setting new steering entry
In order to support steering rules which add encapsulation headers,
encap_id parameter is needed.

Add a new mlx5_flow_act struct which holds the action-related parameters:
action, flow_tag and encap_id. Use the mlx5_flow_act struct when adding a new
steering rule.
This patch doesn't change any functionality.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 13:41:56 -05:00
Hadar Hen Zion
c9f1b073d0 net/mlx5: Add creation flags when adding new flow table
When creating flow tables, allow the caller to specify creation flags.
Currently no flags are used and as such this patch doesn't add any new
functionality.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09 13:41:56 -05:00