The udata->inlen error path needs to clean up the ref
added by rxe_alloc().
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Peek at the CQ after arming it so that we can return a hint.
This avoids missed completions due to a race between posting
CQEs and arming the CQ.
For example, CM teardown waits on MAD requests to complete with
ib_cq_poll_work(). Without this fix, the last completion might be
left on the CQ, hanging the kthread doing the teardown.
The console backtraces look like this:
[ 4199.911284] Call Trace:
[ 4199.911401] [<ffffffff9657fe95>] schedule+0x35/0x80
[ 4199.911556] [<ffffffff965830df>] schedule_timeout+0x22f/0x2c0
[ 4199.911727] [<ffffffff9657f7a8>] ? __schedule+0x368/0xa20
[ 4199.911891] [<ffffffff96580903>] wait_for_completion+0xb3/0x130
[ 4199.912067] [<ffffffff960a17e0>] ? wake_up_q+0x70/0x70
[ 4199.912243] [<ffffffffc074a06d>] cm_destroy_id+0x13d/0x450 [ib_cm]
[ 4199.912422] [<ffffffff961615d5>] ? printk+0x57/0x73
[ 4199.912578] [<ffffffffc074a390>] ib_destroy_cm_id+0x10/0x20 [ib_cm]
[ 4199.912759] [<ffffffffc076098c>] rdma_destroy_id+0xac/0x340 [rdma_cm]
[ 4199.912941] [<ffffffffc076f2cc>] 0xffffffffc076f2cc
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The last_psn algorithm fails in the zero-byte case: it calculates
first_psn = N, last_psn = N-1. This makes the operation unretryable since
the res structure will fail the (first_psn <= psn <= last_psn) test in
find_resource().
While here, use BTH_PSN_MASK to mask the calculated last_psn.
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
skb_out is decremented in rxe_skb_tx_dtor(), which is not called in the
loopback() path. Move the increment to the send() path rather than
rxe_xmit_packet().
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Acked-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
A client might post a read followed by a send. The partner receives
and acknowledges both transactions, posting an RCQ entry for the
send, but something goes wrong with the read ACK. When the client
retries the read, the partner's responder processes the duplicate
read but incorrectly resets the PSN to the value preceding the
original send. When the duplicate send arrives, the responder cannot
tell that it is a duplicate, so the responder generates a duplicate
RCQ entry, confusing the client.
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
A simple userspace application might poll the CQ, find a completion,
and then attempt to post a new WQE to the SQ. A spurious error can
occur if the userspace application detects a full SQ in the instant
before the kernel is able to advance the SQ consumer pointer.
This is noticeable when using single-entry SQs with ibv_rc_pingpong
if lots of kernel and userspace library debugging is enabled.
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Avoid smashing the stack when an ICRC error occurs on an IPv6 network.
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
It makes me nervous when we cast pointer parameters. I would estimate
that around 50% of the time, it indicates a bug. Here the cast is not
needed becaue u32 and and unsigned int are the same thing. Removing the
cast makes the code more robust and future proof in case any of the
types change.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
spinlock can be initialized automatically with DEFINE_SPINLOCK()
rather than explicitly calling spin_lock_init().
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Leon Romanosky <leonro@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
A race condition fix added an rxe_qp structure to the stack in order
to be able to perform rollback in rxe_requester(), but the structure
is large enough to trigger the warning for possible stack overflow:
drivers/infiniband/sw/rxe/rxe_req.c: In function 'rxe_requester':
drivers/infiniband/sw/rxe/rxe_req.c:757:1: error: the frame size of 2064 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
This changes the rollback function to only save the psn inside
the qp, which is the only field we access in the rollback_qp
anyway.
Fixes: 3050b99850 ("IB/rxe: Fix race condition between requester and completer")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Currently when calling mlx5_add_flow_rule we accept
only one flow destination, this commit allows to pass
multiple destinations.
This change forces us to change the return structure to a more
flexible one. We introduce a flow handle (struct mlx5_flow_handle),
it holds internally the number for rules created and holds an array
where each cell points the to a flow rule.
From the consumers (of mlx5_add_flow_rule) point of view this
change is only cosmetic and requires only to change the type
of the returned value they store.
From the core point of view, we now need to use a loop when
allocating and deleting rules (e.g given to us a flow handler).
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Merge the gup_flags cleanups from Lorenzo Stoakes:
"This patch series adjusts functions in the get_user_pages* family such
that desired FOLL_* flags are passed as an argument rather than
implied by flags.
The purpose of this change is to make the use of FOLL_FORCE explicit
so it is easier to grep for and clearer to callers that this flag is
being used. The use of FOLL_FORCE is an issue as it overrides missing
VM_READ/VM_WRITE flags for the VMA whose pages we are reading
from/writing to, which can result in surprising behaviour.
The patch series came out of the discussion around commit 38e0885465
("mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing"),
which addressed a BUG_ON() being triggered when a page was faulted in
with PROT_NONE set but having been overridden by FOLL_FORCE.
do_numa_page() was run on the assumption the page _must_ be one marked
for NUMA node migration as an actual PROT_NONE page would have been
dealt with prior to this code path, however FOLL_FORCE introduced a
situation where this assumption did not hold.
See
https://marc.info/?l=linux-mm&m=147585445805166
for the patch proposal"
Additionally, there's a fix for an ancient bug related to FOLL_FORCE and
FOLL_WRITE by me.
[ This branch was rebased recently to add a few more acked-by's and
reviewed-by's ]
* gup_flag-cleanups:
mm: replace access_process_vm() write parameter with gup_flags
mm: replace access_remote_vm() write parameter with gup_flags
mm: replace __access_remote_vm() write parameter with gup_flags
mm: replace get_user_pages_remote() write/force parameters with gup_flags
mm: replace get_user_pages() write/force parameters with gup_flags
mm: replace get_vaddr_frames() write/force parameters with gup_flags
mm: replace get_user_pages_locked() write/force parameters with gup_flags
mm: replace get_user_pages_unlocked() write/force parameters with gup_flags
mm: remove write/force parameters from __get_user_pages_unlocked()
mm: remove write/force parameters from __get_user_pages_locked()
mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
This removes the 'write' and 'force' from get_user_pages_remote() and
replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This removes the 'write' and 'force' from get_user_pages() and replaces
them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers
as use of this flag can result in surprising behaviour (and hence bugs)
within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYATXdAAoJELgmozMOVy/dn48P/2lBCAR7pJMU5AC4s1VZsYHr
A7ep5qpkmD5qGnnHNjLA2TIK/8lCy80ACt/HbV7588TxyZYpa+wIaQAdIyuUfUyS
HVdMTLMqdfYOdnPHNDiKKhdvw8Ty8gGlHsnxay32+m3WJtCPxsRObrciJO984lIk
DXKBsYuOQST5Df/1eHWSCPVUn5jHW4bKh7jPM1cs7CtFZ2bJHJQrKECm0SoKvj+3
3BNCg2gVRXeGwfX4KoSYf87nMJCCXBlNzBsqyVPjsB5teJjjk9mXV5y6qsHps9Hu
JrMjMPlvRzkUil8ZP5RiPHx29IlZypwudpswqM9cw6mxfsvvORYtYBD3BVC6Vt4A
WPVXGkx/sEO9XgbasuUJEL0ui4I3UR+lLP8MwefMiPteJ/lGdM/vydS9t57hvk9s
JeL/ep0Us70VX0VSEkc62RvYbKPcRk4qonF8liRq7nit3l45vL5YLvbTQeqe7pbI
CN0lBn83K9Z4GGwPqDzbD3pwiZ2wFV4VvrWXqOeyexT/kNi1iJlQcfNHJcUiI9vg
mkzxWvvWY+KieunrJQGWEQPkuD7fpFF77KFkIYSFVfkHBrSjc+n5a3lAY/xT8k6D
rixIl9ZhA8dMjkCzh0xqGHgEoldh4rO1ctpaTDLg3HsNkedctDEpyx4HFMhiXE2w
INAqVa/uOUC0a/uPlcWr
=Oifo
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma qedr RoCE driver from Doug Ledford:
"Early on in the merge window I mentioned I had a backlog of new
drivers waiting to be reviewed and that, in addition to the hns-roce
driver, I wanted to get possible a couple more reviewed. I ended up
only having the time to complete one of the additional drivers.
During Dave Miller's pull request this go around, there were a series
of 9 patches to the QLogic qed net driver that add basic support for a
paired RoCE driver. That support is currently not functional because
it is missing the matching RoCE driver in the RDMA subsystem. I
managed to finish that review. However, because it goes against part
of Dave's net pull, and a part that was accepted a day or two after
the merge window opened, to apply cleanly it has to be applied to
either the tip of Dave's net branch, or as I did in this case, I just
applied it to your master after you had taken Dave's pull request."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
qedr: Add events support and register IB device
qedr: Add GSI support
qedr: Add LL2 RoCE interface
qedr: Add support for data path
qedr: Add support for memory registeration verbs
qedr: Add support for QP verbs
qedr: Add support for PD,PKEY and CQ verbs
qedr: Add support for user context verbs
qedr: Add support for RoCE HW init
qedr: Add RoCE driver framework
- Small patch set for hns net driver that the roce patches depend on
- Various fixes to the hns-roce driver
- Add connection manager support to the hns-roce driver
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYATZdAAoJELgmozMOVy/d1+4P/2UhXiXx7strrr5vYtFAdbdX
9j4jPbmnXgc4hFV1EET7UScdUwYW6iuoYCYa5sJUj6dcux2Ph/pYfPbE4Civld67
xMEISaI86GcEbFy3yqZ0vhDegyReb6wUDguzht1IHKqFwl5uvXBPJhZ0vmY4ZKXd
mVKNLH4FTMbqf4rGO64AmUyN7QIlLE17zO3Nolha6mytRj7RoYHEjP8RbZPTeN5J
58QpZjomO0uz1dvxRWwRBw2eEYgXMxKa3s4W8vYYcGimoKinzbqAHhrWOm0+klHA
Nd3AFqEVDTxYxqZYSBLvhvCT4d9/vgb/Tsf+IB07qVDoM6iv2W2WM17xq9w7vitv
4w7tClX9cvAWX35k3TAhQBkN2QJhaWY9bK9JwTB/AFxQXM2gG1/2f77hi72jdsR4
kcptopV/vZSMqjobfoVe5/ac1qUxv7HM+tAN/+9j7qU3TNvn5+R7d+UBDKrbiP1c
EW5kdffRY3evemdRh/zHfUyuQzr5l/GR4vQ9gLJIBu+ZK3o1d1JNUjKNwwlzOl0r
BbvYvWJ23Na6FTjpNFOTgc3y7K4zSXlGVeHObtqg0ejlWsCU9xu+MMay9tRLy2LI
CQxr81WQbMvcEnfad2yqSUuFAAhut85Q3qYERPGDy78aiF+gNNDZLitwmjU3Q9q8
F7apPH39H41lEzOLfsMr
=PmmI
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull more rdma updates from Doug Ledford:
"This merge window was the first where Huawei had to try and coordinate
their patches between their net driver and their new roce driver
(similar to mlx4 and mlx5).
They didn't do horribly, but there were some issues (and we knew that
because they simply didn't know what to do in the beginning). As a
result, I had a set of patches that depended on some patches that
normally would have come to you via Dave's tree. Those patches have
been on netdev@ for a while, so I got Dave to give me his approval to
send them to you. As such, the other 29 patches I had behind them are
also now ready to go.
This catches the hns and hns-roce drivers up to current, and for
future patches we are working with them to get them up to speed on how
to do joint driver development so that they don't have these sorts of
cross tree dependency issues again. BTW, Dave gave me permission to
add his Acked-by: to the patches against the net tree, but I've had
this branch through 0day (but not linux-next since it was off by
itself) and I didn't want to rebase the series just to add Dave's ack
for the 8 patches in the net area.
Updates to the hns drivers:
- Small patch set for hns net driver that the roce patches depend on
- Various fixes to the hns-roce driver
- Add connection manager support to the hns-roce driver"
* tag 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (36 commits)
IB/hns: Fix for removal of redundant code
IB/hns: Delete the redundant lines in hns_roce_v1_m_qp()
IB/hns: Fix the bug when platform_get_resource() exec fail
IB/hns: Update the rq head when modify qp state
IB/hns: Cq has not been freed
IB/hns: Validate mtu when modified qp
IB/hns: Some items of qpc need to take user param
IB/hns: The Ack timeout need a lower limit value
IB/hns: Return bad wr while post send failed
IB/hns: Fix bug of memory leakage for registering user mr
IB/hns: Modify the init of iboe lock
IB/hns: Optimize code of aeq and ceq interrupt handle and fix the bug of qpn
IB/hns: Delete the sqp_start from the structure hns_roce_caps
IB/hns: Fix bug of clear hem
IB/hns: Remove unused parameter named qp_type
IB/hns: Simplify function of pd alloc and qp alloc
IB/hns: Fix bug of using uninit refcount and free
IB/hns: Remove parameters of resize cq
IB/hns: Remove unused parameters in some functions
IB/hns: Add node_guid definition to the bindings document
...
Adds a skeletal implementation of the qed* RoCE driver -
basically the ability to communicate with the qede driver and
receive notifications from it regarding various init/exit events.
Signed-off-by: Rajesh Borundia <rajesh.borundia@cavium.com>
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
A good practice is to prefix the names of functions by the name
of the subsystem.
The kthread worker API is a mix of classic kthreads and workqueues. Each
worker has a dedicated kthread. It runs a generic function that process
queued works. It is implemented as part of the kthread subsystem.
This patch renames the existing kthread worker API to use
the corresponding name from the workqueues API prefixed by
kthread_:
__init_kthread_worker() -> __kthread_init_worker()
init_kthread_worker() -> kthread_init_worker()
init_kthread_work() -> kthread_init_work()
insert_kthread_work() -> kthread_insert_work()
queue_kthread_work() -> kthread_queue_work()
flush_kthread_work() -> kthread_flush_work()
flush_kthread_worker() -> kthread_flush_worker()
Note that the names of DEFINE_KTHREAD_WORK*() macros stay
as they are. It is common that the "DEFINE_" prefix has
precedence over the subsystem names.
Note that INIT() macros and init() functions use different
naming scheme. There is no good solution. There are several
reasons for this solution:
+ "init" in the function names stands for the verb "initialize"
aka "initialize worker". While "INIT" in the macro names
stands for the noun "INITIALIZER" aka "worker initializer".
+ INIT() macros are used only in DEFINE() macros
+ init() functions are used close to the other kthread()
functions. It looks much better if all the functions
use the same scheme.
+ There will be also kthread_destroy_worker() that will
be used close to kthread_cancel_work(). It is related
to the init() function. Again it looks better if all
functions use the same naming scheme.
+ there are several precedents for such init() function
names, e.g. amd_iommu_init_device(), free_area_init_node(),
jump_label_init_type(), regmap_init_mmio_clk(),
+ It is not an argument but it was inconsistent even before.
[arnd@arndb.de: fix linux-next merge conflict]
Link: http://lkml.kernel.org/r/20160908135724.1311726-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/1470754545-17632-3-git-send-email-pmladek@suse.com
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull more vfs updates from Al Viro:
">rename2() work from Miklos + current_time() from Deepa"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: Replace current_fs_time() with current_time()
fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
fs: Replace CURRENT_TIME with current_time() for inode timestamps
fs: proc: Delete inode time initializations in proc_alloc_inode()
vfs: Add current_time() api
vfs: add note about i_op->rename changes to porting
fs: rename "rename2" i_op to "rename"
vfs: remove unused i_op->rename
fs: make remaining filesystems use .rename2
libfs: support RENAME_NOREPLACE in simple_rename()
fs: support RENAME_NOREPLACE for local filesystems
ncpfs: fix unused variable warning
- Updates to mlx5
- Updates to mlx4 (two conflicts, both minor and easily resolved)
- Updates to iw_cxgb4 (one conflict, not so obvious to resolve, proper
resolution is to keep the code in cxgb4_main.c as it is in Linus'
tree as attach_uld was refactored and moved into cxgb4_uld.c)
- Improvements to uAPI (moved vendor specific API elements to uAPI area)
- Add hns-roce driver and hns and hns-roce ACPI reset support
- Conversion of all rdma code away from deprecated
create_singlethread_workqueue
- Security improvement: remove unsafe ib_get_dma_mr (breaks lustre in
staging)
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJX+AwSAAoJELgmozMOVy/d0WkQAKxPzVccMWwHv28iZI4ey13u
JwE+VoCNpCAZAVuEgzK5zzFdNHPvAk2jU93H4apA7dfXJBXPatVuj9Lnk+ieEEnW
tbFwJjBpbQ3Zol3+SPfAHnsVMbtax+xmd6WDKExPXXEDl1L6rutwL3KKfmgWEitg
ysX7XOJCiSdyM0hcg4T6UPB9a3jGPff9NLu0oGamV+yoUk5Y0WGoVFxHZ4MKcw8t
OkFBYIxGz4SGwq2tulStuH03HteURX594KngtrA8dyq6l1R2GlGRv+bkJAUEIWUv
aA0ow3VWusOM6fT+jLXPCv8iUwIXM8tR/U6F7X+cmORUUtWvCl+uCUVid113j/aN
BK+Af2nJnfoJ5cDBPsD+bC76l5gQycNZO/Qh8op2kmgJtD+6OpGM3cBXsHx53+kk
0wloJ2lKCGShWxNj+ig8n8rR/rhhs/x3vV3ouCVWNMbOUgOSN3eYHxmK3wGFW4nd
Qx+WYCjj9Yi/J6nmUDcfEQ4NWPR22Q2+0ENAabfhLhV6mDloAO5ILHd4GDqC3IA9
UtxlVjf4ZonaiLnTQQzCnDMGVVk6tT8FJ9D42s0ScwjbdYwjyCW9/rs/g2EhcprR
Cc+AmjqLviCWGtzBSFO0SijqQon8lcQOwdLw61CdFFvPa/mlLdf1rbx9ArIyNVKn
JSrbr3CGyoqyYj6qaEO5
=LC+S
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull main rdma updates from Doug Ledford:
"This is the main pull request for the rdma stack this release. The
code has been through 0day and I had it tagged for linux-next testing
for a couple days.
Summary:
- updates to mlx5
- updates to mlx4 (two conflicts, both minor and easily resolved)
- updates to iw_cxgb4 (one conflict, not so obvious to resolve,
proper resolution is to keep the code in cxgb4_main.c as it is in
Linus' tree as attach_uld was refactored and moved into
cxgb4_uld.c)
- improvements to uAPI (moved vendor specific API elements to uAPI
area)
- add hns-roce driver and hns and hns-roce ACPI reset support
- conversion of all rdma code away from deprecated
create_singlethread_workqueue
- security improvement: remove unsafe ib_get_dma_mr (breaks lustre in
staging)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (75 commits)
staging/lustre: Disable InfiniBand support
iw_cxgb4: add fast-path for small REG_MR operations
cxgb4: advertise support for FR_NSMR_TPTE_WR
IB/core: correctly handle rdma_rw_init_mrs() failure
IB/srp: Fix infinite loop when FMR sg[0].offset != 0
IB/srp: Remove an unused argument
IB/core: Improve ib_map_mr_sg() documentation
IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets
IB/mthca: Move user vendor structures
IB/nes: Move user vendor structures
IB/ocrdma: Move user vendor structures
IB/mlx4: Move user vendor structures
IB/cxgb4: Move user vendor structures
IB/cxgb3: Move user vendor structures
IB/mlx5: Move and decouple user vendor structures
IB/{core,hw}: Add constant for node_desc
ipoib: Make ipoib_warn ratelimited
IB/mlx4/alias_GUID: Remove deprecated create_singlethread_workqueue
IB/ipoib_verbs: Remove deprecated create_singlethread_workqueue
IB/ipoib: Remove deprecated create_singlethread_workqueue
...
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJX+BuzAAoJELgmozMOVy/dUvwP/juvXUJg7Xh3N0PXo0+Qb8Xc
TG3a8EdumYZtDF8jualKiuJboGUdkMN4EDWhfdPxxW3lGBc8EQP7vac2H3NCcE3L
ntRYrEeaDF8x5eUSyIQf2hs9k7HVQ+gkR8KqbeVUFh8iCwNnQ+wjvBXB+WUcqioW
mp+S0//v8t1r1zPyUE7NgBo8EBVWzapiiGrDObNMke2oZxa4CsKsPCUMzSr9USW+
/oSQBqcAZQ6LL8ykvx2oov2jt+bcWKco5fxRb8R7+buZknC4jX98f70WkLUne1+q
oG3fKf1cUmORAjWelNDSxfwz8uWzkZifHqa+5UQHKVR9PcFJIQUKuW/5InMxkU3+
HG+t7YiwCKOX7+99zUC5GHWFilmAOJma567VzfEgzcJjNeRtoCVG0a7smACWKA/y
JWUoLUQd5c9dvg46CDG+qqk79tINYmRdtgvJmht2AhCAczRYI8iMIaw9Fh7yC5Dg
JvUVjLLVtdNjJSP0ixY6RJB4aQVbdVSzo4UUJKXQkaJM6KWRd5HYGcr8rUXn8HEf
PjHNcPEByOJBfH2Fg759nKDPaVAN4r/8nDHZpLfEfnzzEcXWIUk4vDcCYHxN69RF
8EZZ681c4YE/WW49Jr/JulKKRsrZn1xB4FJF5gJmAOvb0WHoJPf82YEr3kcUUOQK
HaECbnzwYqGRQe5tNFiD
=/P6a
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull more rdma updates from Doug Ledford:
"Minor updates for rxe driver"
[ Starting to do merge window pulls again - the current -git tree does
appear to have some netfilter use-after-free issues, but I've sent
off the report to the proper channels, and I don't want to delay merge
window activity any more ]
* tag 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/rxe: improved debug prints & code cleanup
rdma_rxe: Ensure rdma_rxe init occurs at correct time
IB/rxe: Properly honor max IRD value for rd/atomic.
IB/{rxe,core,rdmavt}: Fix kernel crash for reg MR
IB/rxe: Fix sending out loopback packet on netdev interface.
IB/rxe: Avoid scheduling tasklet for userspace QP
When processing a REG_MR work request, if fw supports the
FW_RI_NSMR_TPTE_WR work request, and if the page list for this
registration is <= 2 pages, and the current state of the mr is INVALID,
then use FW_RI_NSMR_TPTE_WR to pass down a fully populated TPTE for FW
to write. This avoids FW having to do an async read of the TPTE blocking
the SQ until the read completes.
To know if the current MR state is INVALID or not, iw_cxgb4 must track the
state of each fastreg MR. The c4iw_mr struct state is updated as REG_MR
and LOCAL_INV WRs are posted and completed, when a reg_mr is destroyed,
and when RECV completions are processed that include a local invalidation.
This optimization increases small IO IOPS for both iSER and NVMF.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Avoid that mapping an sg-list in which the first element has a
non-zero offset triggers an infinite loop when using FMR. This
patch makes the FMR mapping code similar to that of ib_sg_to_pages().
Note: older Mellanox HCAs do not support non-zero offsets for FMR.
See also commit 8c4037b501 ("IB/srp: always avoid non-zero offsets
into an FMR").
Reported-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Document that ib_map_mr_sg() is able to map physically discontiguous
sg-lists as a single MR. Change IB_MR_TYPE_SG_GAPS_REG into
IB_MR_TYPE_SG_GAPS.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@rimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In MLX qp packets, the LRH (built by the driver) has both a VL field
and an SL field. When building a QP1 packet, the VL field should
reflect the SLtoVL mapping and not arbitrarily contain zero (as is
done now). This bug causes credit problems in IB switches at
high rates of QP1 packets.
The fix is to cache the SL to VL mapping in the driver, and look up
the VL mapped to the SL provided in the send request when sending
QP1 packets.
For FW versions which support generating a port_management_config_change
event with subtype sl-to-vl-table-change, the driver uses that event
to update its sl-to-vl mapping cache. Otherwise, the driver snoops
incoming SMP mads to update the cache.
There remains the case where the FW is running in secure-host mode
(so no QP0 packets are delivered to the driver), and the FW does not
generate the sl2vl mapping change event. To support this case, the
driver updates (via querying the FW) its sl2vl mapping cache when
running in secure-host mode when it receives either a Port Up event
or a client-reregister event (where the port is still up, but there
may have been an opensm failover).
OpenSM modifies the sl2vl mapping before Port Up and Client-reregister
events occur, so if there is a mapping change the driver's cache will
be properly updated.
Fixes: 225c7b1fee ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch moves mthca vendor's specific structures to
common UAPI folder which will be visible to all consumers.
These structures are used by user-space library driver
(libmthca) and currently manually copied to that library.
This move will allow cross-compile against these files and
simplify introduction of vendor specific data.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch moves nes vendor's specific structures to
common UAPI folder which will be visible to all consumers.
These structures are used by user-space library driver
(libmlx4) and currently manually copied to that library.
This move will allow cross-compile against these files and
simplify introduction of vendor specific data.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch moves ocrdma vendor's specific structures to
common UAPI folder which will be visible to all consumers.
These structures are used by user-space library driver
(libmlx4) and currently manually copied to that library.
This move will allow cross-compile against these files and
simplify introduction of vendor specific data.
In addition, it changes types to be __uXX instead of uXX.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Acked-By: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch moves mlx4 vendor's specific structures to
common UAPI folder which will be visible to all consumers.
These structures are used by user-space library driver
(libmlx4) and currently manually copied to that library.
This move will allow cross-compile against these files and
simplify introduction of vendor specific data.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch moves cxgb4 vendor's specific structures to
common UAPI folder which will be visible to all consumers.
These structures are used by user-space library driver
(libcxgb4) and currently manually copied to that library.
This move will allow cross-compile against these files and
simplify introduction of vendor specific data.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch moves cxgb3 vendor's specific structures to
common UAPI folder which will be visible to all consumers.
These structures are used by user-space library driver
(libcxgb3) and currently manually copied to that library.
This move will allow cross-compile against these files and
simplify introduction of vendor specific data.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch decouples and moves vendors specific structures to
common UAPI folder which will be visible to all consumers.
These structures are used by user-space library driver
(libmlx5) and currently manually copied to that library.
This move will allow cross-compile against these files and
simplify introduction of vendor specific data.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In certain cases it's possible to be flooded by warning messages. To
cope with such situations make the ipoib_warn macro be ratelimited.
To prevent accidental limiting of legitimate, bursty messages make
the limit fairly liberal by allowing up to 100 messages in 10 seconds.
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set, replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.
The workqueue "wq" queues work item that maps to alias_guid_work.
It has been identity converted.
WQ_MEM_RECLAIM has been set to ensure forward progress under
memory pressure.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set, replaces
deprecated create_singlethread_workqueue(). This is the identity
conversion.
The workqueue "wq" queues mulitple work items viz &priv->restart_task,
&priv->cm.rx_reap_task, &priv->cm.skb_task, &priv->neigh_reap_task,
&priv->ah_reap_task, &priv->mcast_task and &priv->carrier_on_task.
The work items require strict execution ordering.
Hence, an ordered dedicated workqueue has been used.
WQ_MEM_RECLAIM has been set to ensure forward progress under
memory pressure.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>