Commit Graph

11713 Commits

Author SHA1 Message Date
Kamal Heib
571a2dd898 RDMA/cxgb4: Fix the reported max_recv_sge value
[ Upstream commit a372173bf314d374da4dd1155549d8ca7fc44709 ]

The max_recv_sge value is wrongly reported when calling query_qp, This is
happening due to a typo when assigning the max_recv_sge value, the value
of sq_max_sges was assigned instead of rq_max_sges.

Fixes: 3e5c02c9ef ("iw_cxgb4: Support query_qp() verb")
Link: https://lore.kernel.org/r/20210114191423.423529-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-03 23:28:46 +01:00
Parav Pandit
34b0c04c88 Revert "RDMA/mlx5: Fix devlink deadlock on net namespace deletion"
commit de641d74fb00f5b32f054ee154e31fb037e0db88 upstream.

This reverts commit fbdd0049d9.

Due to commit in fixes tag, netdevice events were received only in one net
namespace of mlx5_core_dev. Due to this when netdevice events arrive in
net namespace other than net namespace of mlx5_core_dev, they are missed.

This results in empty GID table due to RDMA device being detached from its
net device.

Hence, revert back to receive netdevice events in all net namespaces to
restore back RDMA functionality in non init_net net namespace. The
deadlock will have to be addressed in another patch.

Fixes: fbdd0049d9 ("RDMA/mlx5: Fix devlink deadlock on net namespace deletion")
Link: https://lore.kernel.org/r/20210117092633.10690-1-leon@kernel.org
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03 23:28:44 +01:00
Bryan Tan
1fa2fa7932 RDMA/vmw_pvrdma: Fix network_hdr_type reported in WC
commit 9f206f7398f6f6ec7dd0198c045c2459b4f720b6 upstream.

The PVRDMA device HW interface defines network_hdr_type according to an
old definition of the internal kernel rdma_network_type enum that has
since changed, resulting in the wrong rdma_network_type being reported.

Fix this by explicitly defining the enum used by the PVRDMA device and
adding a function to convert the pvrdma_network_type to rdma_network_type
enum.

Cc: stable@vger.kernel.org # 5.10+
Fixes: 1c15b4f2a4 ("RDMA/core: Modify enum ib_gid_type and enum rdma_network_type")
Link: https://lore.kernel.org/r/1611026189-17943-1-git-send-email-bryantan@vmware.com
Reviewed-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30 13:55:18 +01:00
Neta Ostrovsky
2cd90971a2 RDMA/cma: Fix error flow in default_roce_mode_store
[ Upstream commit 7c7b3e5d9aeed31d35c5dab0bf9c0fd4c8923206 ]

In default_roce_mode_store(), we took a reference to cma_dev, but didn't
return it with cma_dev_put in the error flow.

Fixes: 1c15b4f2a4 ("RDMA/core: Modify enum ib_gid_type and enum rdma_network_type")
Link: https://lore.kernel.org/r/20210113130214.562108-1-leon@kernel.org
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-27 11:55:08 +01:00
Aharon Landau
56c1362981 RDMA/umem: Avoid undefined behavior of rounddown_pow_of_two()
[ Upstream commit b79f2dc5ffe17b03ec8c55f0d63f65e87bcac676 ]

rounddown_pow_of_two() is undefined when the input is 0. Therefore we need
to avoid it in ib_umem_find_best_pgsz and return 0.  Otherwise, it could
result in not rejecting an invalid page size which eventually causes a
kernel oops due to the logical inconsistency.

Fixes: 3361c29e92 ("RDMA/umem: Use simpler logic for ib_umem_find_best_pgsz()")
Link: https://lore.kernel.org/r/20210113121703.559778-2-leon@kernel.org
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-27 11:55:08 +01:00
Jason Gunthorpe
1f54a26bdb RDMA/ucma: Do not miss ctx destruction steps in some cases
[ Upstream commit 8ae291cc95e49011b736b641b0cfad502b7a1526 ]

The destruction flow is very complicated here because the cm_id can be
destroyed from the event handler at any time if the device is
hot-removed. This leaves behind a partial ctx with no cm_id in the
xarray, and will let user space leak memory.

Make everything consistent in this flow in all places:

 - Return the xarray back to XA_ZERO_ENTRY before beginning any
   destruction. The thread that reaches this first is responsible to
   kfree, everyone else does nothing.

 - Test the xarray during the special hot-removal case to block the
   queue_work, this has much simpler locking and doesn't require a
   'destroying'

 - Fix the ref initialization so that it is only positive if cm_id !=
   NULL, then rely on that to guide the destruction process in all cases.

Now the new ucma_destroy_private_ctx() can be called in all places that
want to free the ctx, including all the error unwinds, and none of the
details are missed.

Fixes: a1d33b70db ("RDMA/ucma: Rework how new connections are passed through event delivery")
Link: https://lore.kernel.org/r/20210105111327.230270-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-27 11:55:05 +01:00
Parav Pandit
5cd483a7e7 IB/mlx5: Fix error unwinding when set_has_smi_cap fails
commit 2cb091f6293df898b47f4e0f2e54324e2bbaf816 upstream.

When set_has_smi_cap() fails, multiport master cleanup is missed. Fix it
by doing the correct error unwinding goto.

Fixes: a989ea01cb ("RDMA/mlx5: Move SMI caps logic")
Link: https://lore.kernel.org/r/20210113121703.559778-3-leon@kernel.org
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19 18:27:32 +01:00
Mark Bloch
bc296e4323 RDMA/mlx5: Fix wrong free of blue flame register on error
commit 1c3aa6bd0b823105c2030af85d92d158e815d669 upstream.

If the allocation of the fast path blue flame register fails, the driver
should free the regular blue flame register allocated a statement above,
not the one that it just failed to allocate.

Fixes: 16c1975f10 ("IB/mlx5: Create profile infrastructure to add and remove stages")
Link: https://lore.kernel.org/r/20210113121703.559778-6-leon@kernel.org
Reported-by: Hans Petter Selasky <hanss@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19 18:27:32 +01:00
Dinghao Liu
3090af5d1f RDMA/usnic: Fix memleak in find_free_vf_and_create_qp_grp
commit a306aba9c8d869b1fdfc8ad9237f1ed718ea55e6 upstream.

If usnic_ib_qp_grp_create() fails at the first call, dev_list
will not be freed on error, which leads to memleak.

Fixes: e3cf00d0a8 ("IB/usnic: Add Cisco VIC low-level hardware driver")
Link: https://lore.kernel.org/r/20201226074248.2893-1-dinghao.liu@zju.edu.cn
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19 18:27:31 +01:00
Leon Romanovsky
35694924a6 RDMA/restrack: Don't treat as an error allocation ID wrapping
commit 3c638cdb8ecc0442552156e0fed8708dd2c7f35b upstream.

xa_alloc_cyclic() call returns positive number if ID allocation
succeeded but wrapped. It is not an error, so normalize the "ret"
variable to zero as marker of not-an-error.

   drivers/infiniband/core/restrack.c:261 rdma_restrack_add()
   warn: 'ret' can be either negative or positive

Fixes: fd47c2f99f ("RDMA/restrack: Convert internal DB from hash to XArray")
Link: https://lore.kernel.org/r/20201216100753.1127638-1-leon@kernel.org
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19 18:27:31 +01:00
Tom Rix
0dbfad171b RDMA/ocrdma: Fix use after free in ocrdma_dealloc_ucontext_pd()
commit f2bc3af6353cb2a33dfa9d270d999d839eef54cb upstream.

In ocrdma_dealloc_ucontext_pd() uctx->cntxt_pd is assigned to the variable
pd and then after uctx->cntxt_pd is freed, the variable pd is passed to
function _ocrdma_dealloc_pd() which dereferences pd directly or through
its call to ocrdma_mbx_dealloc_pd().

Reorder the free using the variable pd.

Cc: stable@vger.kernel.org
Fixes: 21a428a019 ("RDMA: Handle PD allocations by IB/core")
Link: https://lore.kernel.org/r/20201230024653.1516495-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19 18:27:21 +01:00
Weihang Li
85bbe2e64a RDMA/hns: Avoid filling sl in high 3 bits of vlan_id
[ Upstream commit 94a8c4dfcdb2b4fcb3dfafc39c1033a0b4637c86 ]

Only the low 12 bits of vlan_id is valid, and service level has been
filled in Address Vector. So there is no need to fill sl in vlan_id in
Address Vector.

Fixes: 7406c0036f85 ("RDMA/hns: Only record vlan info for HIP08")
Link: https://lore.kernel.org/r/1607650657-35992-5-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-17 14:16:53 +01:00
Jason Gunthorpe
e522a788eb RDMA/siw,rxe: Make emulated devices virtual in the device tree
[ Upstream commit a9d2e9ae953f0ddd0327479c81a085adaa76d903 ]

This moves siw and rxe to be virtual devices in the device tree:

lrwxrwxrwx 1 root root 0 Nov  6 13:55 /sys/class/infiniband/rxe0 -> ../../devices/virtual/infiniband/rxe0/

Previously they were trying to parent themselves to the physical device of
their attached netdev, which doesn't make alot of sense.

My hope is this will solve some weird syzkaller hits related to sysfs as
it could be possible that the parent of a netdev is another netdev, eg
under bonding or some other syzkaller found netdev configuration.

Nesting a ib_device under anything but a physical device is going to cause
inconsistencies in sysfs during destructions.

Link: https://lore.kernel.org/r/0-v1-dcbfc68c4b4a+d6-virtual_dev_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-09 13:46:24 +01:00
Christoph Hellwig
404fa09374 RDMA/core: remove use of dma_virt_ops
[ Upstream commit 5a7a9e038b032137ae9c45d5429f18a2ffdf7d42 ]

Use the ib_dma_* helpers to skip the DMA translation instead.  This
removes the last user if dma_virt_ops and keeps the weird layering
violation inside the RDMA core instead of burderning the DMA mapping
subsystems with it.  This also means the software RDMA drivers now don't
have to mess with DMA parameters that are not relevant to them at all, and
that in the future we can use PCI P2P transfers even for software RDMA, as
there is no first fake layer of DMA mapping that the P2P DMA support.

Link: https://lore.kernel.org/r/20201106181941.1878556-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-09 13:46:24 +01:00
Leon Romanovsky
04ca5e7fa4 RDMA/cma: Don't overwrite sgid_attr after device is released
[ Upstream commit e246b7c035d74abfb3507fa10082d0c42cc016c3 ]

As part of the cma_dev release, that pointer will be set to NULL.  In case
it happens in rdma_bind_addr() (part of an error flow), the next call to
addr_handler() will have a call to cma_acquire_dev_by_src_ip() which will
overwrite sgid_attr without releasing it.

  WARNING: CPU: 2 PID: 108 at drivers/infiniband/core/cma.c:606 cma_bind_sgid_attr drivers/infiniband/core/cma.c:606 [inline]
  WARNING: CPU: 2 PID: 108 at drivers/infiniband/core/cma.c:606 cma_acquire_dev_by_src_ip+0x470/0x4b0 drivers/infiniband/core/cma.c:649
  CPU: 2 PID: 108 Comm: kworker/u8:1 Not tainted 5.10.0-rc6+ #257
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Workqueue: ib_addr process_one_req
  RIP: 0010:cma_bind_sgid_attr drivers/infiniband/core/cma.c:606 [inline]
  RIP: 0010:cma_acquire_dev_by_src_ip+0x470/0x4b0 drivers/infiniband/core/cma.c:649
  Code: 66 d9 4a ff 4d 8b 6e 10 49 8d bd 1c 08 00 00 e8 b6 d6 4a ff 45 0f b6 bd 1c 08 00 00 41 83 e7 01 e9 49 fd ff ff e8 90 c5 29 ff <0f> 0b e9 80 fe ff ff e8 84 c5 29 ff 4c 89 f7 e8 2c d9 4a ff 4d 8b
  RSP: 0018:ffff8881047c7b40 EFLAGS: 00010293
  RAX: ffff888104789c80 RBX: 0000000000000001 RCX: ffffffff820b8ef8
  RDX: 0000000000000000 RSI: ffffffff820b9080 RDI: ffff88810cd4c998
  RBP: ffff8881047c7c08 R08: ffff888104789c80 R09: ffffed10209f4036
  R10: ffff888104fa01ab R11: ffffed10209f4035 R12: ffff88810cd4c800
  R13: ffff888105750e28 R14: ffff888108f0a100 R15: ffff88810cd4c998
  FS:  0000000000000000(0000) GS:ffff888119c00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000104e60005 CR4: 0000000000370ea0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   addr_handler+0x266/0x350 drivers/infiniband/core/cma.c:3190
   process_one_req+0xa3/0x300 drivers/infiniband/core/addr.c:645
   process_one_work+0x54c/0x930 kernel/workqueue.c:2272
   worker_thread+0x82/0x830 kernel/workqueue.c:2418
   kthread+0x1ca/0x220 kernel/kthread.c:292
   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Fixes: ff11c6cd52 ("RDMA/cma: Introduce and use cma_acquire_dev_by_src_ip()")
Link: https://lore.kernel.org/r/20201213132940.345554-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:53 +01:00
Maor Gottlieb
d9a7b8fcd0 RDMA/mlx5: Fix MR cache memory leak
[ Upstream commit e89938902927a54abebccc9537991aca5237dfaf ]

If the MR cache entry invalidation failed, then we detach this entry from
the cache, therefore we must to free the memory as well.

Allcation backtrace for the leaker:

    [<00000000d8e423b0>] alloc_cache_mr+0x23/0xc0 [mlx5_ib]
    [<000000001f21304c>] create_cache_mr+0x3f/0xf0 [mlx5_ib]
    [<000000009d6b45dc>] mlx5_ib_alloc_implicit_mr+0x41/0×210 [mlx5_ib]
    [<00000000879d0d68>] mlx5_ib_reg_user_mr+0x9e/0×6e0 [mlx5_ib]
    [<00000000be74bf89>] create_qp+0x2fc/0xf00 [ib_uverbs]
    [<000000001a532d22>] ib_uverbs_handler_UVERBS_METHOD_COUNTERS_READ+0x1d9/0×230 [ib_uverbs]
    [<0000000070f46001>] rdma_alloc_commit_uobject+0xb5/0×120 [ib_uverbs]
    [<000000006d8a0b38>] uverbs_alloc+0x2b/0xf0 [ib_uverbs]
    [<00000000075217c9>] ksysioctl+0x234/0×7d0
    [<00000000eb5c120b>] __x64_sys_ioctl+0x16/0×20
    [<00000000db135b48>] do_syscall_64+0x59/0×2e0

Fixes: 1769c4c575 ("RDMA/mlx5: Always remove MRs from the cache before destroying them")
Link: https://lore.kernel.org/r/20201213132940.345554-2-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:53 +01:00
Weihang Li
78d22dd989 RDMA/hns: Do shift on traffic class when using RoCEv2
[ Upstream commit 603bee935f38080a3674c763c50787751e387779 ]

The high 6 bits of traffic class in GRH is DSCP (Differentiated Services
Codepoint), the driver should shift it before the hardware gets it when
using RoCEv2.

Fixes: 606bf89e98 ("RDMA/hns: Refactor for hns_roce_v2_modify_qp function")
Fixes: fba429fcf9a5 ("RDMA/hns: Fix missing fields in address vector")
Link: https://lore.kernel.org/r/1607650657-35992-4-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:51 +01:00
Wenpeng Liang
44dd35a017 RDMA/hns: Normalization the judgment of some features
[ Upstream commit 4ddeacf68a3dd05f346b63f4507e1032a15cc3cc ]

Whether to enable the these features should better depend on the enable
flags, not the value of related fields.

Fixes: 5c1f167af1 ("RDMA/hns: Init SRQ table for hip08")
Fixes: 3cb2c996c9 ("RDMA/hns: Add support for SCCC in size of 64 Bytes")
Link: https://lore.kernel.org/r/1607650657-35992-3-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:51 +01:00
Wenpeng Liang
27f2d59a4a RDMA/hns: Limit the length of data copied between kernel and userspace
[ Upstream commit 1c0ca9cd1741687f529498ddb899805fc2c51caa ]

For ib_copy_from_user(), the length of udata may not be the same as that
of cmd. For ib_copy_to_user(), the length of udata may not be the same as
that of resp. So limit the length to prevent out-of-bounds read and write
operations from ib_copy_from_user() and ib_copy_to_user().

Fixes: de77503a59 ("RDMA/hns: RDMA/hns: Assign rq head pointer when enable rq record db")
Fixes: 633fb4d9fd ("RDMA/hns: Use structs to describe the uABI instead of opencoding")
Fixes: ae85bf92ef ("RDMA/hns: Optimize qp param setup flow")
Fixes: 6fd610c573 ("RDMA/hns: Support 0 hop addressing for SRQ buffer")
Fixes: 9d9d4ff788 ("RDMA/hns: Update the kernel header file of hns")
Link: https://lore.kernel.org/r/1607650657-35992-2-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:51 +01:00
Avihai Horon
1cbcdec82c RDMA/uverbs: Fix incorrect variable type
[ Upstream commit e0da68994d16b46384cce7b86eb645f1ef7c51ef ]

Fix incorrect type of max_entries in UVERBS_METHOD_QUERY_GID_TABLE -
max_entries is of type size_t although it can take negative values.

The following static check revealed it:

drivers/infiniband/core/uverbs_std_types_device.c:338 ib_uverbs_handler_UVERBS_METHOD_QUERY_GID_TABLE() warn: 'max_entries' unsigned <= 0

Fixes: 9f85cbe50a ("RDMA/uverbs: Expose the new GID query API to user space")
Link: https://lore.kernel.org/r/20201208073545.9723-4-leon@kernel.org
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:47 +01:00
Jack Morgenstein
53e9a5a692 RDMA/core: Do not indicate device ready when device enablement fails
[ Upstream commit 779e0bf47632c609c59f527f9711ecd3214dccb0 ]

In procedure ib_register_device, procedure kobject_uevent is called
(advertising that the device is ready for userspace usage) even when
device_enable_and_get() returned an error.

As a result, various RDMA modules attempted to register for the device
even while the device driver was preparing to unregister the device.

Fix this by advertising the device availability only after enabling the
device succeeds.

Fixes: e7a5b4aafd ("RDMA/device: Don't fire uevent before device is fully initialized")
Link: https://lore.kernel.org/r/20201208073545.9723-3-leon@kernel.org
Suggested-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:47 +01:00
Yangyang Li
482d2345bf RDMA/hns: Bugfix for calculation of extended sge
[ Upstream commit d34895c319faa1e0fc1a48c3b06bba6a8a39ba44 ]

Page alignment is required when setting the number of extended sge
according to the hardware's achivement. If the space of needed extended
sge is greater than one page, the roundup_pow_of_two() can ensure
that. But if the needed extended sge isn't 0 and can not be filled in a
whole page, the driver should align it specifically.

Fixes: 54d6638765 ("RDMA/hns: Optimize WQE buffer size calculating process")
Link: https://lore.kernel.org/r/1606558959-48510-3-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:28 +01:00
Lang Cheng
a5c7bc6097 RDMA/hns: Fix 0-length sge calculation error
[ Upstream commit 0fd0175e30e487f8d70ecb2cdd67fbb514fdf50f ]

One RC SQ WQE can store 2 sges but UD can't, so ignore 2 valid sges of
wr.sglist for RC which have been filled in WQE before setting extended
sge.  Either of RC and UD can not contain 0-length sges, so these 0-length
sges should be skipped.

Fixes: 54d6638765 ("RDMA/hns: Optimize WQE buffer size calculating process")
Link: https://lore.kernel.org/r/1606558959-48510-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:28 +01:00
Leon Romanovsky
deaeb67c9d RDMA/core: Track device memory MRs
[ Upstream commit b47a98efa97889c5b16d17e77eed3dc4500674eb ]

Device memory (DM) are registered as MR during initialization flow, these
MRs were not tracked by resource tracker and had res->valid set as a
false. Update the code to manage them too.

Before this change:
$ ibv_rc_pingpong -j &
$ rdma res show mr <-- shows nothing

After this change:
$ ibv_rc_pingpong -j &
$ rdma res show mr
dev ibp0s9 mrn 0 mrlen 4096 pdn 3 pid 734 comm ibv_rc_pingpong

Fixes: be934cca9e ("IB/uverbs: Add device memory registration ioctl support")
Link: https://lore.kernel.org/r/20201117070148.1974114-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:24 +01:00
Weihang Li
af7ae24594 RDMA/hns: Avoid setting loopback indicator when smac is same as dmac
[ Upstream commit 3631dadfb118821236098a215e59fb5d3e1c30a8 ]

The loopback flag will be set to 1 by the hardware when the source mac
address is same as the destination mac address. So the driver don't need
to compare them.

Fixes: d6a3627e31 ("RDMA/hns: Optimize wqe buffer set flow for post send")
Link: https://lore.kernel.org/r/1605526408-6936-4-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:23 +01:00
Weihang Li
6b8a015eda RDMA/hns: Fix missing fields in address vector
[ Upstream commit fba429fcf9a5e0c4ec2523ecf4cf18bc0507fcbc ]

Traffic class and hop limit in address vector is not assigned from GRH,
but it will be filled into UD SQ WQE. So the hardware will get a wrong
value.

Fixes: 82e620d9c3 ("RDMA/hns: Modify the data structure of hns_roce_av")
Link: https://lore.kernel.org/r/1605526408-6936-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:23 +01:00
Weihang Li
ba9479d75e RDMA/hns: Only record vlan info for HIP08
[ Upstream commit 7406c0036f851ee1cd93cb08349f24b051b4cbf8 ]

Information about vlan is stored in GMV(GID/MAC/VLAN) table for HIP09, so
there is no need to copy it to address vector.

Link: https://lore.kernel.org/r/1605526408-6936-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:23 +01:00
Jason Gunthorpe
f85d05c0a5 RDMA/cma: Fix deadlock on &lock in rdma_cma_listen_on_all() error unwind
[ Upstream commit dd37d2f59eb839d51b988f6668ce5f0d533b23fd ]

rdma_detroy_id() cannot be called under &lock - we must instead keep the
error'd ID around until &lock can be released, then destroy it.

This is complicated by the usual way listen IDs are destroyed through
cma_process_remove() which can run at any time and will asynchronously
destroy the same ID.

Remove the ID from visiblity of cma_process_remove() before going down the
destroy path outside the locking.

Fixes: c80a0c52d85c ("RDMA/cma: Add missing error handling of listen_id")
Link: https://lore.kernel.org/r/20201118133756.GK244516@ziepe.ca
Reported-by: syzbot+1bc48bf7f78253f664a9@syzkaller.appspotmail.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:22 +01:00
Kamal Heib
623793c8da RDMA/cxgb4: Validate the number of CQEs
[ Upstream commit 6d8285e604e0221b67bd5db736921b7ddce37d00 ]

Before create CQ, make sure that the requested number of CQEs is in the
supported range.

Fixes: cfdda9d764 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC")
Link: https://lore.kernel.org/r/20201108132007.67537-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:19 +01:00
Leon Romanovsky
70ba8b1697 RDMA/cma: Add missing error handling of listen_id
[ Upstream commit c80a0c52d85c49a910d0dc0e342e8d8898677dc0 ]

Don't silently continue if rdma_listen() fails but destroy previously
created CM_ID and return an error to the caller.

Fixes: d02d1f5359 ("RDMA/cma: Fix deadlock destroying listen requests")
Link: https://lore.kernel.org/r/20201104144008.3808124-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:12 +01:00
Arnd Bergmann
ec6a178521 RDMa/mthca: Work around -Wenum-conversion warning
[ Upstream commit fbb7dc5db6dee553b5a07c27e86364a5223e244c ]

gcc points out a suspicious mixing of enum types in a function that
converts from MTHCA_OPCODE_* values to IB_WC_* values:

drivers/infiniband/hw/mthca/mthca_cq.c: In function 'mthca_poll_one':
drivers/infiniband/hw/mthca/mthca_cq.c:607:21: warning: implicit conversion from 'enum <anonymous>' to 'enum ib_wc_opcode' [-Wenum-conversion]
  607 |    entry->opcode    = MTHCA_OPCODE_INVALID;

Nothing seems to ever check for MTHCA_OPCODE_INVALID again, no idea if
this is meaningful, but it seems harmless as it deals with an invalid
input.

Remove MTHCA_OPCODE_INVALID and set the ib_wc_opcode to 0xFF, which is
still bogus, but at least doesn't make compiler warnings.

Fixes: 2a4443a699 ("[PATCH] IB/mthca: fill in opcode field for send completions")
Link: https://lore.kernel.org/r/20201026211311.3887003-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:08 +01:00
Jason Gunthorpe
c5c1af1107 RDMA/mlx5: Fix corruption of reg_pages in mlx5_ib_rereg_user_mr()
[ Upstream commit fc3325701a6353594083f08e297d4c1965c601aa ]

reg_pages should always contain mr->npage since when the mr is finally
de-reg'd it is always subtracted out.

If there were any error exits then mlx5_ib_rereg_user_mr() would leave the
reg_pages adjusted and this will cause it to be double subtracted
eventually.

The manipulation of reg_pages is inherently connected to the umem, so lift
it out of set_mr_fields() and only adjust it around creating/destroying a
umem.

reg_pages is only used for diagnostics in sysfs.

Fixes: 7d0cc6edcc ("IB/mlx5: Add MR cache for large UMR regions")
Link: https://lore.kernel.org/r/20201026131936.1335664-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:53:01 +01:00
Guoqing Jiang
459c15dd68 RDMA/rtrs-srv: Don't guard the whole __alloc_srv with srv_mutex
[ Upstream commit d715ff8acbd5876549ef2b21b755ed919f40dcc1 ]

The purpose of srv_mutex is to protect srv_list as in put_srv, so no need
to hold it when allocate memory for srv since it could be time consuming.

Otherwise if one machine has limited memory, rsrv_close_work could be
blocked for a longer time due to the mutex is held by get_or_create_srv
since it can't get memory in time.

  INFO: task kworker/1:1:27478 blocked for more than 120 seconds.
        Tainted: G           O    4.14.171-1-storage #4.14.171-1.3~deb9
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  kworker/1:1     D    0 27478      2 0x80000000
  Workqueue: rtrs_server_wq rtrs_srv_close_work [rtrs_server]
  Call Trace:
   ? __schedule+0x38c/0x7e0
   schedule+0x32/0x80
   schedule_preempt_disabled+0xa/0x10
   __mutex_lock.isra.2+0x25e/0x4d0
   ? put_srv+0x44/0x100 [rtrs_server]
   put_srv+0x44/0x100 [rtrs_server]
   rtrs_srv_close_work+0x16c/0x280 [rtrs_server]
   process_one_work+0x1c5/0x3c0
   worker_thread+0x47/0x3e0
   kthread+0xfc/0x130
   ? trace_event_raw_event_workqueue_execute_start+0xa0/0xa0
   ? kthread_create_on_node+0x70/0x70
   ret_from_fork+0x1f/0x30

Let's move all the logics from __find_srv_and_get and __alloc_srv to
get_or_create_srv, and remove the two functions. Then it should be safe
for multiple processes to access the same srv since it is protected with
srv_mutex.

And since we don't want to allocate chunks with srv_mutex held, let's
check the srv->refcount after get srv because the chunks could not be
allocated yet.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201023074353.21946-6-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:52:59 +01:00
Gioh Kim
4f64797899 RDMA/rtrs-clt: Missing error from rtrs_rdma_conn_established
[ Upstream commit f553e7601df9566ba7644541fc09152a3a81f793 ]

When rtrs_rdma_conn_established returns error (non-zero value), the error
value is stored in con->cm_err and it cannot trigger
rtrs_rdma_error_recovery. Finally the error of rtrs_rdma_con_established
will be forgot.

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201023074353.21946-5-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:52:59 +01:00
Danil Kipnis
78bd070fa3 RDMA/rtrs-clt: Remove destroy_con_cq_qp in case route resolving failed
[ Upstream commit 2b3062e4d997f201c1ad2bbde88b7271dd9ef35f ]

We call destroy_con_cq_qp(con) in rtrs_rdma_addr_resolved() in case route
couldn't be resolved and then again in create_cm() because nothing
happens.

Don't call destroy_con_cq_qp from rtrs_rdma_addr_resolved, create_cm()
does the clean up already.

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201023074353.21946-2-jinpu.wang@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:52:58 +01:00
Bob Pearson
965f559393 RDMA/rxe: Compute PSN windows correctly
[ Upstream commit bb3ab2979fd69db23328691cb10067861df89037 ]

The code which limited the number of unacknowledged PSNs was incorrect.
The PSNs are limited to 24 bits and wrap back to zero from 0x00ffffff.
The test was computing a 32 bit value which wraps at 32 bits so that
qp->req.psn can appear smaller than the limit when it is actually larger.

Replace '>' test with psn_compare which is used for other PSN comparisons
and correctly handles the 24 bit size.

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20201013170741.3590-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:52:58 +01:00
Jing Xiangfeng
e7c49c634a RDMA/core: Fix error return in _ib_modify_qp()
[ Upstream commit 5333499c6014224756e97fa1a1047dfa592d76d3 ]

Fix to return error code PTR_ERR() from the error handling case instead of
0.

Fixes: 51aab12631 ("RDMA/core: Get xmit slave for LAG")
Link: https://lore.kernel.org/r/20201016075845.129562-1-jingxiangfeng@huawei.com
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:52:58 +01:00
Selvin Xavier
ae3739fcfe RDMA/bnxt_re: Fix entry size during SRQ create
[ Upstream commit b898d5c50cab1f985e77d053eb5c4d2c4a7694ae ]

Only static WQE is supported for SRQ. So always use the max supported SGEs
while calculating SRQ entry size.

Fixes: 2bb3c32c5c ("RDMA/bnxt_re: Change wr posting logic to accommodate variable wqes")
Link: https://lore.kernel.org/r/1602569752-12745-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:52:57 +01:00
Kamal Heib
4d281791be RDMA/bnxt_re: Set queue pair state when being queried
[ Upstream commit 53839b51a7671eeb3fb44d479d541cf3a0f2dd45 ]

The API for ib_query_qp requires the driver to set cur_qp_state on return,
add the missing set.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/20201021114952.38876-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30 11:52:57 +01:00
Leon Romanovsky
340b940ea0 RDMA/cm: Fix an attempt to use non-valid pointer when cleaning timewait
If cm_create_timewait_info() fails, the timewait_info pointer will contain
an error value and will be used in cm_remove_remote() later.

  general protection fault, probably for non-canonical address 0xdffffc0000000024: 0000 [#1] SMP KASAN PTI
  KASAN: null-ptr-deref in range [0×0000000000000120-0×0000000000000127]
  CPU: 2 PID: 12446 Comm: syz-executor.3 Not tainted 5.10.0-rc5-5d4c0742a60e #27
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:cm_remove_remote.isra.0+0x24/0×170 drivers/infiniband/core/cm.c:978
  Code: 84 00 00 00 00 00 41 54 55 53 48 89 fb 48 8d ab 2d 01 00 00 e8 7d bf 4b fe 48 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 04 02 48 89 ea 83 e2 07 38 d0 7f 08 84 c0 0f 85 fc 00 00 00
  RSP: 0018:ffff888013127918 EFLAGS: 00010006
  RAX: dffffc0000000000 RBX: fffffffffffffff4 RCX: ffffc9000a18b000
  RDX: 0000000000000024 RSI: ffffffff82edc573 RDI: fffffffffffffff4
  RBP: 0000000000000121 R08: 0000000000000001 R09: ffffed1002624f1d
  R10: 0000000000000003 R11: ffffed1002624f1c R12: ffff888107760c70
  R13: ffff888107760c40 R14: fffffffffffffff4 R15: ffff888107760c9c
  FS:  00007fe1ffcc1700(0000) GS:ffff88811a600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000001b2ff21000 CR3: 000000010f504001 CR4: 0000000000370ee0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   cm_destroy_id+0x189/0×15b0 drivers/infiniband/core/cm.c:1155
   cma_connect_ib drivers/infiniband/core/cma.c:4029 [inline]
   rdma_connect_locked+0x1100/0×17c0 drivers/infiniband/core/cma.c:4107
   rdma_connect+0x2a/0×40 drivers/infiniband/core/cma.c:4140
   ucma_connect+0x277/0×340 drivers/infiniband/core/ucma.c:1069
   ucma_write+0x236/0×2f0 drivers/infiniband/core/ucma.c:1724
   vfs_write+0x220/0×830 fs/read_write.c:603
   ksys_write+0x1df/0×240 fs/read_write.c:658
   do_syscall_64+0x33/0×40 arch/x86/entry/common.c:46
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: a977049dac ("[PATCH] IB: Add the kernel CM implementation")
Link: https://lore.kernel.org/r/20201204064205.145795-1-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Reported-by: Amit Matityahu <mitm@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-09 15:51:35 -04:00
Gal Pressman
e432c04c17 RDMA/core: Fix empty gid table for non IB/RoCE devices
The query_gid_table ioctl skips non IB/RoCE ports, which as a result
returns an empty gid table for devices such as EFA which have a GID table,
but are not IB/RoCE.

Fixes: c4b4d548fa ("RDMA/core: Introduce new GID table query API")
Link: https://lore.kernel.org/r/20201206153238.34878-1-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-07 16:32:04 -04:00
Gal Pressman
93416ab0f9 RDMA/efa: Use the correct current and new states in modify QP
The local variables cur_state and new_state hold the state that should be
used for the modify QP operation instead of the ones in the ib_qp_attr
struct.

Fixes: 40909f664d ("RDMA/efa: Add EFA verbs implementation")
Link: https://lore.kernel.org/r/20201201091724.37016-1-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-01 20:53:48 -04:00
Alok Prasad
0191c271de RDMA/qedr: iWARP invalid(zero) doorbell address fix
This patch fixes issue introduced by a previous commit where iWARP
doorbell address wasn't initialized, causing call trace when any RDMA
application wants to use this interface:

  Illegal doorbell address: 0000000000000000. Legal range for doorbell addresses is [0000000011431e08..00000000ec3799d3]
  WARNING: CPU: 11 PID: 11990 at drivers/net/ethernet/qlogic/qed/qed_dev.c:93 qed_db_rec_sanity.isra.12+0x48/0x70 [qed]
  ...
   hpsa scsi_transport_sas [last unloaded: crc8]
  CPU: 11 PID: 11990 Comm: rping Tainted: G S                5.10.0-rc1 #29
  Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 01/22/2018
  RIP: 0010:qed_db_rec_sanity.isra.12+0x48/0x70 [qed]
  ...
  RSP: 0018:ffffafc28458fa88 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff8d0d4c620000 RCX: 0000000000000000
  RDX: ffff8d10afde7d50 RSI: ffff8d10afdd8b40 RDI: ffff8d10afdd8b40
  RBP: ffffafc28458fe38 R08: 0000000000000003 R09: 0000000000007fff
  R10: 0000000000000001 R11: ffffafc28458f888 R12: 0000000000000000
  R13: 0000000000000000 R14: ffff8d0d43ccbbd0 R15: ffff8d0d48dae9c0
  FS:  00007fbd5267e740(0000) GS:ffff8d10afdc0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fbd4f258fb8 CR3: 0000000108d96003 CR4: 00000000001706e0
  Call Trace:
   qed_db_recovery_add+0x6d/0x1f0 [qed]
   qedr_create_user_qp+0x57e/0xd30 [qedr]
   qedr_create_qp+0x5f3/0xab0 [qedr]
   ? lookup_get_idr_uobject.part.12+0x45/0x90 [ib_uverbs]
   create_qp+0x45d/0xb30 [ib_uverbs]
   ? ib_uverbs_cq_event_handler+0x30/0x30 [ib_uverbs]
   ib_uverbs_create_qp+0xb9/0xe0 [ib_uverbs]
   ib_uverbs_write+0x3f9/0x570 [ib_uverbs]
   ? security_mmap_file+0x62/0xe0
   vfs_write+0xb7/0x200
   ksys_write+0xaf/0xd0
   ? syscall_trace_enter.isra.25+0x152/0x200
   do_syscall_64+0x2d/0x40
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 06e8d1df46 ("RDMA/qedr: Add support for user mode XRC-SRQ's")
Link: https://lore.kernel.org/r/20201127163251.14533-1-palok@marvell.com
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-01 20:52:07 -04:00
Yixian Liu
17475e104d RDMA/hns: Bugfix for memory window mtpt configuration
When a memory window is bound to a memory region, the local write access
should be set for its mtpt table.

Fixes: c7c2819140 ("RDMA/hns: Add MW support for hip08")
Link: https://lore.kernel.org/r/1606386372-21094-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-26 10:57:32 -04:00
Wenpeng Liang
ab6f7248cc RDMA/hns: Fix retry_cnt and rnr_cnt when querying QP
The maximum number of retransmission should be returned when querying QP,
not the value of retransmission counter.

Fixes: 99fcf82521 ("RDMA/hns: Fix the wrong value of rnr_retry when querying qp")
Fixes: 926a01dc00 ("RDMA/hns: Add QP operations support for hip08 SoC")
Link: https://lore.kernel.org/r/1606382977-21431-1-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-26 10:57:32 -04:00
Wenpeng Liang
ebed7b7ca4 RDMA/hns: Fix wrong field of SRQ number the device supports
The SRQ capacity is got from the firmware, whose field should be ended at
bit 19.

Fixes: ba6bb7e974 ("RDMA/hns: Add interfaces to get pf capabilities from firmware")
Link: https://lore.kernel.org/r/1606382812-23636-1-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-26 10:57:32 -04:00
Dennis Dalessandro
3d2a9d6425 IB/hfi1: Ensure correct mm is used at all times
Two earlier bug fixes have created a security problem in the hfi1
driver. One fix aimed to solve an issue where current->mm was not valid
when closing the hfi1 cdev. It attempted to do this by saving a cached
value of the current->mm pointer at file open time. This is a problem if
another process with access to the FD calls in via write() or ioctl() to
pin pages via the hfi driver. The other fix tried to solve a use after
free by taking a reference on the mm.

To fix this correctly we use the existing cached value of the mm in the
mmu notifier. Now we can check in the insert, evict, etc. routines that
current->mm matched what the notifier was registered for. If not, then
don't allow access. The register of the mmu notifier will save the mm
pointer.

Since in do_exit() the exit_mm() is called before exit_files(), which
would call our close routine a reference is needed on the mm. We rely on
the mmgrab done by the registration of the notifier, whereas before it was
explicit. The mmu notifier deregistration happens when the user context is
torn down, the creation of which triggered the registration.

Also of note is we do not do any explicit work to protect the interval
tree notifier. It doesn't seem that this is going to be needed since we
aren't actually doing anything with current->mm. The interval tree
notifier stuff still has a FIXME noted from a previous commit that will be
addressed in a follow on patch.

Cc: <stable@vger.kernel.org>
Fixes: e0cf75deab ("IB/hfi1: Fix mm_struct use after free")
Fixes: 3faa3d9a30 ("IB/hfi1: Make use of mm consistent")
Link: https://lore.kernel.org/r/20201125210112.104301.51331.stgit@awfm-01.aw.intel.com
Suggested-by: Jann Horn <jannh@google.com>
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-25 20:30:46 -04:00
Shiraz Saleem
2ed381439e RDMA/i40iw: Address an mmap handler exploit in i40iw
i40iw_mmap manipulates the vma->vm_pgoff to differentiate a push page mmap
vs a doorbell mmap, and uses it to compute the pfn in remap_pfn_range
without any validation. This is vulnerable to an mmap exploit as described
in: https://lore.kernel.org/r/20201119093523.7588-1-zhudi21@huawei.com

The push feature is disabled in the driver currently and therefore no push
mmaps are issued from user-space. The feature does not work as expected in
the x722 product.

Remove the push module parameter and all VMA attribute manipulations for
this feature in i40iw_mmap. Update i40iw_mmap to only allow DB user
mmapings at offset = 0. Check vm_pgoff for zero and if the mmaps are bound
to a single page.

Cc: <stable@kernel.org>
Fixes: d374984179 ("i40iw: add files for iwarp interface")
Link: https://lore.kernel.org/r/20201125005616.1800-2-shiraz.saleem@intel.com
Reported-by: Di Zhu <zhudi21@huawei.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-25 10:38:11 -04:00
Xiongfeng Wang
6830ff853a IB/mthca: fix return value of error branch in mthca_init_cq()
We return 'err' in the error branch, but this variable may be set as zero
by the above code. Fix it by setting 'err' as a negative value before we
goto the error label.

Fixes: 74c2174e7b ("IB uverbs: add mthca user CQ support")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Link: https://lore.kernel.org/r/1605837422-42724-1-git-send-email-wangxiongfeng2@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-23 16:22:34 -04:00
Zhang Changzhong
dabbd6abcd IB/hfi1: Fix error return code in hfi1_init_dd()
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.

Fixes: 4730f4a6c6 ("IB/hfi1: Activate the dummy netdev")
Link: https://lore.kernel.org/r/1605249747-17942-1-git-send-email-zhangchangzhong@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-13 12:25:21 -04:00