Conversion from IDR to XArray missed the fact that idr_alloc() returned
index as a return value, this index was saved in port variable and used as
query index later on. This caused to the following error.
BUG: KASAN: use-after-free in cma_check_port+0x86a/0xa20 [rdma_cm]
Read of size 8 at addr ffff888069fde998 by task ucmatose/387
CPU: 3 PID: 387 Comm: ucmatose Not tainted 5.1.0-rc2+ #253
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
Call Trace:
dump_stack+0x7c/0xc0
print_address_description+0x6c/0x23c
? cma_check_port+0x86a/0xa20 [rdma_cm]
kasan_report.cold.3+0x1c/0x35
? cma_check_port+0x86a/0xa20 [rdma_cm]
? cma_check_port+0x86a/0xa20 [rdma_cm]
cma_check_port+0x86a/0xa20 [rdma_cm]
rdma_bind_addr+0x11bc/0x1b00 [rdma_cm]
? find_held_lock+0x33/0x1c0
? cma_ndev_work_handler+0x180/0x180 [rdma_cm]
? wait_for_completion+0x3d0/0x3d0
ucma_bind+0x120/0x160 [rdma_ucm]
? ucma_resolve_addr+0x1a0/0x1a0 [rdma_ucm]
ucma_write+0x1f8/0x2b0 [rdma_ucm]
? ucma_open+0x260/0x260 [rdma_ucm]
vfs_write+0x157/0x460
ksys_write+0xb8/0x170
? __ia32_sys_read+0xb0/0xb0
? trace_hardirqs_off_caller+0x5b/0x160
? do_syscall_64+0x18/0x3c0
do_syscall_64+0x95/0x3c0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Allocated by task 381:
__kasan_kmalloc.constprop.5+0xc1/0xd0
cma_alloc_port+0x4d/0x160 [rdma_cm]
rdma_bind_addr+0x14e7/0x1b00 [rdma_cm]
ucma_bind+0x120/0x160 [rdma_ucm]
ucma_write+0x1f8/0x2b0 [rdma_ucm]
vfs_write+0x157/0x460
ksys_write+0xb8/0x170
do_syscall_64+0x95/0x3c0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 381:
__kasan_slab_free+0x12e/0x180
kfree+0xed/0x290
rdma_destroy_id+0x6b6/0x9e0 [rdma_cm]
ucma_close+0x110/0x300 [rdma_ucm]
__fput+0x25a/0x740
task_work_run+0x10e/0x190
do_exit+0x85e/0x29e0
do_group_exit+0xf0/0x2e0
get_signal+0x2e0/0x17e0
do_signal+0x94/0x1570
exit_to_usermode_loop+0xfa/0x130
do_syscall_64+0x327/0x3c0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Reported-by: <syzbot+2e3e485d5697ea610460@syzkaller.appspotmail.com>
Reported-by: Ran Rozenstein <ranro@mellanox.com>
Fixes: 638267537a ("cma: Convert portspace IDRs to XArray")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Fix the following warning:
drivers/net/ethernet/mellanox/mlx5/core//fs_core.c:845:5:
warning: 'err' may be used uninitialized in this function
[-Wmaybe-uninitialized]
No real issue here. This is only a false compiler warning.
The 'err' variable is guaranteed to be init by time of usage.
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Expose PRM layout for handling MPEIN (Management PCIE Info). It will be
used in the downstream patch for querying MPEIN via the driver.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add rate limited print macros for warning and info level. This protects
the system from burst of prints depleting HW resources and spamming dmesg.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add bar_addr field to store bar-0 address to avoid calling
pci_resource_start with hard-coded bar-0 as parameter.
Also note that different mlx5 device types will have bar_addr
on different bars.
This patch does not change any functionality.
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Replace pci dev_err/warn/info messages with mlx5_core_err/warn/info
messages to provide a better report/debug of different mlx5 device types.
This patch does not change any functionality.
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Use mlx5_core mdev private name in message instead of using pci dev_name
to provide a better report/debug of different mlx5 device types.
This patch does not change any functionality.
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Detach mlx5_core mdev messages from pci device mdev->pdev messages and
provide a better report/debug of different mlx5 device types.
This patch does not change any functionality.
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Using foundation from previous patches to factor mlx5_load_one flow
into three stages:
1. mlx5_function_setup() from previous patch to setup function
2. mlx5_init_once() from previous patch to init software objects
according to hw caps
3. New mlx5_load() to load mlx5 components
This provides a better logical separation of mlx5 core device
initialization flow and will help to seamlessly support creating different
mlx5 device types such as PF, VF and SF mlx5 sub-function virtual device.
This patch does not change any functionality.
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Function setup and teardown procedures are the basic procedure that
each mlx5 pci function should perform to boot up a mlx5 device function
and initialize basic communication with FW, before allocating any higher
level software/firmware resources.
This provides a better logical separation of mlx5 core device
initialization flow and will help to seamlessly support creating different
mlx5 device types such as PF, VF and SF mlx5 sub-function virtual device.
This patch does not change any functionality.
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Software structure initialization should be in mdev_init stage.
This provides a better logical separation of mlx5 core device
initialization flow and will help to seamlessly support creating different
mlx5 device types such as PF, VF and SF mlx5 sub-function virtual device.
This patch does not change any functionality.
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Separate resources initialization from pci initialization.
This provides a better logical separation of mlx5 core device
initialization flow and will help to seamlessly support creating different
mlx5 device types such as PF, VF and SF mlx5 sub-function virtual device.
This patch does not change any functionality.
Signed-off-by: Vu Pham <vuhuong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
As there is no user of mlx5_write64 that passes a spinlock to
mlx5_write64, remove this functionality and simplify the function.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
MLX5_*_DOORBELL_LOCK macros provided a way to avoid locking for
mlx5_write64 on 64-bit platforms where it's not necessary. Currently all
calls to mlx5_write64 don't use a spinlock, so the macros became unused.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
port_pd is treated as le32 in declaration and read, fix assignment to be
in le32 too. This change fixes the following compilation warnings.
drivers/infiniband/hw/hns/hns_roce_ah.c:67:24: warning: incorrect type
in assignment (different base types)
drivers/infiniband/hw/hns/hns_roce_ah.c:67:24: expected restricted __le32 [usertype] port_pd
drivers/infiniband/hw/hns/hns_roce_ah.c:67:24: got restricted __be32 [usertype]
Fixes: 9a4435375c ("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Lijun Ou <ouliun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Now when ib_udata is passed to all the driver's object create/destroy APIs
the ib_udata will carry the ib_ucontext for every user command. There is
no need to also pass the ib_ucontext via the functions prototypes.
Make ib_udata the only argument psssed.
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Now that we have the udata passed to all the ib_xxx object destroy APIs
and the additional macro 'rdma_udata_to_drv_context' to get the
ib_ucontext from ib_udata stored in uverbs_attr_bundle, we can finally
start to remove the dependency of the drivers in the
ib_xxx->uobject->context.
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The uverbs_attr_bundle with the ucontext is sent down to the drivers ib_x
destroy path as ib_udata. The next patch will use the ib_udata to free the
drivers destroy path from the dependency in 'uobject->context' as we
already did for the create path.
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Pass uverbs_attr_bundle down the uobject destroy path. The next patch will
use this to eliminate the dependecy of the drivers in ib_x->uobject
pointers.
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
the Attempt to use the below commit to initialize the ucontext for the
uobject destroy path has shown that the below commit is incomplete.
Parts were reverted and the ucontext set up in the uverbs_attr_bundle was
moved to rdma_lookup_get_uobject which is called from the uobj_get_XXX
macros and rdma_alloc_begin_uobject which is called when uobject is
created.
Fixes: 3d9dfd0603 ("IB/uverbs: Add ib_ucontext to uverbs_attr_bundle sent from ioctl and cmd flows")
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Also remove qib_devs_list.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Also remove hfi1_devs_list.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There is no need to perform extra comparison after boolean AND.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Delete structure which is not connected due to removal in commit
cited in Fixes line.
Fixes: a78e8723a5 ("RDMA/cma: Remove CM_ID statistics provided by rdma-cm module")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Also fully initialise the qp before storing it in the XArray.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Change the locking to not disable interrupts as the lookup in interrupt
context will not see a freed CQ, thanks to the synchronize_irq() call
before freeing the CQ.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add netlink command that enables/disables sharing rdma device among
multiple net namespaces.
Using rdma tool,
$rdma sys set netns shared (default mode)
When rdma subsystem netns mode is set to shared mode, rdma devices
will be accessible in all net namespaces.
Using rdma tool,
$rdma sys set netns exclusive
When rdma subsystem netns mode is set to exclusive mode, devices
will be accessible in only one net namespace at any given
point of time.
If there are any net namespaces other than default init_net exists,
while executing this command, it will fail and mode cannot be changed.
To change this mode, netlink command is used instead of sysctl, because
netlink command allows to auto load a module.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add an interface via netlink command to query whether rdma devices are
shared among multiple net namespaces or not. When using RDMAtool, it can
be queried as,
$rdma system show netns
netns shared
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Extend ib_device_get_by_index() API to check device access for
net namespace for serving netlink commands.
Also enforce net ns check on dumpit commands which iterate over all
registered rdma devices and which don't call ib_device_get_by_index().
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Introduce an API rdma_dev_access_netns() to check whether a rdma device
can be accessed from the specified net namespace or not.
Use rdma_dev_access_netns() while opening character uverbs, umad network
device and also check while rdma cm_id binds to rdma device.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add module parameter to change a sharing mode of ib_core early in the
boot process. This parameter helps to those systems where modern up
to date rdma tool (iproute2) package may not be available during
kernel upgrade cycle.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Now that sysfs compatibility layer for non init_net exists, add core port
attributes such as pkey and gid table to non init_net ns.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Implement compatibility layer sysfs entries of ib_core so that non
init_net net namespaces can also discover rdma devices.
Each non init_net net namespace has ib_core_device created in it.
Such ib_core_device sysfs tree resembles rdma devices found in
init_net namespace.
This allows discovering rdma devices in multiple non init_net net
namespaces via sysfs entries and helpful to rdma-core userspace.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This is a preparation patch to provide isolation of rdma device in a
network namespace.
As first step, make rdma device visible only in init net namespace.
Subsequent patch will enable rdma device visibility back in multiple net
namespaces using compat ib_core_device device/sysfs tree.
Given that the IB subsystem depends on net stack, it needs to be
initialized after netdev and since it support devices, it needs to be
initialized before the device subsystem; therefore, change initcall
sequence to fs_initcall, so that when ib_core is compiled in the kernel
image, the right init sequence is followed.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In order to support sysfs entries in multiple net namespaces for a rdma
device, introduce a ib_core_device whose scope is limited to hold core
device and per port sysfs related entries.
This is preparation patch so that multiple ib_core_devices in each net
namespace can be created in subsequent patch who all can share ib_device.
(a) Move sysfs specific fields to ib_core_device.
(b) Make sysfs and device life cycle related routines to work on
ib_core_device.
(c) Introduce and use rdma_init_coredev() helper to initialize
coredev fields.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The buffer that holds the page DMA addresses is sized off umem->nmap.
This can potentially cause out of bound accesses on the PBL array when
iterating the umem DMA-mapped SGL. This is because if umem pages are
combined, umem->nmap can be much lower than the number of system pages
in umem.
Use ib_umem_num_pages() to size this buffer.
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The buffer that holds the page DMA addresses is sized off umem->nmap.
This can potentially cause out of bound accesses on the PBL array when
iterating the umem DMA-mapped SGL. This is because if umem pages are
combined, umem->nmap can be much lower than the number of system pages
in umem.
Use ib_umem_num_pages() to size this buffer.
Cc: Moni Shoua <monis@mellanox.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The buffer that holds the page DMA addresses is sized off umem->nmap.
This can potentially cause out of bound accesses on the PBL array when
iterating the umem DMA-mapped SGL. This is because if umem pages are
combined, umem->nmap can be much lower than the number of system pages
in umem.
Use ib_umem_num_pages() to size this buffer.
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The PBL array that hold the page DMA address is sized off umem->nmap.
This can potentially cause out of bound accesses on the PBL array when
iterating the umem DMA-mapped SGL. This is because if umem pages are
combined, umem->nmap can be much lower than the number of system pages
in umem.
Use ib_umem_num_pages() to size this array.
Cc: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
umem->nmap is used while allocating internal buffer for storing
page DMA addresses. This causes out of bounds array access while iterating
the umem DMA-mapped SGL with umem page combining as umem->nmap can be
less than number of system pages in umem.
Use ib_umem_num_pages() instead of umem->nmap to size the page array.
Add a new structure (bnxt_qplib_sg_info) to pass sglist, npages and nmap.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch avoids that a compiler warning is reported when building with
W=1.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Fixes: 49c0e2414b ("IB/qib: Change SDMA progression mode depending on single- or multi-rail")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Enable format string checking for hfi1_cdbg() and fix the resulting
compiler warnings.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Avoid that sparse complains about a missing declaration.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Fixes: 6bf8f22aea ("IB/mlx5: Introduce MLX5_IB_OBJECT_DEVX_ASYNC_CMD_FD")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch avoids that sparse reports the following warnings:
drivers/infiniband/core/uverbs_std_types_flow_action.c:442:30: warning: symbol 'uverbs_def_obj_flow_action' was not declared. Should it be static?
drivers/infiniband/core/uverbs_std_types_dm.c:112:30: warning: symbol 'uverbs_def_obj_dm' was not declared. Should it be static?
drivers/infiniband/core/uverbs_std_types_counters.c:153:30: warning: symbol 'uverbs_def_obj_counters' was not declared. Should it be static?
drivers/infiniband/core/uverbs_std_types_mr.c:213:30: warning: symbol 'uverbs_def_obj_mr' was not declared. Should it be static?
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Fixes: 0bd01f3d09 ("RDMA/uverbs: Require all objects to have a driver destroy function")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch avoids that sparse complains about a mismatch between the
returned value and the function return type.
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Fixes: c3bea3d2dc ("RDMA/uverbs: Use the iterator for ib_uverbs_unmarshall_recv()")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>