linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2025-01-23 13:59:40 +07:00

Author	SHA1	Message	Date
Weihang Li	30661322b8	RDMA/hns: Extend capability flags for HIP08_C 12 bits is not enough for HIP08_C, so extend a new field in length of 16 bits for it. Link: https://lore.kernel.org/r/1588674607-25337-3-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-12 20:37:06 -03:00
Potnuri Bharat Teja	c8b1f340e5	RDMA/iw_cxgb4: Fix incorrect function parameters While reading the TCB field in t4_tcb_get_field32() the wrong mask is passed as a parameter which leads the driver eventually to a kernel panic/app segfault from access to an illegal SRQ index while flushing the SRQ completions during connection teardown. Fixes: `11a27e2121` ("iw_cxgb4: complete the cached SRQ buffers") Link: https://lore.kernel.org/r/20200511185608.5202-1-bharat@chelsio.com Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-12 11:47:48 -03:00
Mike Marciniszyn	fa8dac3968	IB/hfi1: Fix another case where pq is left on waitlist The commit noted below fixed a case where a pq is left on the sdma wait list. It however missed another case. user_sdma_send_pkts() has two calls from hfi1_user_sdma_process_request(). If the first one fails as indicated by -EBUSY, the pq will be placed on the waitlist as by design. If the second call then succeeds, the pq is still on the waitlist setting up a race with the interrupt handler if a subsequent request uses a different SDMA engine Fix by deleting the first call. The use of pcount and the intent to send a short burst of packets followed by the larger balance of packets was never correctly implemented, because the two calls always send pcount packets no matter what. A subsequent patch will correct that issue. Fixes: `9a293d1e21` ("IB/hfi1: Ensure pq is not left on waitlist") Link: https://lore.kernel.org/r/20200504130917.175613.43231.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-12 11:47:48 -03:00
Denis V. Lunev	856ec7f646	IB/i40iw: Remove bogus call to netdev_master_upper_dev_get() Local variable netdev is not used in these calls. It should be noted, that this change is required to work in bonded mode. Otherwise we would get the following assert: "RTNL: assertion failed at net/core/dev.c (5665)" With the calltrace as follows: dump_stack+0x19/0x1b netdev_master_upper_dev_get+0x61/0x70 i40iw_addr_resolve_neigh+0x1e8/0x220 i40iw_make_cm_node+0x296/0x700 ? i40iw_find_listener.isra.10+0xcc/0x110 i40iw_receive_ilq+0x3d4/0x810 i40iw_puda_poll_completion+0x341/0x420 i40iw_process_ceq+0xa5/0x280 i40iw_ceq_dpc+0x1e/0x40 tasklet_action+0x83/0x140 __do_softirq+0x125/0x2bb call_softirq+0x1c/0x30 do_softirq+0x65/0xa0 irq_exit+0x105/0x110 do_IRQ+0x56/0xf0 common_interrupt+0x16a/0x16a ? cpuidle_enter_state+0x57/0xd0 cpuidle_idle_call+0xde/0x230 arch_cpu_idle+0xe/0xc0 cpu_startup_entry+0x14a/0x1e0 start_secondary+0x1f7/0x270 start_cpu+0x5/0x14 Link: https://lore.kernel.org/r/20200428131511.11049-1-den@openvz.org Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-12 11:47:48 -03:00
Jack Morgenstein	6693ca95bd	IB/mlx4: Test return value of calls to ib_get_cached_pkey In the mlx4_ib_post_send() flow, some functions call ib_get_cached_pkey() without checking its return value. If ib_get_cached_pkey() returns an error code, these functions should return failure. Fixes: `1ffeb2eb8b` ("IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support") Fixes: `225c7b1fee` ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Fixes: `e622f2f4ad` ("IB: split struct ib_send_wr") Link: https://lore.kernel.org/r/20200426075921.130074-1-leon@kernel.org Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-12 11:47:48 -03:00
Colin Ian King	52c81f47f0	RDMA/mlx5: Remove duplicated assignment to variable rcqe_sz The variable rcqe_sz is being unnecessarily assigned twice, fix this by removing one of the duplicates. Fixes: `8bde2c509e` ("RDMA/mlx5: Update all DRIVER QP places to use QP subtype") Link: https://lore.kernel.org/r/20200507151610.52636-1-colin.king@canonical.com Addresses-Coverity: ("Evaluation order violation") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-07 21:02:58 -03:00
David S. Miller	3793faad7b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflicts were all overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-06 22:10:13 -07:00
Mark Bloch	42caf9cb59	RDMA/mlx5: Allow only raw Ethernet QPs when RoCE isn't enabled When operating in switchdev mode or using devlink to disable RoCE only raw Ethernet QPs are allowed to be created. When in switchdev mode this can lead to passing an invalid port number as part of the modify qp firmware cmd and will lead to a syndrome reported back to the user, such as: * mlx5_cmd_check:803:(pid 50148): RST2INIT_QP(0x502) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x177405). Internal UD QP might be used to test for write combining support (even if externally we report RoCE as disabled) check for that specific flag and allow is specifically. Fixes: `b5ca15ad7e` ("IB/mlx5: Add proper representors support") Link: https://lore.kernel.org/r/20200506071602.7177-3-leon@kernel.org Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:52:01 -03:00
Mark Bloch	8d93efb8c5	RDMA/mlx5: Assign profile before calling stages Assign the profile to the IB device before executing stages. This will allow to check which profile is being used from within a stage. Link: https://lore.kernel.org/r/20200506071602.7177-2-leon@kernel.org Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:52:01 -03:00
Leon Romanovsky	029e88fd1e	RDMA/mlx5: Move all WR logic from qp.c to separate file Split qp.c by removing all WR logic to separate file. Link: https://lore.kernel.org/r/20200506065513.4668-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:42:45 -03:00
Max Gurtovoy	6671cde83d	RDMA/mlx5: Refactor mlx5_post_send() to improve readability Add small helpers in order to avoid code duplication and improve code readability. Decrease the amount of code in the gigantic post_send function and divide it to readable methods that will help in code maintenance in the future. Link: https://lore.kernel.org/r/20200506065513.4668-3-leon@kernel.org Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:42:45 -03:00
Leon Romanovsky	31578defe4	RDMA/mlx5: Update mlx5_ib to use new cmd interface Reuse newly introduced mlx5_cmd_exec_in() and mlx5_cmd_exec_inout() to reduce code duplication in mlx5_ib module. Link: https://lore.kernel.org/r/20200506065513.4668-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:42:45 -03:00
Wenpeng Liang	e4faa478c6	RDMA/hns: Remove redundant assignment of caps These caps are assigned in query_pf_caps() or set_default_caps(), and should not be assigned out of these two functions. Link: https://lore.kernel.org/r/1588242691-12913-4-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:33:20 -03:00
Weihang Li	b713128de7	RDMA/hns: Adjust lp_pktn_ini dynamically lp_pktn_ini means the number of loopback slice packets for long messages, it should depend on MTU(fixed to 4096B currently) and max size of SQ inline. Link: https://lore.kernel.org/r/1588242691-12913-3-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:33:20 -03:00
Weihang Li	23190b8f47	RDMA/hns: Fix comments with non-English symbols There is a comments with some chinese semicolons that cause encoding issues each time hns_roc_hw_v2.h was modified from a IDE. So fix this by using correct symbols. Link: https://lore.kernel.org/r/1588242691-12913-2-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:33:19 -03:00
Xi Wang	67954a6e37	RDMA/hns: Optimize SRQ buffer size calculating process Optimize the SRQ's WQE buffer parameters calculating process to make the codes more readable by using new functions about multi-hop addressing to calculating capabilities of SRQ. Link: https://lore.kernel.org/r/1588071823-40200-6-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:26:44 -03:00
Yixian Liu	ffb1308b88	RDMA/hns: Move SRQ code to the reasonable place Just move the SRQ related code to more reasonable place, and unify format of some prints. Link: https://lore.kernel.org/r/1588071823-40200-5-git-send-email-liweihang@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:26:43 -03:00
Xi Wang	54d6638765	RDMA/hns: Optimize WQE buffer size calculating process Optimize the QP's WQE buffer parameters calculating process to make the codes more readable mainly by merging calculation of extended sge space of kernel and userspace. In addition, add some inline functions to simply codes about multi-hop addressing. Link: https://lore.kernel.org/r/1588071823-40200-4-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:26:43 -03:00
Xi Wang	2929c40f08	RDMA/hns: Remove unused MTT functions The MTT (Memory Translate Table) interface is no longer used to configure the buffer address to BT (Base Address Table) that requires driver mapping. Because the MTT is not compatible with multi-hop addressing of the hip08, it is replaced by MTR (Memory Translate Region) interface, and all the MTT functions should be removed. Link: https://lore.kernel.org/r/1588071823-40200-3-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:26:43 -03:00
Xi Wang	9b2cf76c9f	RDMA/hns: Optimize PBL buffer allocation process PBL table has its own implementation for multi-hop addressing currently, but for the hardware, all table's addressing use the same logic, there is no need to implement repeatedly. So optimize the PBL buffer allocation process by using the mtr's interfaces. Link: https://lore.kernel.org/r/1588071823-40200-2-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 17:26:43 -03:00
Mark Zhang	5ac55dfc6d	RDMA/mlx5: Set UDP source port based on the grh.flow_label Calculate UDP source port based on the grh.flow_label. If grh.flow_label is not valid, we will use minimal supported UDP source port. Link: https://lore.kernel.org/r/20200504051935.269708-6-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 16:51:44 -03:00
Mark Zhang	2b880b2e5e	RDMA/mlx5: Define RoCEv2 udp source port when set path Calculate and set UDP source port based on the flow label. If flow label is not defined in GRH then calculate it based on lqpn/rqpn. Link: https://lore.kernel.org/r/20200504051935.269708-4-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-06 16:51:44 -03:00
Dan Carpenter	37e31d2d26	i40iw: Fix error handling in i40iw_manage_arp_cache() The i40iw_arp_table() function can return -EOVERFLOW if i40iw_alloc_resource() fails so we can't just test for "== -1". Fixes: `4e9042e647` ("i40iw: add hw and utils files") Link: https://lore.kernel.org/r/20200422092211.GA195357@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-04 14:12:35 -03:00
Gal Pressman	f86e34374a	RDMA/efa: Count admin commands errors Add a new stat that counts admin commands failures, which might help when debugging different issues. Link: https://lore.kernel.org/r/20200420062213.44577-4-galpress@amazon.com Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-02 20:32:14 -03:00
Gal Pressman	eca5757f80	RDMA/efa: Count mmap failures Add a new stat that counts mmap failures, which might help when debugging different issues. Link: https://lore.kernel.org/r/20200420062213.44577-3-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-02 20:32:14 -03:00
Gal Pressman	b2ea69b3b4	RDMA/efa: Report create CQ error counter Create CQ errors are already being counted, report them along all other counters. Link: https://lore.kernel.org/r/20200420062213.44577-2-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-02 20:32:14 -03:00
Maor Gottlieb	cfc1a89e44	RDMA/mlx5: Set lag tx affinity according to slave The patch sets the lag tx affinity of the data QPs and the GSI QPs according to the LAG xmit slave. For GSI QPs, in case the link layer is Ethenet (RoCE) we create two GSI QPs, one for each physical port. When the driver selects the GSI QP, it will consider the port affinity result. For connected QPs, the driver sets the affinity of the xmit slave. The above, ensures that RC QP and it's corresponding GSI QP will transmit from the same physical port. Link: https://lore.kernel.org/r/20200430192146.12863-17-maorg@mellanox.com Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-02 20:19:54 -03:00
Maor Gottlieb	5163b2743a	RDMA/mlx5: Refactor affinity related code Move affinity related code in modify qp to function. It's a preparation for next patch the extend the affinity calculation to consider the xmit slave. Link: https://lore.kernel.org/r/20200430192146.12863-16-maorg@mellanox.com Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-02 20:19:54 -03:00
Maor Gottlieb	fa5d010c56	RDMA: Group create AH arguments in struct Following patch adds additional argument to the create AH function, so it make sense to group ah_attr and flags arguments in struct. Link: https://lore.kernel.org/r/20200430192146.12863-13-maorg@mellanox.com Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Acked-by: Devesh Sharma <devesh.sharma@broadcom.com> Acked-by: Gal Pressman <galpress@amazon.com> Acked-by: Weihang Li <liweihang@huawei.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-05-02 20:19:53 -03:00
Aharon Landau	0eacc574aa	RDMA/mlx5: Verify that QP is created with RQ or SQ RAW packet QP and underlay QP must be created with either RQ or SQ, check that. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Link: https://lore.kernel.org/r/20200427154636.381474-37-leon@kernel.org Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:46 -03:00
Leon Romanovsky	968f0b6f9c	RDMA/mlx5: Consolidate into special function all create QP calls Finish separation to blocks of mlx5_ib_create_qp() functions, so all internal create QP implementation are located in one place. Link: https://lore.kernel.org/r/20200427154636.381474-36-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:46 -03:00
Leon Romanovsky	6367da46d3	RDMA/mlx5: Remove redundant destroy QP call After major refactoring in create QP flow, it is no needed to call to destroy QP in XRC_TGT flow. Link: https://lore.kernel.org/r/20200427154636.381474-35-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:46 -03:00
Leon Romanovsky	08d5397660	RDMA/mlx5: Copy response to the user in one place Update all the places in create QP flows to copy response to the user in one place. Link: https://lore.kernel.org/r/20200427154636.381474-34-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:46 -03:00
Leon Romanovsky	6f2cf76e6e	RDMA/mlx5: Handle udate outlen checks in one place Place in one function all udata size checks. This will allow us move ib_copy_to_udata() in general place and ensure that it will be performed after call to the FW. Link: https://lore.kernel.org/r/20200427154636.381474-33-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:45 -03:00
Leon Romanovsky	5d6fffed1c	RDMA/mlx5: Promote RSS RAW QP flags check to higher level Move check that user didn't supplied RSS RAW QP unsupported command flags to the function that checks all such flags. Link: https://lore.kernel.org/r/20200427154636.381474-32-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:45 -03:00
Leon Romanovsky	f78d358cec	RDMA/mlx5: Group all create QP parameters to simplify in-kernel interfaces The amount of parameters passed in and out between internal mlx5 create QP functions is too large to easily follow the flow. Change it by grouping all create QP parameter into one structure. Link: https://lore.kernel.org/r/20200427154636.381474-31-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:45 -03:00
Leon Romanovsky	747c519cdb	RDMA/mlx5: Reduce amount of duplication in QP destroy Delete both PD argument and checks if udata was provided, in favour of unified destroy QP functions. Link: https://lore.kernel.org/r/20200427154636.381474-30-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:45 -03:00
Leon Romanovsky	98fc1126c4	RDMA/mlx5: Separate to user/kernel create QP flows The kernel and user create QP flows have very little common code, separate them to simplify the future work of creating per-type create_*_qp() functions. Link: https://lore.kernel.org/r/20200427154636.381474-29-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:44 -03:00
Leon Romanovsky	04bcc1c2d0	RDMA/mlx5: Separate XRC_TGT QP creation from common flow XRC_TGT QP doesn't fail into kernel or user flow separation. It is initiated by the user, but is created through in-kernel verbs flow and doesn't have PD and udata in similar way to kernel QPs. So let's separate creation of that QP type from the common flow. Link: https://lore.kernel.org/r/20200427154636.381474-28-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:44 -03:00
Leon Romanovsky	21aad80b17	RDMA/mlx5: Globally parse DEVX UID Remove duplication in parsing of DEVX UID. Link: https://lore.kernel.org/r/20200427154636.381474-27-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:44 -03:00
Leon Romanovsky	0ce300b15a	RDMA/mlx5: Delete impossible inlen check The inlen is set to be above zero in all flows before and can't be negative at this stage. Link: https://lore.kernel.org/r/20200427154636.381474-26-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:44 -03:00
Leon Romanovsky	03c4077b28	RDMA/mlx5: Rely on existence of udata to separate kernel/user flows Instead of keeping special field to separate kernel/user create/destroy flows, rely on existence of udata pointer. All allocation flows are using kzalloc() and leave uninitialized pointers as NULL which makes MLX5_QP_EMPTY and MLX5_QP_KERNEL flows to be the same. Link: https://lore.kernel.org/r/20200427154636.381474-25-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:43 -03:00
Leon Romanovsky	76883a6cc1	RDMA/mlx5: Remove second user copy in create_user_qp Combine copy_from_user() from create_user_qp() and general code. Link: https://lore.kernel.org/r/20200427154636.381474-24-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:43 -03:00
Leon Romanovsky	5ce0592b0e	RDMA/mlx5: Combine copy of create QP command in RSS RAW QP Change the create QP flow to handle all copy_from_user() operations in one place. Link: https://lore.kernel.org/r/20200427154636.381474-23-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:43 -03:00
Leon Romanovsky	266424eba6	RDMA/mlx5: Promote RSS RAW QP attribute check in higher level Perform check of attributes of RAW PACKET QP in separate function. Link: https://lore.kernel.org/r/20200427154636.381474-22-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:43 -03:00
Leon Romanovsky	7aede1a25f	RDMA/mlx5: Store QP type in the vendor QP structure QP type is stored in the IB/core QP struct, but it doesn't have all the needed information, like internal QP type used in the driver itself. Update mlx5_ib to have cached QP type which includes both IBTA and Mellanox specific one. Such change allows us to make even further cleanup of QP creation flow. Link: https://lore.kernel.org/r/20200427154636.381474-21-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:42 -03:00
Leon Romanovsky	3ae7e66a01	RDMA/mlx5: Delete unsupported QP types There is no need to explicitly check unsupported QP types, rely on "default" keyword in switch-case to catch them. Link: https://lore.kernel.org/r/20200427154636.381474-20-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-30 18:45:42 -03:00
Jason Gunthorpe	dfb25edd97	Merge branch 'mlx5_ib_qp_refactor_1' into rdma.git for-next Leon Romanovsky says: ==================== This is first part of series which tries to return some sanity to mlx5_ib_create_qp() function. Such refactoring is required to make extension of that function with less worries of breaking driver. Extra goal of such refactoring is to ensure that QP is allocated at the beginning of function and released at the end. It will allow us to move QP allocation to be under IB/core responsibility. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Due to dependencies * branch 'mlx5_ib_qp_refactor_1': (66 commits) RDMA/mlx5: Process all vendor flags in one place RDMA/mlx5: Return all configured create flags through query QP RDMA/mlx5: Change scatter CQE flag to be set like other vendor flags RDMA/mlx5: Use flags_en mechanism to mark QP created with WQE signature RDMA/mlx5: Process create QP flags in one place RDMA/mlx5: Delete create QP flags obfuscation RDMA/mlx5: Initial separation of RAW_PACKET QP from common flow RDMA/mlx5: Remove second copy from user for non RSS RAW QPs RDMA/mlx5: Move DRIVER QP flags check into separate function RDMA/mlx5: Update all DRIVER QP places to use QP subtype RDMA/mlx5: Split scatter CQE configuration for DCT QP RDMA/mlx5: Separate create QP flows to be based on type RDMA/mlx5: Set QP subtype immediately when it is known RDMA/mlx5: Avoid setting redundant NULL for XRC QPs RDMA/mlx5: Prepare QP allocation for future removal RDMA/mlx5: Perform check if QP creation flow is valid RDMA/mlx5: Delete impossible GSI port check RDMA/mlx5: Organize QP types checks in one place Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 21:44:51 -03:00
Leon Romanovsky	37518fa49f	RDMA/mlx5: Process all vendor flags in one place Check that vendor flags provided through ucmd are valid. Link: https://lore.kernel.org/r/20200427154636.381474-19-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:24 -03:00
Leon Romanovsky	a8f3ea61e1	RDMA/mlx5: Return all configured create flags through query QP The "flags" field in struct mlx5_ib_qp contains all UAPI flags configured at the create QP stage. Return all the data as is without masking. Link: https://lore.kernel.org/r/20200427154636.381474-18-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:24 -03:00
Leon Romanovsky	90ecb37a75	RDMA/mlx5: Change scatter CQE flag to be set like other vendor flags In similar way to wqe_sig, the scat_cqe was treated differently from other create QP vendor flags. Change it to be similar to other flags and use flags_en mechanism. Link: https://lore.kernel.org/r/20200427154636.381474-17-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:23 -03:00
Leon Romanovsky	c95e6d5397	RDMA/mlx5: Use flags_en mechanism to mark QP created with WQE signature MLX5_QP_FLAG_SIGNATURE is exposed to the users but in the kernel the create_qp flow treated it differently from other MLX5_QP_FLAG_*s. Fix it by ditching wq_sig boolean variable and use general flag_en mechanism. Link: https://lore.kernel.org/r/20200427154636.381474-16-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:23 -03:00
Leon Romanovsky	2978975ce7	RDMA/mlx5: Process create QP flags in one place create_flags is checked in too many places and scattered across all the code, consolidate all the checks inside one function, so we will be easily see the flow. As part of such change, delete unreachable code, because IB/core is responsible sanitize the input. Link: https://lore.kernel.org/r/20200427154636.381474-15-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:23 -03:00
Leon Romanovsky	2be08c308f	RDMA/mlx5: Delete create QP flags obfuscation There is no point in redefinition of stable and exposed to users create flags. Their values won't be changed and it is equal to used by the mlx5. Delete the mlx5 definitions and use IB/core fields. Link: https://lore.kernel.org/r/20200427154636.381474-14-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:23 -03:00
Leon Romanovsky	5d0dc3d96c	RDMA/mlx5: Initial separation of RAW_PACKET QP from common flow Create initial function for IB_QPT_RAW_PACKET flow. Link: https://lore.kernel.org/r/20200427154636.381474-13-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:22 -03:00
Leon Romanovsky	2dfac92dbb	RDMA/mlx5: Remove second copy from user for non RSS RAW QPs Change the common code to use already copied user command buffer. Link: https://lore.kernel.org/r/20200427154636.381474-12-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:22 -03:00
Leon Romanovsky	2fdddbd5c9	RDMA/mlx5: Move DRIVER QP flags check into separate function Perform validation of DRIVER QP in relevant function. Link: https://lore.kernel.org/r/20200427154636.381474-11-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:22 -03:00
Leon Romanovsky	8bde2c509e	RDMA/mlx5: Update all DRIVER QP places to use QP subtype Instead of overwriting QP init attributes with driver QP subtype, use that subtype directly. This change will allow us to remove logic which cached QP init attributes. Link: https://lore.kernel.org/r/20200427154636.381474-10-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:22 -03:00
Leon Romanovsky	fd9dab7edc	RDMA/mlx5: Split scatter CQE configuration for DCT QP DCT QPs have separate creation flow and can be easily extracted from configure_responder_scat_cqe(), this makes both updated functions more clear. Link: https://lore.kernel.org/r/20200427154636.381474-9-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:22 -03:00
Leon Romanovsky	47c806121a	RDMA/mlx5: Separate create QP flows to be based on type Move driver QP creation flow to separate functions to simplify the create_qp() and allow future separation of create_qp_common() to subtypes. Link: https://lore.kernel.org/r/20200427154636.381474-8-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:21 -03:00
Leon Romanovsky	318d2b06fb	RDMA/mlx5: Set QP subtype immediately when it is known There is no need to delay QP subtype assignment to the end of the create_qp() function and it is better to move it to be immediately after it is checked so we would be able to rewrite later checks to be based on it and not on over-written struct ib_qp_init_attr. Link: https://lore.kernel.org/r/20200427154636.381474-7-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:21 -03:00
Leon Romanovsky	c86936e6eb	RDMA/mlx5: Avoid setting redundant NULL for XRC QPs There is no need to set NULL in recv_cq and send_cq, they are already set to NULL by the IB/core logic. Link: https://lore.kernel.org/r/20200427154636.381474-6-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:21 -03:00
Leon Romanovsky	9c2ba4ede4	RDMA/mlx5: Prepare QP allocation for future removal Unify the QP memory allocation across different paths, so it will be in one place. Link: https://lore.kernel.org/r/20200427154636.381474-5-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:21 -03:00
Leon Romanovsky	2242cc25ce	RDMA/mlx5: Perform check if QP creation flow is valid Fast check that kernel and user flows provides enough data to create QP. Link: https://lore.kernel.org/r/20200427154636.381474-4-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:20 -03:00
Leon Romanovsky	1265d9f7a5	RDMA/mlx5: Delete impossible GSI port check GSI QP is created in the kernel with very strict parameters, there is no possible way that port number will be wrong in such flow. Link: https://lore.kernel.org/r/20200427154636.381474-3-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:20 -03:00
Leon Romanovsky	6eb7edffb2	RDMA/mlx5: Organize QP types checks in one place Perform check if QP type is supported in one place at the beginning of the create_qp function instead of current implementation with checks buried inside of the code. Link: https://lore.kernel.org/r/20200427154636.381474-2-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-28 20:42:20 -03:00
Raed Salem	244faedfd4	net/mlx5: Refactor imm_inval_pkey field in cqe struct The imm_inval_pkey field can hold four different types of data, depends on the usage, the data could be one of the below: - Immediate field of the received message - Invalidate rkey - Pkey of the packet - Flow table metadata Current implementation doesn't reflect the intended usage of the field at usage time. Reflect the different types by replace this field with a union, modify code where this field is used to reflect its intended usage. Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-04-28 12:45:15 -07:00
Erez Shitrit	dff8e2d152	net/mlx5: Use aligned variable while allocating ICM memory The alignment value is part of the input structure, so use it and spare extra memory allocation when is not needed. Now, using the new ability when allocating icm for Direct-Rule insertion. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-04-28 12:45:10 -07:00
Huy Nguyen	d65dbedfd2	net/mlx5: Add support for COPY steering action Add COPY type to modify_header action. IPsec feature is the first feature that needs COPY steering action. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com>	2020-04-28 12:44:44 -07:00
Lang Cheng	a97bf49f82	RDMA/hns: Simplify the status judgment code of hns_roce_v1_m_qp() Use status table to reduce cyclomatic complexity. Link: https://lore.kernel.org/r/1586938475-37049-7-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-24 10:19:11 -03:00
Lang Cheng	357f342946	RDMA/hns: Simplify the state judgment code of qp Use state table to make the qp state migrate code more readable. Link: https://lore.kernel.org/r/1586938475-37049-6-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-24 10:19:11 -03:00
Lang Cheng	7c044adca2	RDMA/hns: Simplify the cqe code of poll cq Encapsulate codes to get status of cqe into a function and use map table instead of switch-case to reduce cyclomatic complexity of hns_roce_v2_poll_one(). Link: https://lore.kernel.org/r/1586938475-37049-5-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-24 10:19:10 -03:00
Lang Cheng	a3de9e8381	RDMA/hns: Simplify the qp state convert code Use type map table to reduce the cyclomatic complexity. Link: https://lore.kernel.org/r/1586938475-37049-4-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-24 10:19:10 -03:00
Lijun Ou	375898e83d	RDMA/hns: Optimize hns_roce_v2_set_mac() Removes the unnecessary memset opertaion and adjust style of some lines in hns_roce_v2_set_mac(). Link: https://lore.kernel.org/r/1586938475-37049-3-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-24 10:19:10 -03:00
Lijun Ou	9976ea27b5	RDMA/hns: Optimize hns_roce_config_link_table() Remove the unnecessary memset operation and adjust style of some lines in hns_roce_config_link_table(). Link: https://lore.kernel.org/r/1586938475-37049-2-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-24 10:19:10 -03:00
Leon Romanovsky	e0b4b4722d	net/mlx5: Update transobj.c new cmd interface Do mass update of transobj.c to reuse newly introduced mlx5_cmd_exec_in*() interfaces. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-04-23 21:42:16 +03:00
Leon Romanovsky	5d1c9a114a	net/mlx5: Update vport.c to new cmd interface Do mass update of vport.c to reuse newly introduced mlx5_cmd_exec_in*() interfaces. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-04-23 21:42:02 +03:00
Leon Romanovsky	322f3d45a1	RDMA/bnxt: Delete 'nq_ptr' variable which is not used The variable "nq_ptr" is set but never used, this generates the following warning while compiling kernel with W=1 option. drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_service_nq': drivers/infiniband/hw/bnxt_re/qplib_fp.c:303:25: warning: variable 'nq_ptr' set but not used [-Wunused-but-set-variable] 303 \| struct nq_base nqe, *nq_ptr; \| Fixes: `fddcbbb02a` ("RDMA/bnxt_re: Simplify obtaining queue entry from hw ring") Link: https://lore.kernel.org/r/20200419132046.123887-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-22 17:05:13 -03:00
Xi Wang	744b7bdfa7	RDMA/hns: Support 0 hop addressing for CQE buffer Add the zero hop addressing support by using mtr interface for CQE buffer, so the hns driver can support addressing hopnum between 0 to 3 for CQE. Link: https://lore.kernel.org/r/1586779091-51410-7-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-22 16:22:11 -03:00
Xi Wang	6fd610c573	RDMA/hns: Support 0 hop addressing for SRQ buffer Add the zero hop addressing support by using mtr interface for SRQ buffer, so the hns driver can support addressing hopnum between 0 to 3 for SRQ. Link: https://lore.kernel.org/r/1586779091-51410-6-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-22 16:21:37 -03:00
Xi Wang	d563099e3e	RDMA/hns: Support 0 hop addressing for WQE buffer Add the zero hop addressing support by using new mtr interface for WQE buffer and simple mtr invoking process, so WQE buffer can support hopnum between 0 to 3. Link: https://lore.kernel.org/r/1586779091-51410-5-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-22 15:59:54 -03:00
Xi Wang	477a0a3870	RDMA/hns: Optimize 0 hop addressing for EQE buffer Use the new mtr interface to simple the hop 0 addressing and multihop addressing process. Link: https://lore.kernel.org/r/1586779091-51410-4-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-22 15:59:54 -03:00
Xi Wang	cc23267aed	RDMA/hns: Optimize hns buffer allocation flow When the value of nbufs is 1, the buffer is in direct mode, which may cause confusion. So optimizes current codes to make it easier to maintain and understand. Link: https://lore.kernel.org/r/1586779091-51410-3-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-22 15:59:54 -03:00
Xi Wang	3c873161a0	RDMA/hns: Add support for addressing when hopnum is 0 Currently, WQE and EQE table have already used the mtr interface to config and access memory by multi-hop addressing when hopnum is from 1 to 3. But if hopnum is 0, each table need write its own but repetitive logic, and many duplicate code exists in the mtr interfaces invoke process. So wraps the public logic as 3 functions: hns_roce_mtr_create(), hns_roce_mtr_destroy() and hns_roce_mtr_map() to support hopnum ranges from 0 to 3. In addition, makes the mtr interfaces easier to use. Link: https://lore.kernel.org/r/1586779091-51410-2-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-22 15:59:53 -03:00
Aharon Landau	2d7e3ff7b6	RDMA/mlx5: Set GRH fields in query QP on RoCE GRH fields such as sgid_index, hop limit, et. are set in the QP context when QP is created/modified. Currently, when query QP is performed, we fill the GRH fields only if the GRH bit is set in the QP context, but this bit is not set for RoCE. Adjust the check so we will set all relevant data for the RoCE too. Since this data is returned to userspace, the below is an ABI regression. Fixes: `d8966fcd4c` ("IB/core: Use rdma_ah_attr accessor functions") Link: https://lore.kernel.org/r/20200413132028.930109-1-leon@kernel.org Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-22 15:43:46 -03:00
Leon Romanovsky	333fbaa025	net/mlx5: Move QP logic to mlx5_ib The mlx5_core doesn't need any functionality coded in qp.c, so move that file to drivers/infiniband/ be under mlx5_ib responsibility. Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-04-19 15:53:21 +03:00
Leon Romanovsky	42f9bbd112	RDMA/mlx5: Alphabetically sort build artifacts Sort .o objects in makefile to make addition of new object less cumbersome. Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-04-19 15:53:20 +03:00
Leon Romanovsky	9c275ee4ad	net/mlx5: Delete not-used cmd header The structures defined in the cmd header are not used and can be safely removed from the driver. This patch removes that file and deletes all relevant includes. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-04-19 15:53:20 +03:00
Leon Romanovsky	bfd745f8f3	RDMA/mlx5: Delete Q counter allocations command Remove mlx5_ib implementation of Q counter allocation logic together with cleaning boolean which controlled validity of the counter. It is not needed, because counter_id == 0 means that counter is not valid. Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-04-19 15:53:20 +03:00
Leon Romanovsky	66247fbb28	net/mlx5: Remove Q counter low level helper APIs mlx5 core users are encouraged to use low level API (mlx5_cmd_exec) without the need of helper functions, do this for q counters, remove helper functions and call mlx5_cmd_exec directly from users. This will help reduce the total amount of code and reduction of the mlx5_core symbol table. Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-04-19 15:53:20 +03:00
Alaa Hleihel	c08cfb2d8d	RDMA/mlx4: Initialize ib_spec on the stack Initialize ib_spec on the stack before using it, otherwise we will have garbage values that will break creating default rules with invalid parsing error. Fixes: `a37a1a4284` ("IB/mlx4: Add mechanism to support flow steering over IB links") Link: https://lore.kernel.org/r/20200413132235.930642-1-leon@kernel.org Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-14 20:42:31 -03:00
Devesh Sharma	8ce111d00e	RDMA/bnxt_re: Remove dead code from rcfw In the previous refactoring serise there were few leftover functions which are not is use anymore. Removed them as it is a dead code. Fixes: `6f53196bc5` ("RDMA/bnxt_re: Refactor doorbell management functions") Link: https://lore.kernel.org/r/1585851136-2316-5-git-send-email-devesh.sharma@broadcom.com Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-14 16:39:35 -03:00
Devesh Sharma	fddcbbb02a	RDMA/bnxt_re: Simplify obtaining queue entry from hw ring Restructring the data path and control path queue management code to simplify the way a queue element is extracted from the hardware ring. Introduced a new function which will give a pointer to the next ring item depending upon the current cons/prod index in the hardware queue. Further, there are hardcoding when size of queue entry is calculated, replacing it with an inline function. This function would be easier to expand if need going forward. The code section to initialize the PSN search areas has also been restructured and couple of functions has been added there. Link: https://lore.kernel.org/r/1585851136-2316-4-git-send-email-devesh.sharma@broadcom.com Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-14 16:39:35 -03:00
Devesh Sharma	c78671a4e6	RDMA/bnxt_re: Update missing hsi data structures Adding fast path support data structure into hardware HSI. These structures are header only definition of RQE/SRQE/SQE. This is to help calculating the size of hardware wqe size. Link: https://lore.kernel.org/r/1585851136-2316-3-git-send-email-devesh.sharma@broadcom.com Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-14 16:39:35 -03:00
Devesh Sharma	99bf84e24e	RDMA/bnxt_re: Reduce device page size detection code Getting rid of the repeated code in the driver when deciding on the page size of the hardware ring memory. A new common function would translate the ring page size into device specific page size. Link: https://lore.kernel.org/r/1585851136-2316-2-git-send-email-devesh.sharma@broadcom.com Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-14 16:39:34 -03:00
Zou Wei	4f95308911	IB/qib: Remove unused variable ret This patch fixes below warnings reported by coccicheck drivers/infiniband/hw/qib/qib_iba7322.c:6878:8-11: Unneeded variable: "ret". Return "0" on line 6907 drivers/infiniband/hw/qib/qib_iba7322.c:2378:5-8: Unneeded variable: "ret". Return "0" on line 2513 Link: https://lore.kernel.org/r/1586745724-107477-1-git-send-email-zou_wei@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-14 16:34:17 -03:00
Yishai Hadas	cf26deff90	RDMA/mlx5: Fix udata response upon SRQ creation Fix udata response upon SRQ creation to use the UAPI structure (i.e. mlx5_ib_create_srq_resp). It did not zero the reserved field in userspace. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Link: https://lore.kernel.org/r/20200406173540.1466477-1-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-14 15:56:34 -03:00
Colin Ian King	f70968f05d	i40iw: fix null pointer dereference on a null wqe pointer Currently the null check for wqe is incorrect and lets a null wqe be passed to set_64bit_val and this indexes into the null pointer causing a null pointer dereference. Fix this by fixing the null pointer check to return an error if wqe is null. Link: https://lore.kernel.org/r/20200401224921.405279-1-colin.king@canonical.com Addresses-Coverity: ("dereference after a null check") Fixes: `4b34e23f4e` ("i40iw: Report correct firmware version") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-04-14 15:44:50 -03:00
Linus Torvalds	919dce2470	RDMA 5.7 pull request The majority of the patches are cleanups, refactorings and clarity improvements - Various driver updates for siw, bnxt_re, rxe, efa, mlx5, hfi1 - Lots of cleanup patches for hns - Convert more places to use refcount - Aggressively lock the RDMA CM code that syzkaller says isn't working - Work to clarify ib_cm - Use the new ib_device lifecycle model in bnxt_re - Fix mlx5's MR cache which seems to be failing more often with the new ODP code - mlx5 'dynamic uar' and 'tx steering' user interfaces -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl6CSr0ACgkQOG33FX4g mxrtKg//XovbOfYAO7nC05FtGz9iEkIUBiwQOjUojgNSi6RMNDqRW1bmqKUugm1o 9nXA6tw+fueEvUNSD541SCcxkUZJzWvubO9wHB6N3Fgy68N3Vf2rKV3EBBTh99rK Cb7rnmTTN6izRyI1wdyP2sjDJGyF8zvsgIrG2sibzLnqlyScrnD98YS0FdPZUfOa a1mmXBN/T7eaQ4TbE3lLLzGnifRlGmZ5vxEvOQmAlOdqAfIKQdbbW7oCRLVIleso gfQlOOvIgzHRwQ3VrFa3i6ETYtywXq7EgmQxCjqPVJQjCA79n5TLBkP1iRhvn8xi 3+LO4YCkiSJ/NjTA2d9KwT6K4djj3cYfcueuqo2MoXXr0YLiY6TLv1OffKcUIq7c LM3d4CSwIAG+C2FZwaQrdSGa2h/CNfLAEeKxv430zggeDNKlwHJPV5w3rUJ8lT56 wlyT7Lzosl0O9Z/1BSLYckTvbBCtYcmanVyCfHG8EJKAM1/tXy5LS8btJ3e51rPu XekR9ELrTTA2CTuoSCQGP6J0dBD2U7qO4XRCQ9N5BYLrI6RdP7Z4xYzzey49Z3Cx JaF86eurM7nS5biUszTtwww8AJMyYicB+0VyjBfk+mhv90w8tS1vZ1aZKzaQ1L6Z jWn8WgIN4rWY0YGQs6PiovT1FplyGs3p1wNmjn92WO0wZZ3WsmQ= =ae+a -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "The majority of the patches are cleanups, refactorings and clarity improvements. This cycle saw some more activity from Syzkaller, I think we are now clean on all but one of those bugs, including the long standing and obnoxious rdma_cm locking design defect. Continue to see many drivers getting cleanups, with a few new user visible features. Summary: - Various driver updates for siw, bnxt_re, rxe, efa, mlx5, hfi1 - Lots of cleanup patches for hns - Convert more places to use refcount - Aggressively lock the RDMA CM code that syzkaller says isn't working - Work to clarify ib_cm - Use the new ib_device lifecycle model in bnxt_re - Fix mlx5's MR cache which seems to be failing more often with the new ODP code - mlx5 'dynamic uar' and 'tx steering' user interfaces" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (144 commits) RDMA/bnxt_re: make bnxt_re_ib_init static IB/qib: Delete struct qib_ivdev.qp_rnd RDMA/hns: Fix uninitialized variable bug RDMA/hns: Modify the mask of QP number for CQE of hip08 RDMA/hns: Reduce the maximum number of extend SGE per WQE RDMA/hns: Reduce PFC frames in congestion scenarios RDMA/mlx5: Add support for RDMA TX flow table net/mlx5: Add support for RDMA TX steering IB/hfi1: Call kobject_put() when kobject_init_and_add() fails IB/hfi1: Fix memory leaks in sysfs registration and unregistration IB/mlx5: Move to fully dynamic UAR mode once user space supports it IB/mlx5: Limit the scope of struct mlx5_bfreg_info to mlx5_ib IB/mlx5: Extend QP creation to get uar page index from user space IB/mlx5: Extend CQ creation to get uar page index from user space IB/mlx5: Expose UAR object and its alloc/destroy commands IB/hfi1: Get rid of a warning RDMA/hns: Remove redundant judgment of qp_type RDMA/hns: Remove redundant assignment of wc->smac when polling cq RDMA/hns: Remove redundant qpc setup operations RDMA/hns: Remove meaningless prints ...	2020-04-01 18:18:18 -07:00
Linus Torvalds	29d9f30d4c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from David Miller: "Highlights: 1) Fix the iwlwifi regression, from Johannes Berg. 2) Support BSS coloring and 802.11 encapsulation offloading in hardware, from John Crispin. 3) Fix some potential Spectre issues in qtnfmac, from Sergey Matyukevich. 4) Add TTL decrement action to openvswitch, from Matteo Croce. 5) Allow paralleization through flow_action setup by not taking the RTNL mutex, from Vlad Buslov. 6) A lot of zero-length array to flexible-array conversions, from Gustavo A. R. Silva. 7) Align XDP statistics names across several drivers for consistency, from Lorenzo Bianconi. 8) Add various pieces of infrastructure for offloading conntrack, and make use of it in mlx5 driver, from Paul Blakey. 9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki. 10) Lots of parallelization improvements during configuration changes in mlxsw driver, from Ido Schimmel. 11) Add support to devlink for generic packet traps, which report packets dropped during ACL processing. And use them in mlxsw driver. From Jiri Pirko. 12) Support bcmgenet on ACPI, from Jeremy Linton. 13) Make BPF compatible with RT, from Thomas Gleixnet, Alexei Starovoitov, and your's truly. 14) Support XDP meta-data in virtio_net, from Yuya Kusakabe. 15) Fix sysfs permissions when network devices change namespaces, from Christian Brauner. 16) Add a flags element to ethtool_ops so that drivers can more simply indicate which coalescing parameters they actually support, and therefore the generic layer can validate the user's ethtool request. Use this in all drivers, from Jakub Kicinski. 17) Offload FIFO qdisc in mlxsw, from Petr Machata. 18) Support UDP sockets in sockmap, from Lorenz Bauer. 19) Fix stretch ACK bugs in several TCP congestion control modules, from Pengcheng Yang. 20) Support virtual functiosn in octeontx2 driver, from Tomasz Duszynski. 21) Add region operations for devlink and use it in ice driver to dump NVM contents, from Jacob Keller. 22) Add support for hw offload of MACSEC, from Antoine Tenart. 23) Add support for BPF programs that can be attached to LSM hooks, from KP Singh. 24) Support for multiple paths, path managers, and counters in MPTCP. From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti, and others. 25) More progress on adding the netlink interface to ethtool, from Michal Kubecek" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits) net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline cxgb4/chcr: nic-tls stats in ethtool net: dsa: fix oops while probing Marvell DSA switches net/bpfilter: remove superfluous testing message net: macb: Fix handling of fixed-link node net: dsa: ksz: Select KSZ protocol tag netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write net: stmmac: add EHL 2.5Gbps PCI info and PCI ID net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID net: stmmac: create dwmac-intel.c to contain all Intel platform net: dsa: bcm_sf2: Support specifying VLAN tag egress rule net: dsa: bcm_sf2: Add support for matching VLAN TCI net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT net: dsa: bcm_sf2: Disable learning for ASP port net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge net: dsa: b53: Prevent tagged VLAN on port 7 for 7278 net: dsa: b53: Restore VLAN entries upon (re)configuration net: dsa: bcm_sf2: Fix overflow checks hv_netvsc: Remove unnecessary round_up for recv_completion_cnt ...	2020-03-31 17:29:33 -07:00
Linus Torvalds	a776c270a0	Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The EFI changes in this cycle are much larger than usual, for two (positive) reasons: - The GRUB project is showing signs of life again, resulting in the introduction of the generic Linux/UEFI boot protocol, instead of x86 specific hacks which are increasingly difficult to maintain. There's hope that all future extensions will now go through that boot protocol. - Preparatory work for RISC-V EFI support. The main changes are: - Boot time GDT handling changes - Simplify handling of EFI properties table on arm64 - Generic EFI stub cleanups, to improve command line handling, file I/O, memory allocation, etc. - Introduce a generic initrd loading method based on calling back into the firmware, instead of relying on the x86 EFI handover protocol or device tree. - Introduce a mixed mode boot method that does not rely on the x86 EFI handover protocol either, and could potentially be adopted by other architectures (if another one ever surfaces where one execution mode is a superset of another) - Clean up the contents of 'struct efi', and move out everything that doesn't need to be stored there. - Incorporate support for UEFI spec v2.8A changes that permit firmware implementations to return EFI_UNSUPPORTED from UEFI runtime services at OS runtime, and expose a mask of which ones are supported or unsupported via a configuration table. - Partial fix for the lack of by-VA cache maintenance in the decompressor on 32-bit ARM. - Changes to load device firmware from EFI boot service memory regions - Various documentation updates and minor code cleanups and fixes" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits) efi/libstub/arm: Fix spurious message that an initrd was loaded efi/libstub/arm64: Avoid image_base value from efi_loaded_image partitions/efi: Fix partition name parsing in GUID partition entry efi/x86: Fix cast of image argument efi/libstub/x86: Use ULONG_MAX as upper bound for all allocations efi: Fix a mistype in comments mentioning efivar_entry_iter_begin() efi/libstub: Avoid linking libstub/lib-ksyms.o into vmlinux efi/x86: Preserve %ebx correctly in efi_set_virtual_address_map() efi/x86: Ignore the memory attributes table on i386 efi/x86: Don't relocate the kernel unless necessary efi/x86: Remove extra headroom for setup block efi/x86: Add kernel preferred address to PE header efi/x86: Decompress at start of PE image load address x86/boot/compressed/32: Save the output address instead of recalculating it efi/libstub/x86: Deal with exit() boot service returning x86/boot: Use unsigned comparison for addresses efi/x86: Avoid using code32_start efi/x86: Make efi32_pe_entry() more readable efi/x86: Respect 32-bit ABI in efi32_pe_entry() efi/x86: Annotate the LOADED_IMAGE_PROTOCOL_GUID with SYM_DATA ...	2020-03-30 16:13:08 -07:00
YueHaibing	b4d8ddf835	RDMA/bnxt_re: make bnxt_re_ib_init static Fix sparse warning: drivers/infiniband/hw/bnxt_re/main.c:1313:5: warning: symbol 'bnxt_re_ib_init' was not declared. Should it be static? Link: https://lore.kernel.org/r/20200330110219.24448-1-yuehaibing@huawei.com Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-30 15:03:19 -03:00
Saeed Mahameed	e999a7343d	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: mlx5: Remove uninitialized use of key in mlx5_core_create_mkey {IB,net}/mlx5: Move asynchronous mkey creation to mlx5_ib {IB,net}/mlx5: Assign mkey variant in mlx5_ib only {IB,net}/mlx5: Setup mkey variant before mr create command invocation Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-29 23:42:11 -07:00
David S. Miller	f0b5989745	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Minor comment conflict in mac80211. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-29 21:25:29 -07:00
George Spelvin	3e87f43130	IB/qib: Delete struct qib_ivdev.qp_rnd I was checking the field to see if it needed the full get_random_bytes() and discovered it's unused. Only compile-tested, as I don't have the hardware, but I'm still pretty confident. Link: https://lore.kernel.org/r/202003281643.02SGh6eG002694@sdf.org Signed-off-by: George Spelvin <lkml@sdf.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-29 11:15:16 -03:00
Gustavo A. R. Silva	d35dc58dd2	RDMA/hns: Fix uninitialized variable bug There is a potential execution path in which variable ret is returned without being properly initialized, previously. Fix this by initializing variable ret to 0. Link: https://lore.kernel.org/r/20200328023539.GA32016@embeddedor Addresses-Coverity-ID: 1491917 ("Uninitialized scalar variable") Fixes: `2f49de21f3` ("RDMA/hns: Optimize mhop get flow for multi-hop addressing") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-29 11:09:55 -03:00
Lang Cheng	90e735aecc	RDMA/hns: Modify the mask of QP number for CQE of hip08 The hip08 supports up to 1M QPs, so the qpn mask of cqe should be modified. Link: https://lore.kernel.org/r/1585194018-4381-4-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-29 11:04:21 -03:00
Lang Cheng	019cd05ce5	RDMA/hns: Reduce the maximum number of extend SGE per WQE Just reduce the default number to 64 for backward compatibility, the driver can still get this configuration from the firmware. Link: https://lore.kernel.org/r/1585194018-4381-3-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-29 11:04:21 -03:00
Jihua Tao	9d04d56c47	RDMA/hns: Reduce PFC frames in congestion scenarios The original value means sending 16 packets at a time, and it should be configured to 0 which means sending 1 packet instead. It is modified to reduce the number of PFC frames to make sure the performance meets expectations when flow control is enabled on hip08. Link: https://lore.kernel.org/r/1585194018-4381-2-git-send-email-liweihang@huawei.com Signed-off-by: Jihua Tao <taojihua4@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-29 11:04:21 -03:00
Jason Gunthorpe	dbdf8909d0	Merge branch 'mlx5_tx_steering' into rdma.git for-next Leon Romanovsky says: ==================== Those two patches from Michael extends mlx5_core and mlx5_ib flow steering to support RDMA TX in similar way to already supported RDMA RX. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Due to dependencies * branch 'mlx5_tx_steering': RDMA/mlx5: Add support for RDMA TX flow table net/mlx5: Add support for RDMA TX steering	2020-03-27 13:26:59 -03:00
Michael Guralnik	af9c38411d	RDMA/mlx5: Add support for RDMA TX flow table Enable user application to add rules for RDMA TX steering table. Rules in this steering table will allow to steer transmitted RDMA traffic. Link: https://lore.kernel.org/r/20200324061425.1570190-3-leon@kernel.org Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-27 13:24:48 -03:00
Kaike Wan	dfb5394f80	IB/hfi1: Call kobject_put() when kobject_init_and_add() fails When kobject_init_and_add() returns an error in the function hfi1_create_port_files(), the function kobject_put() is not called for the corresponding kobject, which potentially leads to memory leak. This patch fixes the issue by calling kobject_put() even if kobject_init_and_add() fails. Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200326163813.21129.44280.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-27 13:13:36 -03:00
Kaike Wan	5c15abc432	IB/hfi1: Fix memory leaks in sysfs registration and unregistration When the hfi1 driver is unloaded, kmemleak will report the following issue: unreferenced object 0xffff8888461a4c08 (size 8): comm "kworker/0:0", pid 5, jiffies 4298601264 (age 2047.134s) hex dump (first 8 bytes): 73 64 6d 61 30 00 ff ff sdma0... backtrace: [<00000000311a6ef5>] kvasprintf+0x62/0xd0 [<00000000ade94d9f>] kobject_set_name_vargs+0x1c/0x90 [<0000000060657dbb>] kobject_init_and_add+0x5d/0xb0 [<00000000346fe72b>] 0xffffffffa0c5ecba [<000000006cfc5819>] 0xffffffffa0c866b9 [<0000000031c65580>] 0xffffffffa0c38e87 [<00000000e9739b3f>] local_pci_probe+0x41/0x80 [<000000006c69911d>] work_for_cpu_fn+0x16/0x20 [<00000000601267b5>] process_one_work+0x171/0x380 [<0000000049a0eefa>] worker_thread+0x1d1/0x3f0 [<00000000909cf2b9>] kthread+0xf8/0x130 [<0000000058f5f874>] ret_from_fork+0x35/0x40 This patch fixes the issue by: - Releasing dd->per_sdma[i].kobject in hfi1_unregister_sysfs(). - This will fix the memory leak. - Calling kobject_put() to unwind operations only for those entries in dd->per_sdma[] whose operations have succeeded (including the current one that has just failed) in hfi1_verbs_register_sysfs(). Cc: <stable@vger.kernel.org> Fixes: `0cb2aa690c` ("IB/hfi1: Add sysfs interface for affinity setup") Link: https://lore.kernel.org/r/20200326163807.21129.27371.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-27 13:13:36 -03:00
Yishai Hadas	0a2fd01c28	IB/mlx5: Move to fully dynamic UAR mode once user space supports it Move to fully dynamic UAR mode once user space supports it. In this case we prevent any legacy mode of UARs on the allocated context and prevent redundant allocation of the static ones. Link: https://lore.kernel.org/r/20200324060143.1569116-6-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-27 12:59:05 -03:00
Leon Romanovsky	2152862298	IB/mlx5: Limit the scope of struct mlx5_bfreg_info to mlx5_ib struct mlx5_bfreg_info is used by mlx5_ib only but is exposed to both RDMA and netdev parts of mlx5 driver. Move that struct to mlx5_ib namespace, clean vertical space alignment and convert lib_uar_4k from bool to bitfield. Link: https://lore.kernel.org/r/20200324060143.1569116-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-27 12:59:04 -03:00
Yishai Hadas	ac42a5ee92	IB/mlx5: Extend QP creation to get uar page index from user space Extend QP creation to get uar page index from user space, this mode can be used with the UAR dynamic mode APIs to allocate/destroy a UAR object. As part of enabling this option blocked the weird/un-supported cross channel option which uses index 0 hard-coded. This QP flag wasn't exposed to user space as part of any formal upstream release, the dynamic option can allow having valid UAR page index instead. Link: https://lore.kernel.org/r/20200324060143.1569116-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-27 12:59:04 -03:00
Yishai Hadas	64d99f6a62	IB/mlx5: Extend CQ creation to get uar page index from user space Extend CQ creation to get uar page index from user space, this mode can be used with the UAR dynamic mode APIs to allocate/destroy a UAR object. Link: https://lore.kernel.org/r/20200324060143.1569116-3-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-27 12:59:04 -03:00
Yishai Hadas	342ee59de9	IB/mlx5: Expose UAR object and its alloc/destroy commands Expose UAR object and its alloc/destroy commands to be used over the ioctl interface by user space applications. This API supports both BF & NC modes and enables a dynamic allocation of UARs once really needed. As the number of driver objects were limited by the core ones when the merged tree is prepared, had to decrease the number of core objects to enable the new UAR object usage. Link: https://lore.kernel.org/r/20200324060143.1569116-2-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-27 12:59:04 -03:00
Weihang Li	e0b0722643	RDMA/hns: Remove redundant judgment of qp_type Type of qp has been checked in check_send_valid(), so this judgment should be removed. Link: https://lore.kernel.org/r/1584674622-52773-11-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-26 16:52:29 -03:00
Weihang Li	cd4a70bb7d	RDMA/hns: Remove redundant assignment of wc->smac when polling cq The field smac in ib_wc was used for create AH and then it will be treated as destination mac address in UD sqwqe, but related code about filling smac into AH has been removed in core. Actually, the dmac in UD sqwqe is parsed from the dgid in grh which is passed in by ULP now, so this assignment should be removed. Link: https://lore.kernel.org/r/1584674622-52773-10-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-26 16:52:29 -03:00
Lang Cheng	f4c5d869c8	RDMA/hns: Remove redundant qpc setup operations Before calling modify_qp_reset_to_init(), the entire qpc mask has been cleared, so it is no longer necessary to clear the specific fields in the mask. Link: https://lore.kernel.org/r/1584674622-52773-9-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-26 16:52:28 -03:00
Wenpeng Liang	bceda6e67b	RDMA/hns: Remove meaningless prints ceq and aeq is a ring buffer, consumer index of them will be set to zero after reaching the maximum value. The warning should be removed or it may mislead the users. Link: https://lore.kernel.org/r/1584674622-52773-8-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-26 16:52:28 -03:00
Lang Cheng	f91b919687	RDMA/hns: Remove definition of cq doorbell structure The struct hns_roce_v2_cq_db is unused, it should be removed. Link: https://lore.kernel.org/r/1584674622-52773-7-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-26 16:52:28 -03:00
Lang Cheng	fd72926c33	RDMA/hns: Adjust the qp status value sequence of the hardware Interchange SQD and SQE to match the protocol. Link: https://lore.kernel.org/r/1584674622-52773-6-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-26 16:52:28 -03:00
Lijun Ou	99e713f8da	RDMA/hns: Optimize hns_roce_alloc_vf_resource() The capbilities of hardware should be got at first and then used in hns_roce_alloc_vf_resource(). Also removes an unnecessary if ... else condition in it. Link: https://lore.kernel.org/r/1584674622-52773-5-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-26 16:52:27 -03:00
Lang Cheng	d398d4ca5f	RDMA/hns: Simplify attribute judgment code Combine attribute flags before masking them. Link: https://lore.kernel.org/r/1584674622-52773-4-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-26 16:52:27 -03:00
Weihang Li	30d41e18c3	RDMA/hns: Fix a wrong judgment of return value hns_roce_alloc_mtt_range() never return -1, ret should be checked whether it is zero instead of -1. Fixes: `1ceb0b11a8` ("RDMA/hns: Fix non-standard error codes") Link: https://lore.kernel.org/r/1584674622-52773-3-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-26 16:52:27 -03:00
Lijun Ou	ae1c61489c	RDMA/hns: Unify format of prints Use ibdev_err/dbg/warn() instead of dev_err/dbg/warn(), and modify some prints into format of "failed to do something, ret = n". Link: https://lore.kernel.org/r/1584674622-52773-2-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-26 16:52:26 -03:00
Takashi Iwai	23ab5261e2	IB/hfi1: Use scnprintf() for avoiding potential buffer overflow Since snprintf() returns the would-be-output size instead of the actual output size, the succeeding calls may go beyond the given buffer limit. Fix it by replacing with scnprintf(). Link: https://lore.kernel.org/r/20200319154641.23711-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-26 15:06:14 -03:00
Maor Gottlieb	ba80013fba	RDMA/mlx5: Block delay drop to unprivileged users It has been discovered that this feature can globally block the RX port, so it should be allowed for highly privileged users only. Fixes: 03404e8ae652("IB/mlx5: Add support to dropless RQ") Link: https://lore.kernel.org/r/20200322124906.1173790-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-25 09:56:30 -03:00
Yishai Hadas	1f3db16188	IB/mlx5: Generally use the WC auto detection test result Now that we have direct and reliable detection of WC support by the system, use is broadly. The only case we have to worry about is when the WC autodetector cannot run. For this fringe case generally assume that that WC is available, except in the well defined case of no PAT support on x86 which is tested by calling arch_can_pci_mmap_wc(). If WC is wrongly assumed to be available then it causes a small performance hit on paths in userspace that are tuned to the assumption that WC is available. There is no functional loss. It is very unlikely that any platforms exist that lack WC and also care about the micro optimization of WC in the fringe case where autodetection does not work. By removing the fairly bogus CONFIG tests this makes WC work broadly on all arches and all platforms. Link: https://lore.kernel.org/r/20200318100323.46659-1-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-24 20:22:21 -03:00
Xi Wang	38dcb35048	RDMA/hns: Optimize mhop put flow for multi-hop addressing Optimizes hns_roce_table_mhop_get() by encapsulating code about clearing hem into clear_mhop_hem(), which will make the code flow clearer. Link: https://lore.kernel.org/r/1584417324-2255-3-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-24 20:18:56 -03:00
Xi Wang	2f49de21f3	RDMA/hns: Optimize mhop get flow for multi-hop addressing Splits hns_roce_table_mhop_get() into 4 sub-functions to make the code flow clearer. Link: https://lore.kernel.org/r/1584417324-2255-2-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-24 20:18:56 -03:00
Selvin Xavier	b1d56fdcb6	RDMA/bnxt_re: Wait for all the CQ events before freeing CQ data structures Destroy CQ command to firmware returns the num_cnq_events as a response. This indicates the driver about the number of CQ events generated for this CQ. Driver should wait for all these events before freeing the CQ host structures. Also, add routine to clean all the pending notification for the CQs getting destroyed. This avoids the possibility of accessing the CQ data structures after its freed. Fixes: `1ac5a40479` ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Link: https://lore.kernel.org/r/1584120842-3200-1-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-24 20:15:36 -03:00
Leon Romanovsky	950bf4f177	RDMA/mlx5: Fix access to wrong pointer while performing flush due to error The main difference between send and receive SW completions is related to separate treatment of WQ queue. For receive completions, the initial index to be flushed is stored in "tail", while for send completions, it is in deleted "last_poll". CPU: 54 PID: 53405 Comm: kworker/u161:0 Kdump: loaded Tainted: G OE --------- -t - 4.18.0-147.el8.ppc64le #1 Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] NIP: c000003c7c00a000 LR: c00800000e586af4 CTR: c000003c7c00a000 REGS: c0000036cc9db940 TRAP: 0400 Tainted: G OE --------- -t - (4.18.0-147.el8.ppc64le) MSR: 9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24004488 XER: 20040000 CFAR: c00800000e586af0 IRQMASK: 0 GPR00: c00800000e586ab4 c0000036cc9dbbc0 c00800000e5f1a00 c0000037d8433800 GPR04: c000003895a26800 c0000037293f2000 0000000000000201 0000000000000011 GPR08: c000003895a26c80 c000003c7c00a000 0000000000000000 c00800000ed30438 GPR12: c000003c7c00a000 c000003fff684b80 c00000000017c388 c00000396ec4be40 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: c00000000151e498 0000000000000010 c000003895a26848 0000000000000010 GPR24: 0000000000000010 0000000000010000 c000003895a26800 0000000000000000 GPR28: 0000000000000010 c0000037d8433800 c000003895a26c80 c000003895a26800 NIP [c000003c7c00a000] 0xc000003c7c00a000 LR [c00800000e586af4] __ib_process_cq+0xec/0x1b0 [ib_core] Call Trace: [c0000036cc9dbbc0] [c00800000e586ab4] __ib_process_cq+0xac/0x1b0 [ib_core] (unreliable) [c0000036cc9dbc40] [c00800000e586c88] ib_cq_poll_work+0x40/0xb0 [ib_core] [c0000036cc9dbc70] [c000000000171f44] process_one_work+0x2f4/0x5c0 [c0000036cc9dbd10] [c000000000172a0c] worker_thread+0xcc/0x760 [c0000036cc9dbdc0] [c00000000017c52c] kthread+0x1ac/0x1c0 [c0000036cc9dbe30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80 Fixes: `8e3b688301` ("RDMA/mlx5: Delete unreachable handle_atomic code by simplifying SW completion") Link: https://lore.kernel.org/r/20200318091640.44069-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-24 19:54:57 -03:00
Dan Carpenter	a766fa8473	IB/mlx5: Fix a NULL vs IS_ERR() check The kzalloc() function returns NULL, not error pointers. Fixes: `30f2fe40c7` ("IB/mlx5: Introduce UAPIs to manage packet pacing") Link: https://lore.kernel.org/r/20200320132641.GF95012@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-24 19:47:55 -03:00
Mike Marciniszyn	9a293d1e21	IB/hfi1: Ensure pq is not left on waitlist The following warning can occur when a pq is left on the dmawait list and the pq is then freed: WARNING: CPU: 47 PID: 3546 at lib/list_debug.c:29 __list_add+0x65/0xc0 list_add corruption. next->prev should be prev (ffff939228da1880), but was ffff939cabb52230. (next=ffff939cabb52230). Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ast ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev drm_panel_orientation_quirks i2c_i801 mei_me lpc_ich mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci libahci i2c_algo_bit dca libata ptp pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 47 PID: 3546 Comm: wrf.exe Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.41.1.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 Call Trace: [<ffffffff91f65ac0>] dump_stack+0x19/0x1b [<ffffffff91898b78>] __warn+0xd8/0x100 [<ffffffff91898bff>] warn_slowpath_fmt+0x5f/0x80 [<ffffffff91a1dabe>] ? ___slab_alloc+0x24e/0x4f0 [<ffffffff91b97025>] __list_add+0x65/0xc0 [<ffffffffc03926a5>] defer_packet_queue+0x145/0x1a0 [hfi1] [<ffffffffc0372987>] sdma_check_progress+0x67/0xa0 [hfi1] [<ffffffffc03779d2>] sdma_send_txlist+0x432/0x550 [hfi1] [<ffffffff91a20009>] ? kmem_cache_alloc+0x179/0x1f0 [<ffffffffc0392973>] ? user_sdma_send_pkts+0xc3/0x1990 [hfi1] [<ffffffffc0393e3a>] user_sdma_send_pkts+0x158a/0x1990 [hfi1] [<ffffffff918ab65e>] ? try_to_del_timer_sync+0x5e/0x90 [<ffffffff91a3fe1a>] ? __check_object_size+0x1ca/0x250 [<ffffffffc0395546>] hfi1_user_sdma_process_request+0xd66/0x1280 [hfi1] [<ffffffffc034e0da>] hfi1_aio_write+0xca/0x120 [hfi1] [<ffffffff91a4245b>] do_sync_readv_writev+0x7b/0xd0 [<ffffffff91a4409e>] do_readv_writev+0xce/0x260 [<ffffffff918df69f>] ? pick_next_task_fair+0x5f/0x1b0 [<ffffffff918db535>] ? sched_clock_cpu+0x85/0xc0 [<ffffffff91f6b16a>] ? __schedule+0x13a/0x860 [<ffffffff91a442c5>] vfs_writev+0x35/0x60 [<ffffffff91a4447f>] SyS_writev+0x7f/0x110 [<ffffffff91f78ddb>] system_call_fastpath+0x22/0x27 The issue happens when wait_event_interruptible_timeout() returns a value <= 0. In that case, the pq is left on the list. The code continues sending packets and potentially can complete the current request with the pq still on the dmawait list provided no descriptor shortage is seen. If the pq is torn down in that state, the sdma interrupt handler could find the now freed pq on the list with list corruption or memory corruption resulting. Fix by adding a flush routine to ensure that the pq is never on a list after processing a request. A follow-up patch series will address issues with seqlock surfaced in: https://lore.kernel.org/r/20200320003129.GP20941@ziepe.ca The seqlock use for sdma will then be converted to a spin lock since the list_empty() doesn't need the protection afforded by the sequence lock currently in use. Fixes: `a0d406934a` ("staging/rdma/hfi1: Add page lock limit check for SDMA requests") Link: https://lore.kernel.org/r/20200320200200.23203.37777.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-23 21:57:57 -03:00
Leon Romanovsky	fa8a44f6b2	RDMA/efa: Use in-kernel offsetofend() to check field availability Remove custom and duplicated variant of offsetofend(). Link: https://lore.kernel.org/r/20200310091438.248429-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-18 21:06:37 -03:00
Kaike Wan	5ab17a24cb	IB/hfi1: Remove kobj from hfi1_devdata The field kobj was added to hfi1_devdata structure to manage the life time of the hfi1_devdata structure for PSM accesses: commit `e11ffbd575` ("IB/hfi1: Do not free hfi1 cdev parent structure early") Later another mechanism user_refcount/user_comp was introduced to provide the same functionality: commit `acd7c8fe14` ("IB/hfi1: Fix an Oops on pci device force remove") This patch will remove this kobj field, as it is no longer needed. Link: https://lore.kernel.org/r/20200316210500.7753.4145.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-18 19:53:47 -03:00
Lang Cheng	026ded3734	RDMA/hns: Check if depth of qp is 0 before configure Depth of qp shouldn't be allowed to be set to zero, after ensuring that, subsequent process can be simplified. And when qp is changed from reset to reset, the capability of minimum qp depth was used to identify hardware of hip06, it should be changed into a more readable form. Link: https://lore.kernel.org/r/1584006624-11846-1-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-18 19:30:36 -03:00
Sindhu, Devale	4b34e23f4e	i40iw: Report correct firmware version The driver uses a hard-coded value for FW version and reports an inconsistent FW version between ibv_devinfo and /sys/class/infiniband/i40iw/fw_ver. Retrieve the FW version via a Control QP (CQP) operation and report it consistently across sysfs and query device. Fixes: `d374984179` ("i40iw: add files for iwarp interface") Link: https://lore.kernel.org/r/20200313214406.2159-1-shiraz.saleem@intel.com Reported-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Sindhu, Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-18 13:53:44 -03:00
Xi Wang	d6a3627e31	RDMA/hns: Optimize wqe buffer set flow for post send Splits hns_roce_v2_post_send() into three sub-functions: set_rc_wqe(), set_ud_wqe() and update_sq_db() to simplify the code. Link: https://lore.kernel.org/r/1583839084-31579-6-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-18 10:23:12 -03:00
Xi Wang	1133401412	RDMA/hns: Optimize base address table config flow for qp buffer Currently, before the qp is created, a page size needs to be calculated for the base address table to store all base addresses in the mtr. As a result, the parameter configuration of the mtr is complex. So integrate the process of calculating the base table page size into the hem related interface to simplify the process of using mtr. Link: https://lore.kernel.org/r/1583839084-31579-5-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-18 10:23:12 -03:00
Xi Wang	e363f7de4e	RDMA/hns: Optimize the wr opcode conversion from ib to hns Simplify the wr opcode conversion from ib to hns by using a map table instead of the switch-case statement. Link: https://lore.kernel.org/r/1583839084-31579-4-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-18 10:23:12 -03:00
Xi Wang	00a59d30f3	RDMA/hns: Optimize wqe buffer filling process for post send Encapsulates the wqe buffer process details for datagram seg, fast mr seg and atomic seg. Link: https://lore.kernel.org/r/1583839084-31579-3-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-18 10:23:12 -03:00
Xi Wang	6c6e39212b	RDMA/hns: Rename wqe buffer related functions There are serval global functions related to wqe buffer in the hns driver and are called in different files. These symbols cannot directly represent the namespace they belong to. So add prefix 'hns_roce_' to 3 wqe buffer related global functions: get_recv_wqe(), get_send_wqe(), and get_send_extend_sge(). Link: https://lore.kernel.org/r/1583839084-31579-2-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-18 10:23:11 -03:00
Selvin Xavier	4e88cef11d	RDMA/bnxt_re: Remove unnecessary sched count Since the lifetime of bnxt_re_task is controlled by the kref of device, sched_count is no longer required. Remove it. Link: https://lore.kernel.org/r/1584117207-2664-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-17 20:15:03 -03:00
Jason Gunthorpe	8a6c617047	RDMA/bnxt_re: Fix lifetimes in bnxt_re_task A work queue cannot just rely on the ib_device not being freed, it must hold a kref on the memory so that the BNXT_RE_FLAG_IBDEV_REGISTERED check works. Fixes: `1ac5a40479` ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Link: https://lore.kernel.org/r/1584117207-2664-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-17 20:15:03 -03:00
Jason Gunthorpe	3cae58047c	RDMA/bnxt_re: Use ib_device_try_get() There are a couple places in this driver running from a work queue that need the ib_device to be registered. Instead of using a broken internal bit rely on the new core code to guarantee device registration. Link: https://lore.kernel.org/r/1584117207-2664-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-17 20:15:03 -03:00
Weihang Li	9e57a9aa69	RDMA/hns: Fix wrong judgments of udata->outlen These judgments were used to keep the compatibility with older versions of userspace that don't have the field named "cap_flags" in structure hns_roce_ib_create_cq_resp. But it will be wrong to compare outlen with the size of resp if another new field were added in resp. oulen should be compared with the end offset of cap_flags in resp. Fixes: `4f8f0d5e33` ("RDMA/hns: Package the flow of creating cq") Link: https://lore.kernel.org/r/1583845569-47257-1-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-13 11:36:58 -03:00
Jason Gunthorpe	d613bd64c6	Merge branch 'mlx5_mr_cache' into rdma.git for-next Leon Romanovsky says: ==================== This series fixes various corner cases in the mlx5_ib MR cache implementation, see specific commit messages for more information. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Due to dependencies * branch 'mlx5_mr-cache': RDMA/mlx5: Allow MRs to be created in the cache synchronously RDMA/mlx5: Revise how the hysteresis scheme works for cache filling RDMA/mlx5: Fix locking in MR cache work queue RDMA/mlx5: Lock access to ent->available_mrs/limit when doing queue_work RDMA/mlx5: Fix MR cache size and limit debugfs RDMA/mlx5: Always remove MRs from the cache before destroying them RDMA/mlx5: Simplify how the MR cache bucket is located RDMA/mlx5: Rename the tracking variables for the MR cache RDMA/mlx5: Replace spinlock protected write with atomic var {IB,net}/mlx5: Move asynchronous mkey creation to mlx5_ib {IB,net}/mlx5: Assign mkey variant in mlx5_ib only {IB,net}/mlx5: Setup mkey variant before mr create command invocation	2020-03-13 11:11:07 -03:00
Jason Gunthorpe	aad719dcf3	RDMA/mlx5: Allow MRs to be created in the cache synchronously If the cache is completely out of MRs, and we are running in cache mode, then directly, and synchronously, create an MR that is compatible with the cache bucket using a sleeping mailbox command. This ensures that the thread that is waiting for the MR absolutely will get one. When a MR allocated in this way becomes freed then it is compatible with the cache bucket and will be recycled back into it. Deletes the very buggy ent->compl scheme to create a synchronous MR allocation. Link: https://lore.kernel.org/r/20200310082238.239865-13-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-13 11:08:02 -03:00
Jason Gunthorpe	1c78a21a0c	RDMA/mlx5: Revise how the hysteresis scheme works for cache filling Currently if the work queue is running then it is in 'hysteresis' mode and will fill until the cache reaches the high water mark. This implicit state is very tricky and doesn't interact with pending very well. Instead of self re-scheduling the work queue after the add_keys() has started to create the new MR, have the queue scheduled from reg_mr_callback() only after the requested MR has been added. This avoids the bad design of an in-rush of queue'd work doing back to back add_keys() until EAGAIN then sleeping. The add_keys() will be paced one at a time as they complete, slowly filling up the cache. Also, fix pending to be only manipulated under lock. Link: https://lore.kernel.org/r/20200310082238.239865-12-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-13 11:08:02 -03:00
Jason Gunthorpe	b9358bdbc7	RDMA/mlx5: Fix locking in MR cache work queue All of the members of mlx5_cache_ent must be accessed while holding the spinlock, add the missing spinlock in the __cache_work_func(). Using cache->stopped and flush_workqueue() is an inherently racy way to shutdown self-scheduling work on a queue. Replace it with ent->disabled under lock, and always check disabled before queuing any new work. Use cancel_work_sync() to shutdown the queue. Use READ_ONCE/WRITE_ONCE for dev->last_add to manage concurrency as coherency is less important here. Split fill_delay from the bitfield. C bitfield updates are not atomic and this is just a mess. Use READ_ONCE/WRITE_ONCE, but this could also use test_bit()/set_bit(). Link: https://lore.kernel.org/r/20200310082238.239865-11-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-13 11:08:02 -03:00
Jason Gunthorpe	ad2d3ef46d	RDMA/mlx5: Lock access to ent->available_mrs/limit when doing queue_work Accesses to these members needs to be locked. There is no reason not to hold a spinlock while calling queue_work(), so move the tests into a helper and always call it under lock. The helper should be called when available_mrs is adjusted. Link: https://lore.kernel.org/r/20200310082238.239865-10-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-13 11:08:01 -03:00
Jason Gunthorpe	a1d8854aae	RDMA/mlx5: Fix MR cache size and limit debugfs The size_write function is supposed to adjust the total_mr's to match the user's request, but lacks locking and safety checking. total_mrs can only be adjusted by at most available_mrs. mrs already assigned to users cannot be revoked. Ensure that the user provides a target value within the range of available_mrs and within the high/low water mark. limit_write has confusing and wrong sanity checking, and doesn't have the ability to deallocate on limit reduction. Since both functions use the same algorithm to adjust the available_mrs, consolidate it into one function and write it correctly. Fix the locking and by holding the spinlock for all accesses to ent->X. Always fail if the user provides a malformed string. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Link: https://lore.kernel.org/r/20200310082238.239865-9-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-13 11:08:01 -03:00
Jason Gunthorpe	1769c4c575	RDMA/mlx5: Always remove MRs from the cache before destroying them The cache bucket tracks the total number of MRs that exists, both inside and outside of the cache. Removing a MR from the cache (by setting cache_ent to NULL) without updating total_mrs will cause the tracking to leak and be inflated. Further fix the rereg_mr path to always destroy the MR. reg_create will always overwrite all the MR data in mlx5_ib_mr, so the MR must be completely destroyed, in all cases, before this function can be called. Detach the MR from the cache and unconditionally destroy it to avoid leaking HW mkeys. Fixes: `afd1417404` ("IB/mlx5: Use direct mkey destroy command upon UMR unreg failure") Fixes: `56e11d628c` ("IB/mlx5: Added support for re-registration of MRs") Link: https://lore.kernel.org/r/20200310082238.239865-8-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-13 11:08:01 -03:00
Jason Gunthorpe	b91e1751fb	RDMA/mlx5: Simplify how the MR cache bucket is located There are many bad APIs here that are accepting a cache bucket index instead of a bucket pointer. Many of the callers already have a bucket pointer, so this results in a lot of confusing uses of order2idx(). Pass the struct mlx5_cache_ent into add_keys(), remove_keys(), and alloc_cached_mr(). Once the MR is in the cache, store the cache bucket pointer directly in the MR, replacing the 'bool allocated_from cache'. In the end there is only one place that needs to form index from order, alloc_mr_from_cache(). Increase the safety of this function by disallowing it from accessing cache entries in the ODP special area. Link: https://lore.kernel.org/r/20200310082238.239865-7-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-13 11:08:01 -03:00
Jason Gunthorpe	7c8691a396	RDMA/mlx5: Rename the tracking variables for the MR cache The old names do not clearly indicate the intent. Link: https://lore.kernel.org/r/20200310082238.239865-6-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-13 11:08:01 -03:00
Saeed Mahameed	f743ff3b37	RDMA/mlx5: Replace spinlock protected write with atomic var mkey variant calculation was spinlock protected to make it atomic, replace that with one atomic variable. Link: https://lore.kernel.org/r/20200310082238.239865-4-leon@kernel.org Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-13 11:08:00 -03:00
Michael Guralnik	a3cfdd3928	{IB,net}/mlx5: Move asynchronous mkey creation to mlx5_ib As mlx5_ib is the only user of the mlx5_core_create_mkey_cb, move the logic inside mlx5_ib and cleanup the code in mlx5_core. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-03-13 15:48:10 +02:00
Saeed Mahameed	fc6a9f86f0	{IB,net}/mlx5: Assign mkey variant in mlx5_ib only mkey variant is not required for mlx5_core use, move the mkey variant counter to mlx5_ib. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-03-13 15:48:04 +02:00
Saeed Mahameed	54c62e13ad	{IB,net}/mlx5: Setup mkey variant before mr create command invocation On reg_mr_callback() mlx5_ib is recalculating the mkey variant which is wrong and will lead to using a different key variant than the one submitted to firmware on create mkey command invocation. To fix this, we store the mkey variant before invoking the firmware command and use it later on completion (reg_mr_callback). Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-03-13 15:48:00 +02:00
Leon Romanovsky	a762d460a0	RDMA/mlx5: Use offsetofend() instead of duplicated variant Convert mlx5 driver to use offsetofend() instead of its duplicated variant. Link: https://lore.kernel.org/r/20200310091438.248429-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-13 10:45:12 -03:00
Leon Romanovsky	282e79c1c6	RDMA/mlx4: Delete duplicated offsetofend implementation Convert mlx4 to use in-kernel offsetofend() instead of its duplicated implementation. Link: https://lore.kernel.org/r/20200310091438.248429-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-13 10:42:48 -03:00
David S. Miller	1d34357931	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Minor overlapping changes, nothing serious. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-12 22:34:48 -07:00
David S. Miller	bf3347c4d1	Merge branch 'ct-offload' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux	2020-03-12 12:34:23 -07:00
Alex Vesker	41e684ef3f	IB/mlx5: Replace tunnel mpls capability bits for tunnel_offloads Until now the flex parser capability was used in ib_query_device() to indicate tunnel_offloads_caps support for mpls_over_gre/mpls_over_udp. Newer devices and firmware will have configurations with the flexparser but without mpls support. Testing for the flex parser capability was a mistake, the tunnel_stateless capability was intended for detecting mpls and was introduced at the same time as the flex parser capability. Otherwise userspace will be incorrectly informed that a future device supports MPLS when it does not. Link: https://lore.kernel.org/r/20200305123841.196086-1-leon@kernel.org Cc: <stable@vger.kernel.org> # 4.17 Fixes: `e818e255a5` ("IB/mlx5: Expose MPLS related tunneling offloads") Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-10 14:41:35 -03:00
Erez Shitrit	0897f301bc	RDMA/mlx5: Remove duplicate definitions of SW_ICM macros Those macros are already defined in include/linux/mlx5/driver.h, so delete their duplicate variants. Link: https://lore.kernel.org/r/20200310075706.238592-1-leon@kernel.org Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-10 14:39:09 -03:00
Mark Zhang	ec16b6bbda	RDMA/mlx5: Fix the number of hwcounters of a dynamic counter When we read the global counter and there's any dynamic counter allocated, the value of a hwcounter is the sum of the default counter and all dynamic counters. So the number of hwcounters of a dynamically allocated counter must be same as of the default counter, otherwise there will be read violations. This fixes the KASAN slab-out-of-bounds bug: BUG: KASAN: slab-out-of-bounds in rdma_counter_get_hwstat_value+0x36d/0x390 [ib_core] Read of size 8 at addr ffff8884192a5778 by task rdma/10138 CPU: 7 PID: 10138 Comm: rdma Not tainted 5.5.0-for-upstream-dbg-2020-02-06_18-30-19-27 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0xb7/0x10b print_address_description.constprop.4+0x1e2/0x400 ? rdma_counter_get_hwstat_value+0x36d/0x390 [ib_core] __kasan_report+0x15c/0x1e0 ? mlx5_ib_query_q_counters+0x13f/0x270 [mlx5_ib] ? rdma_counter_get_hwstat_value+0x36d/0x390 [ib_core] kasan_report+0xe/0x20 rdma_counter_get_hwstat_value+0x36d/0x390 [ib_core] ? rdma_counter_query_stats+0xd0/0xd0 [ib_core] ? memcpy+0x34/0x50 ? nla_put+0xe2/0x170 nldev_stat_get_doit+0x9c7/0x14f0 [ib_core] ... do_syscall_64+0x95/0x490 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fcc457fe65a Code: bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 fa f1 2b 00 45 89 c9 4c 63 d1 48 63 ff 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 f3 c3 0f 1f 40 00 41 55 41 54 4d 89 c5 55 RSP: 002b:00007ffc0586f868 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcc457fe65a RDX: 0000000000000020 RSI: 00000000013db920 RDI: 0000000000000003 RBP: 00007ffc0586fa90 R08: 00007fcc45ac10e0 R09: 000000000000000c R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004089c0 R13: 0000000000000000 R14: 00007ffc0586fab0 R15: 00000000013dc9a0 Allocated by task 9700: save_stack+0x19/0x80 __kasan_kmalloc.constprop.7+0xa0/0xd0 mlx5_ib_counter_alloc_stats+0xd1/0x1d0 [mlx5_ib] rdma_counter_alloc+0x16d/0x3f0 [ib_core] rdma_counter_bind_qpn_alloc+0x216/0x4e0 [ib_core] nldev_stat_set_doit+0x8c2/0xb10 [ib_core] rdma_nl_rcv_msg+0x3d2/0x730 [ib_core] rdma_nl_rcv+0x2a8/0x400 [ib_core] netlink_unicast+0x448/0x620 netlink_sendmsg+0x731/0xd10 sock_sendmsg+0xb1/0xf0 __sys_sendto+0x25d/0x2c0 __x64_sys_sendto+0xdd/0x1b0 do_syscall_64+0x95/0x490 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: `18d422ce8c` ("IB/mlx5: Add counter_alloc_stats() and counter_update_stats() support") Link: https://lore.kernel.org/r/20200305124052.196688-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-10 14:36:47 -03:00
Christophe JAILLET	24a5b0ce71	RDMA/bnxt_re: Remove a redundant 'memset' 'wqe' is already zeroed at the top of the 'while' loop, just a few lines below, and is not used outside of the loop. So there is no need to zero it again, or for the variable to be declared outside the loop. Link: https://lore.kernel.org/r/20200308065442.5415-1-christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-10 14:33:21 -03:00
Jason Gunthorpe	6f00a54c2c	Linux 5.6-rc5 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl5lkYceHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGpHQH/RJrzcaZHo4lw88m Jf7vBZ9DYUlRgqE0pxTHWmodNObKRqpwOUGflUcWbb/7GD2LQUfeqhSECVQyTID9 N9y7FcPvx321Qhc3EkZ24DBYk0+DQ0K2FVUrSa/PxO0n7czxxXWaLRDmlSULEd3R D4pVs3zEWOBXJHUAvUQ5R+lKfkeWKNeeepeh+rezuhpdWFBRNz4Jjr5QUJ8od5xI sIwobYmESJqTRVBHqW8g2T2/yIsFJ78GCXs8DZLe1wxh40UbxdYDTA0NDDTHKzK6 lxzBgcmKzuge+1OVmzxLouNWMnPcjFlVgXWVerpSy3/SIFFkzzUWeMbqm6hKuhOn wAlcIgI= =VQUc -----END PGP SIGNATURE----- Merge tag 'v5.6-rc5' into rdma.git for-next Required due to dependencies in following patches. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-10 12:49:09 -03:00
Jason Gunthorpe	3e3cf2e82c	Merge branch 'mlx5_packet_pacing' into rdma.git for-next Yishai Hadas Says: ==================== Expose raw packet pacing APIs to be used by DEVX based applications. The existing code was refactored to have a single flow with the new raw APIs. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Due to dependencies * branch 'mlx5_packet_pacing': IB/mlx5: Introduce UAPIs to manage packet pacing net/mlx5: Expose raw packet pacing APIs	2020-03-10 11:54:17 -03:00
Yishai Hadas	30f2fe40c7	IB/mlx5: Introduce UAPIs to manage packet pacing Introduce packet pacing uobject and its alloc and destroy methods. This uobject holds mlx5 packet pacing context according to the device specification and enables managing packet pacing device entries that are needed by DEVX applications. Link: https://lore.kernel.org/r/20200219190518.200912-3-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-10 11:53:52 -03:00
Ingo Molnar	6120681bdf	Merge branch 'efi/urgent' into efi/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-03-08 09:57:58 +01:00
Colin Ian King	0aeb3622ea	RDMA/hns: fix spelling mistake "attatch" -> "attach" There is a spelling mistake in an error message. Fix it. Link: https://lore.kernel.org/r/20200304081045.81164-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-04 14:36:13 -04:00
Parav Pandit	79db784e79	IB/mlx5: Fix missing congestion control debugfs on rep rdma device Cited commit missed to include low level congestion control related debugfs stage initialization. This resulted in missing debugfs entries for cc_params of a RDMA device. Add them back. Fixes: `b5ca15ad7e` ("IB/mlx5: Add proper representors support") Link: https://lore.kernel.org/r/20200227125407.99803-1-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-04 14:20:14 -04:00
Parav Pandit	9e3aaf6883	IB/mlx5: Add np_min_time_between_cnps and rp_max_rate debug params Add two debugfs parameters described below. np_min_time_between_cnps - Minimum time between sending CNPs from the port. Unit = microseconds. Default = 0 (no min wait time; generated based on incoming ECN marked packets). rp_max_rate - Maximum rate at which reaction point node can transmit. Once this limit is reached, RP is no longer rate limited. Unit = Mbits/sec Default = 0 (full speed) Link: https://lore.kernel.org/r/20200227125246.99472-1-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-04 14:20:03 -04:00
Artemy Kovalyov	de5ed007a0	IB/mlx5: Fix implicit ODP race Following race may occur because of the call_srcu and the placement of the synchronize_srcu vs the xa_erase. CPU0 CPU1 mlx5_ib_free_implicit_mr: destroy_unused_implicit_child_mr: xa_erase(odp_mkeys) synchronize_srcu() xa_lock(implicit_children) if (still in xarray) atomic_inc() call_srcu() xa_unlock(implicit_children) xa_erase(implicit_children): xa_lock(implicit_children) __xa_erase() xa_unlock(implicit_children) flush_workqueue() [..] free_implicit_child_mr_rcu: (via call_srcu) queue_work() WARN_ON(atomic_read()) [..] free_implicit_child_mr_work: (via wq) free_implicit_child_mr() mlx5_mr_cache_invalidate() mlx5_ib_update_xlt() <-- UMR QP fail atomic_dec() The wait_event() solves the race because it blocks until free_implicit_child_mr_work() completes. Fixes: `5256edcb98` ("RDMA/mlx5: Rework implicit ODP destroy") Link: https://lore.kernel.org/r/20200227113918.94432-1-leon@kernel.org Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-04 13:25:00 -04:00
Alexander Lobakin	91b74bf531	IB/mlx5: Optimize u64 division on 32-bit arches Commit `f164be8c03` ("IB/mlx5: Extend caps stage to handle VAR capabilities") introduced a straight "/" division of the u64 variable "bar_size". This was fixed with commit `685eff5131` ("IB/mlx5: Use div64_u64 for num_var_hw_entries calculation"). However, div64_u64() is redundant here as mlx5_var_table::stride_size is of type u32. Make the actual code way more optimized on 32-bit kernels using div_u64() and fix 80 chars break-through by the way. Fixes: `685eff5131` ("IB/mlx5: Use div64_u64 for num_var_hw_entries calculation") Link: https://lore.kernel.org/r/20200217073629.8051-1-alobakin@dlink.ru Signed-off-by: Alexander Lobakin <alobakin@dlink.ru> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-04 13:11:49 -04:00
Jason Gunthorpe	c13cac2a21	Linux 5.6-rc4 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl5cOX8eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGw0AH/0nex1uEpzUTm+Gw D8QPFr3y61sYLu7sIMVt39+Zl6OSxvOOX14QIM/mrNrzjjRI8EXGvYgES5gSO4on 6NLS6/64c1oQThDHCsxusKoSWLZ9KqP2vRPt7tZjn7DZMzEsuLhlINKBeupcqALX FnOBr768P+if/j0WcDR2pBaMg3ch+XC5sfYav7kapjgWUqCx9BvrHKLXXdlEGUC0 7Ku7PH+nF7CIHiTay+i89odvOd8aLGsa/SUf5XGauKkH65VgQkmksgPeZUPqTnyC MEsyLJLfn4AP3ySwqzfSLac8jqZG8FGBt4DgM2MQBHibctzfeMIznfcfh/A8+Edx jqLKLAs= =4075 -----END PGP SIGNATURE----- Merge tag 'v5.6-rc4' into rdma.git for-next Required due to dependencies in following patches. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-04 13:11:06 -04:00
Kamal Heib	bb8865f435	RDMA/providers: Fix return value when QP type isn't supported The proper return code is "-EOPNOTSUPP" when the requested QP type is not supported by the provider. Link: https://lore.kernel.org/r/20200130082049.463-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-04 12:13:42 -04:00
Michael Guralnik	5e29d1443c	RDMA/mlx5: Prevent UMR usage with RO only when we have RO caps Relaxed ordering is not supported in UMR so we are disabling UMR usage when user passes relaxed ordering access flag. Enable using UMR when user requested relaxed ordering but there are no relaxed ordering capabilities. This will prevent user from unnecessarily registering a new mkey. Fixes: `d6de0bb185` ("RDMA/mlx5: Set relaxed ordering when requested") Link: https://lore.kernel.org/r/20200227113834.94233-1-leon@kernel.org Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-02 11:10:38 -04:00
YueHaibing	75d0366508	RDMA/bnxt_re: Remove set but not used variables 'pg' and 'idx' Fixes gcc '-Wunused-but-set-variable' warning: drivers/infiniband/hw/bnxt_re/qplib_rcfw.c: In function '__send_message': drivers/infiniband/hw/bnxt_re/qplib_rcfw.c:101:10: warning: variable 'idx' set but not used [-Wunused-but-set-variable] drivers/infiniband/hw/bnxt_re/qplib_rcfw.c:101:6: warning: variable 'pg' set but not used [-Wunused-but-set-variable] commit `cee0c7bba4` ("RDMA/bnxt_re: Refactor command queue management code") involved this, but not used. Link: https://lore.kernel.org/r/20200227064900.92255-1-yuehaibing@huawei.com Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-02 11:10:38 -04:00
YueHaibing	a0b404a98e	RDMA/bnxt_re: Remove set but not used variable 'dev_attr' Fixes gcc '-Wunused-but-set-variable' warning: drivers/infiniband/hw/bnxt_re/ib_verbs.c: In function 'bnxt_re_create_gsi_qp': drivers/infiniband/hw/bnxt_re/ib_verbs.c:1283:30: warning: variable 'dev_attr' set but not used [-Wunused-but-set-variable] commit `8dae419f9e` ("RDMA/bnxt_re: Refactor queue pair creation code") involved this, but not used, so remove it. Link: https://lore.kernel.org/r/20200227064542.91205-1-yuehaibing@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-02 11:10:38 -04:00
YueHaibing	6be2067d1e	RDMA/bnxt_re: Remove set but not used variable 'pg_size' Fixes gcc '-Wunused-but-set-variable' warning: drivers/infiniband/hw/bnxt_re/qplib_res.c: In function '__alloc_pbl': drivers/infiniband/hw/bnxt_re/qplib_res.c:109:13: warning: variable 'pg_size' set but not used [-Wunused-but-set-variable] commit `0c4dcd6028` ("RDMA/bnxt_re: Refactor hardware queue memory allocation") involved this, but not used, so remove it. Link: https://lore.kernel.org/r/20200227064209.87893-1-yuehaibing@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-02 11:10:38 -04:00
Selvin Xavier	66832705c4	RDMA/bnxt_re: Use driver_unregister and unregistration API Using the new unregister APIs provided by the core. Provide the dealloc_driver hook for the core to callback at the time of device un-registration. bnxt_re VF resources are created by the corresponding PF driver. During ib_unregister_driver, PF might get removed before VF and this could cause failure when VFs are removed. Driver is explicitly queuing the removal of VF devices before calling ib_unregister_driver. Link: https://lore.kernel.org/r/1582731932-26574-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-02 11:10:38 -04:00
Selvin Xavier	c2b777a959	RDMA/bnxt_re: Refactor device add/remove functionalities - bnxt_re_ib_reg() handles two main functionalities - initializing the device and registering with the IB stack. Split it into 2 functions i.e. bnxt_re_dev_init() and bnxt_re_ib_init() to account for the same thereby improve modularity. Do the same for bnxt_re_ib_unreg()i.e. split into two functions - bnxt_re_dev_uninit() and bnxt_re_ib_uninit(). - Simplify the code by combining the different steps to add and remove the device into two functions. - Report correct netdev link state during device register Link: https://lore.kernel.org/r/1582731932-26574-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-02 11:10:37 -04:00
Dennis Dalessandro	817a68a658	IB/hfi1, qib: Ensure RCU is locked when accessing list The packet handling function, specifically the iteration of the qp list for mad packet processing misses locking RCU before running through the list. Not only is this incorrect, but the list_for_each_entry_rcu() call can not be called with a conditional check for lock dependency. Remedy this by invoking the rcu lock and unlock around the critical section. This brings MAD packet processing in line with what is done for non-MAD packets. Fixes: `7724105686` ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20200225195445.140896.41873.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-03-02 11:10:21 -04:00
Gal Pressman	ff6629f88c	RDMA/efa: Do not delay freeing of DMA pages When destroying a DMA mmapped object, there is no need to artificially delay the freeing of the pages to the mmap entry removal. Since the vma keeps a reference count on these pages, free_pages_exact can be called on the destroy verb as it won't really free the pages until the reference count is cleared (in case the user hasn't called munmap yet). Remove the special handling of DMA pages and call free_pages_exact on destroy_qp/cq. The mmap entry removal is moved to the beginning of the destroy flows, so the driver can safely free the pages. Link: https://lore.kernel.org/r/20200225114010.21790-4-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 12:12:12 -04:00
Gal Pressman	56a7a721dd	RDMA/efa: Properly document the interrupt mask register The fact that the LSB in the register is the enable bit should not be an implicit assumption between the driver and the device, properly document that in the register definition. Link: https://lore.kernel.org/r/20200225114010.21790-3-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 12:12:12 -04:00
Gal Pressman	88d033077b	RDMA/efa: Unified getters/setters for device structs bitmask access Use unified macros for device structs access instead of open coding the shifts and masks over and over again. Link: https://lore.kernel.org/r/20200225114010.21790-2-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 12:12:04 -04:00
Xi Wang	cfec045b82	RDMA/hns: Optimize qp doorbell allocation flow Encapsulate the kernel qp doorbell allocation related code into 2 functions: alloc_qp_db() and free_qp_db(). Link: https://lore.kernel.org/r/1582526258-13825-8-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 11:35:11 -04:00
Xi Wang	b37c413997	RDMA/hns: Optimize kernel qp wrid allocation flow Encapsulate the kernel qp wrid allocation related code into 2 functions: alloc_kernel_wrid() and free_kernel_wrid(). Link: https://lore.kernel.org/r/1582526258-13825-7-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 11:35:11 -04:00
Xi Wang	ae85bf92ef	RDMA/hns: Optimize qp param setup flow Encapsulate the qp param setup related code into set_qp_param(). Link: https://lore.kernel.org/r/1582526258-13825-6-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 11:35:11 -04:00
Xi Wang	24c22112b9	RDMA/hns: Optimize qp buffer allocation flow Encapsulate qp buffer allocation related code into 3 functions: alloc_qp_buf(), map_wqe_buf() and free_qp_buf(). Link: https://lore.kernel.org/r/1582526258-13825-5-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 11:35:11 -04:00
Xi Wang	df83a66e1b	RDMA/hns: Optimize qp number assign flow Encapsulate the code associated with the qp number assignment into alloc_qpn() and free_qpn(). Link: https://lore.kernel.org/r/1582526258-13825-4-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 11:35:10 -04:00
Xi Wang	b71961d1da	RDMA/hns: Optimize qp context create and destroy flow Rename the qp context related functions and adjusts the code location to distinguish between the qp context and the entire qp. Link: https://lore.kernel.org/r/1582526258-13825-3-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 11:35:10 -04:00
Xi Wang	e365b26c6b	RDMA/hns: Optimize qp destroy flow Wrap the duplicate code in hip08 and hip06 qp destruction process as hns_roce_qp_destroy() to simply the qp destroy flow. Link: https://lore.kernel.org/r/1582526258-13825-2-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 11:35:10 -04:00
Yixian Liu	75c994e694	RDMA/hns: Stop doorbell update while qp state error There are two paths to update qp producer index into hardware now, one path is doorbell in post verbs (send and recv), the another is mailbox in modify qp verb which is called by flush process. This will lead the hardware to be broken to correctly generate flush cqe. With stopping doorbell update and holding qp spinlock in modify qp during flush process, the problem can be solved. Fixes: `0425e3e6e0` ("RDMA/hns: Support flush cqe for hip08 in kernel space") Link: https://lore.kernel.org/r/1582367158-27030-3-git-send-email-liuyixian@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 11:28:31 -04:00
Yixian Liu	0fc99566f6	RDMA/hns: Use flush framework for the case in aeq As now we already have flush framework, using it instead of current flush process for qp error in asynchronized interrupt (aeq). Link: https://lore.kernel.org/r/1582367158-27030-2-git-send-email-liuyixian@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 11:28:31 -04:00
Lang Cheng	dfaf2854b0	RDMA/hns: Treat revision HIP08_A as a special case Set revisions that equal to or higher than HIP08_B as default to maintain backward compatibility. Link: https://lore.kernel.org/r/1582363039-10714-1-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-28 11:24:08 -04:00
Jason Gunthorpe	65a1662015	RDMA/bnxt_re: Using vmalloc requires including vmalloc.h Add it Fixes: `0c4dcd6028` ("RDMA/bnxt_re: Refactor hardware queue memory allocation") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-26 13:24:41 -04:00
Ingo Molnar	e9765680a3	EFI updates for v5.7: This time, the set of changes for the EFI subsystem is much larger than usual. The main reasons are: - Get things cleaned up before EFI support for RISC-V arrives, which will increase the size of the validation matrix, and therefore the threshold to making drastic changes, - After years of defunct maintainership, the GRUB project has finally started to consider changes from the distros regarding UEFI boot, some of which are highly specific to the way x86 does UEFI secure boot and measured boot, based on knowledge of both shim internals and the layout of bootparams and the x86 setup header. Having this maintenance burden on other architectures (which don't need shim in the first place) is hard to justify, so instead, we are introducing a generic Linux/UEFI boot protocol. Summary of changes: - Boot time GDT handling changes (Arvind) - Simplify handling of EFI properties table on arm64 - Generic EFI stub cleanups, to improve command line handling, file I/O, memory allocation, etc. - Introduce a generic initrd loading method based on calling back into the firmware, instead of relying on the x86 EFI handover protocol or device tree. - Introduce a mixed mode boot method that does not rely on the x86 EFI handover protocol either, and could potentially be adopted by other architectures (if another one ever surfaces where one execution mode is a superset of another) - Clean up the contents of struct efi, and move out everything that doesn't need to be stored there. - Incorporate support for UEFI spec v2.8A changes that permit firmware implementations to return EFI_UNSUPPORTED from UEFI runtime services at OS runtime, and expose a mask of which ones are supported or unsupported via a configuration table. - Various documentation updates and minor code cleanups (Heinrich) - Partial fix for the lack of by-VA cache maintenance in the decompressor on 32-bit ARM. Note that these patches were deliberately put at the beginning so they can be used as a stable branch that will be shared with a PR containing the complete fix, which I will send to the ARM tree. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEnNKg2mrY9zMBdeK7wjcgfpV0+n0FAl5S7WYACgkQwjcgfpV0 +n1jmQgAmwV3V8pbgB4mi4P2Mv8w5Zj5feUe6uXnTR2AFv5nygLcTzdxN+TU/6lc OmZv2zzdsAscYlhuUdI/4t4cXIjHAZI39+UBoNRuMqKbkbvXCFscZANLxvJjHjZv FFbgUk0DXkF0BCEDuSLNavidAv4b3gZsOmHbPfwuB8xdP05LbvbS2mf+2tWVAi2z ULfua/0o9yiwl19jSS6iQEPCvvLBeBzTLW7x5Rcm/S0JnotzB59yMaeqD7jO8JYP 5PvI4WM/l5UfVHnZp2k1R76AOjReALw8dQgqAsT79Q7+EH3sNNuIjU6omdy+DFf4 tnpwYfeLOaZ1SztNNfU87Hsgnn2CGw== =pDE3 -----END PGP SIGNATURE----- Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/core Pull EFI updates for v5.7 from Ard Biesheuvel: This time, the set of changes for the EFI subsystem is much larger than usual. The main reasons are: - Get things cleaned up before EFI support for RISC-V arrives, which will increase the size of the validation matrix, and therefore the threshold to making drastic changes, - After years of defunct maintainership, the GRUB project has finally started to consider changes from the distros regarding UEFI boot, some of which are highly specific to the way x86 does UEFI secure boot and measured boot, based on knowledge of both shim internals and the layout of bootparams and the x86 setup header. Having this maintenance burden on other architectures (which don't need shim in the first place) is hard to justify, so instead, we are introducing a generic Linux/UEFI boot protocol. Summary of changes: - Boot time GDT handling changes (Arvind) - Simplify handling of EFI properties table on arm64 - Generic EFI stub cleanups, to improve command line handling, file I/O, memory allocation, etc. - Introduce a generic initrd loading method based on calling back into the firmware, instead of relying on the x86 EFI handover protocol or device tree. - Introduce a mixed mode boot method that does not rely on the x86 EFI handover protocol either, and could potentially be adopted by other architectures (if another one ever surfaces where one execution mode is a superset of another) - Clean up the contents of struct efi, and move out everything that doesn't need to be stored there. - Incorporate support for UEFI spec v2.8A changes that permit firmware implementations to return EFI_UNSUPPORTED from UEFI runtime services at OS runtime, and expose a mask of which ones are supported or unsupported via a configuration table. - Various documentation updates and minor code cleanups (Heinrich) - Partial fix for the lack of by-VA cache maintenance in the decompressor on 32-bit ARM. Note that these patches were deliberately put at the beginning so they can be used as a stable branch that will be shared with a PR containing the complete fix, which I will send to the ARM tree. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-02-26 15:21:22 +01:00
Ard Biesheuvel	d79b348c35	infiniband: hfi1: Use EFI GetVariable only when available Replace the EFI runtime services check with one that tells us whether EFI GetVariable() is implemented by the firmware. Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: linux-rdma@vger.kernel.org Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2020-02-23 21:59:42 +01:00
Devesh Sharma	6ccad8483b	RDMA/bnxt_re: use ibdev based message printing functions Replacing the dev_err/dbg/warn with ibdev_err/dbg/warn. In the IB device provider driver these functions are recommended to use. Currently qplib layer function calls has not been replaced due to unavailability of ib_device pointer at that layer. Link: https://lore.kernel.org/r/1581786665-23705-9-git-send-email-devesh.sharma@broadcom.com Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-21 20:21:44 -04:00
Devesh Sharma	6f53196bc5	RDMA/bnxt_re: Refactor doorbell management functions Moving all the fast path doorbell functions at one place under qplib_res.h. To pass doorbell record information a new structure bnxt_qplib_db_info has been introduced. Every roce object holds an instance of this structure and doorbell information is initialized during resource creation. When DB is rung only the current queue index is read from hardware ring and rest of the data is taken from pre-initialized dbinfo structure. Link: https://lore.kernel.org/r/1581786665-23705-8-git-send-email-devesh.sharma@broadcom.com Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-21 20:21:44 -04:00
Devesh Sharma	9555352bac	RDMA/bnxt_re: Refactor notification queue management code Cleaning up the notification queue data structures and management code. The CQ and SRQ event handlers have been type defined instead of in-place declaration. NQ doorbell register descriptor has been added in base NQ structure. The nq->vector has been renamed to nq->msix_vec. Link: https://lore.kernel.org/r/1581786665-23705-7-git-send-email-devesh.sharma@broadcom.com Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-21 20:21:43 -04:00
Devesh Sharma	cee0c7bba4	RDMA/bnxt_re: Refactor command queue management code Refactoring the command queue (rcfw) management code. A new data-structure is introduced to describe the bar register. each object which deals with mmio space should have a descriptor structure. This structure specifically hold DB register information. Thus, slow path creq structure now hold a bar register descriptor. Further cleanup the rcfw structure to introduce the command queue context and command response event queue context structures. Rest of the rcfw related code has been touched to incorporate these three structures. Link: https://lore.kernel.org/r/1581786665-23705-6-git-send-email-devesh.sharma@broadcom.com Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-21 20:21:43 -04:00
Devesh Sharma	b08fe048a6	RDMA/bnxt_re: Refactor net ring allocation function Introducing a new attribute structure to reduce the long list of arguments passed in bnxt_re_net_ring_alloc() function. The caller of bnxt_re_net_ring_alloc should fill in the list of attributes in bnxt_re_ring_attr structure and then pass the pointer to the function. Link: https://lore.kernel.org/r/1581786665-23705-5-git-send-email-devesh.sharma@broadcom.com Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-21 20:21:43 -04:00
Devesh Sharma	0c4dcd6028	RDMA/bnxt_re: Refactor hardware queue memory allocation At top level there are three major data structure addition. viz bnxt_qplib_hwq_attr, bnxt_qplib_sg_info and bnxt_qplib_tqm_ctx Intorduction of first data structure reduces the arguments list to bnxt_re_alloc_init_hwq() function. There are changes all over the driver code to incorporate this new structure. The caller needs to fill the attribute data structure and pass to this function. The second data structure is to pass memory region description viz. sghead, page_size and page_shift. There are changes all over the driver code to initialize bnxt_re_sg_info data structure. The new data structure helps to reduce the argument list of __alloc_pbl() function call. Till now the TQM rings related members were not collected under any specific data-structure making it hard to manage. The third data sctructure bnxt_qplib_tqm_ctx is added to refactor the TQM queue allocation and initialization. Link: https://lore.kernel.org/r/1581786665-23705-4-git-send-email-devesh.sharma@broadcom.com Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-21 20:21:43 -04:00
Devesh Sharma	0cfb329db9	RDMA/bnxt_re: Replace chip context structure with pointer The chip_ctx member in bnxt_re_dev structure is now a pointer to struct bnxt_qplib_chip_ctx. Since the member type has changed there are changes in rest of the code wherever dev->chip_ctx is used. Link: https://lore.kernel.org/r/1581786665-23705-3-git-send-email-devesh.sharma@broadcom.com Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-21 20:21:43 -04:00
Devesh Sharma	8dae419f9e	RDMA/bnxt_re: Refactor queue pair creation code Restructuring the bnxt_re_create_qp function. Listing below the major changes: - Monolithic central part of create_qp where attributes are initialized is now enclosed in one function and this new function has few more sub-functions. - Top level qp limit checking code moved to a function. - GSI QP creation and GSI Shadow qp creation code is handled in a sub function. Link: https://lore.kernel.org/r/1581786665-23705-2-git-send-email-devesh.sharma@broadcom.com Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-21 20:21:42 -04:00
Gustavo A. R. Silva	5b361328ca	RDMA: Replace zero-length array with flexible-array member The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Link: https://lore.kernel.org/r/20200213010425.GA13068@embeddedor.com Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> # added a few more	2020-02-20 13:33:51 -04:00
Lang Cheng	52c5e9e749	RDMA/hns: Initialize all fields of doorbells to zero Prevent uninitialized fields when new fields are added, and make code look simpler. Link: https://lore.kernel.org/r/1582162471-50361-1-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-20 13:22:02 -04:00
Colin Ian King	8d8d2b76ac	RDMA/hns: fix spelling mistake: "attatch" -> "attach" There is a spelling mistake in a dev_err error message. Fix it. Link: https://lore.kernel.org/r/20200214003338.6573-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-20 13:11:23 -04:00
Paul Blakey	0f0d3827c0	net/mlx5: E-Switch, Move source port on reg_c0 to the upper 16 bits Multi chain support requires the miss path to continue the processing from the last chain id, and for that we need to save the chain miss tag (a mapping for 32bit chain id) on reg_c0 which will come in a next patch. Currently reg_c0 is exclusively used to store the source port metadata, giving it 32bit, it is created from 16bits of vcha_id, and 16bits of vport number. We will move this source port metadata to upper 16bits, and leave the lower bits for the chain miss tag. We compress the reg_c0 source port metadata to 16bits by taking 8 bits from vhca_id, and 8bits from the vport number. Since we compress the vport number to 8bits statically, and leave two top ids for special PF/ECPF numbers, we will only support a max of 254 vports with this strategy. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-02-19 17:49:48 -08:00
Selvin Xavier	0a01623b74	RDMA/bnxt_re: Use rdma_read_gid_hw_context to retrieve HW gid index bnxt_re HW maintains a GID table with only a single entry for the two duplicate GID entries (v1 and v2). Driver needs to map stack gid index to the HW table gid index. Use the new API rdma_read_gid_hw_context () to retrieve the HW GID context to get the HW table index. Link: https://lore.kernel.org/r/1582107594-5180-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-19 16:54:25 -04:00
Jason Gunthorpe	685eff5131	IB/mlx5: Use div64_u64 for num_var_hw_entries calculation On i386: ERROR: "__udivdi3" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined! ERROR: "__divdi3" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined! Fixes: `f164be8c03` ("IB/mlx5: Extend caps stage to handle VAR capabilities") Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Reported-by: Alexander Lobakin <alobakin@dlink.ru> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-14 15:21:52 -04:00
Yixian Liu	b53742865e	RDMA/hns: Delayed flush cqe process with workqueue HiP08 RoCE hardware lacks ability(a known hardware problem) to flush outstanding WQEs if QP state gets into errored mode for some reason. To overcome this hardware problem and as a workaround, when QP is detected to be in errored state during various legs like post send, post receive etc[1], flush needs to be performed from the driver. The earlier patch[1] sent to solve the hardware limitation explained in the cover-letter had a bug in the software flushing leg. It acquired mutex while modifying QP state to errored state and while conveying it to the hardware using the mailbox. This caused leg to sleep while holding spin-lock and caused crash. Suggested Solution: we have proposed to defer the flushing of the QP in the Errored state using the workqueue to get around with the limitation of our hardware. This patch specifically adds the calls to the flush handler from where parts of the code like post_send/post_recv etc. when the QP state gets into the errored mode. [1] https://patchwork.kernel.org/patch/10534271/ Link: https://lore.kernel.org/r/1580983005-13899-3-git-send-email-liuyixian@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Reviewed-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-13 16:44:21 -04:00
Yixian Liu	ffd541d457	RDMA/hns: Add the workqueue framework for flush cqe handler HiP08 RoCE hardware lacks ability(a known hardware problem) to flush outstanding WQEs if QP state gets into errored mode for some reason. To overcome this hardware problem and as a workaround, when QP is detected to be in errored state during various legs like post send, post receive etc [1], flush needs to be performed from the driver. The earlier patch[1] sent to solve the hardware limitation explained in the cover-letter had a bug in the software flushing leg. It acquired mutex while modifying QP state to errored state and while conveying it to the hardware using the mailbox. This caused leg to sleep while holding spin-lock and caused crash. Suggested Solution: we have proposed to defer the flushing of the QP in the Errored state using the workqueue to get around with the limitation of our hardware. This patch adds the framework of the workqueue and the flush handler function. [1] https://patchwork.kernel.org/patch/10534271/ Link: https://lore.kernel.org/r/1580983005-13899-2-git-send-email-liuyixian@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Reviewed-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-13 16:44:20 -04:00
Leon Romanovsky	9b6d3bbc13	RDMA/mlx5: Prevent overflow in mmap offset calculations The cmd and index variables declared as u16 and the result is supposed to be stored in u64. The C arithmetic rules doesn't promote "(index >> 8) << 16" to be u64 and leaves the end result to be u16. Fixes: `7be76bef32` ("IB/mlx5: Introduce VAR object and its alloc/destroy methods") Link: https://lore.kernel.org/r/20200212072635.682689-10-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-13 10:39:23 -04:00
Yishai Hadas	a8af8694a5	RDMA/mlx5: Fix async events cleanup flows As in the prior patch, the devx code is not fully cleaning up its event_lists before finishing driver_destroy allowing a later read to trigger user after free conditions. Re-arrange things so that the event_list is always empty after destroy and ensure it remains empty until the file is closed. Fixes: `f7c8416cce` ("RDMA/core: Simplify destruction of FD uobjects") Link: https://lore.kernel.org/r/20200212072635.682689-7-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-13 09:48:17 -04:00
Shiraz Saleem	9a4b24108d	i40iw: Do an RCU lookup in i40iw_add_ipv4_addr The in_dev_for_each_ifa_rtnl() iterator in i40iw_add_ipv4_addr requires that the rtnl lock be held. But the rtnl_trylock/unlock scheme in this function does not guarantee it. Replace the rtnl locking with an RCU lookup using in_dev_for_each_ifa_rcu() Fixes: `8e06af711b` ("i40iw: add main, hdr, status") Link: https://lore.kernel.org/r/20200204223840.2151-1-shiraz.saleem@intel.com Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-11 14:31:11 -04:00
Krishnamraju Eraparaju	d219face90	RDMA/iw_cxgb4: initiate CLOSE when entering TERM As per draft-hilland-iwarp-verbs-v1.0, sec 6.2.3, always initiate a CLOSE when entering into TERM state. In c4iw_modify_qp(), disconnect operation should only be performed when the modify_qp call is invoked from ib_core. And all other internal modify_qp calls(invoked within iw_cxgb4) that needs 'disconnect' should call c4iw_ep_disconnect() explicitly after modify_qp. Otherwise, deadlocks like below can occur: Call Trace: schedule+0x2f/0xa0 schedule_preempt_disabled+0xa/0x10 __mutex_lock.isra.5+0x2d0/0x4a0 c4iw_ep_disconnect+0x39/0x430 => tries to reacquire ep lock again c4iw_modify_qp+0x468/0x10d0 rx_data+0x218/0x570 => acquires ep lock process_work+0x5f/0x70 process_one_work+0x1a7/0x3b0 worker_thread+0x30/0x390 kthread+0x112/0x130 ret_from_fork+0x35/0x40 Fixes: `d2c33370ae` ("RDMA/iw_cxgb4: Always disconnect when QP is transitioning to TERMINATE state") Link: https://lore.kernel.org/r/20200204091230.7210-1-krishna2@chelsio.com Signed-off-by: Krishnamraju Eraparaju <krishna2@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-11 14:28:00 -04:00
Mark Zhang	10189e8e6f	IB/mlx5: Return failure when rts2rts_qp_counters_set_id is not supported When binding a QP with a counter and the QP state is not RESET, return failure if the rts2rts_qp_counters_set_id is not supported by the device. This is to prevent cases like manual bind for Connect-IB devices from returning success when the feature is not supported. Fixes: `d14133dd41` ("IB/mlx5: Support set qp counter") Link: https://lore.kernel.org/r/20200126171708.5167-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-11 14:23:01 -04:00
Xi Wang	d7e2d3432a	RDMA/hns: Optimize eqe buffer allocation flow The eqe has a private multi-hop addressing implementation, but there is already a set of interfaces in the hns driver that can achieve this. So, simplify the eqe buffer allocation process by using the mtr interface and remove large amount of repeated logic. Link: https://lore.kernel.org/r/20200126145835.11368-1-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-11 14:12:43 -04:00
Lang Cheng	b14c95bee8	RDMA/hns: Cleanups of magic numbers Some magic numbers are hard to understand, so replace them with macros or add some comments for them. Link: https://lore.kernel.org/r/20200126145504.9700-1-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-11 14:10:20 -04:00
Mike Marciniszyn	be8638344c	IB/hfi1: Close window for pq and request coliding Cleaning up a pq can result in the following warning and panic: WARNING: CPU: 52 PID: 77418 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0 list_del corruption, ffff88cb2c6ac068->next is LIST_POISON1 (dead000000000100) Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G OE ------------ 3.10.0-957.38.3.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 Call Trace: [<ffffffff90365ac0>] dump_stack+0x19/0x1b [<ffffffff8fc98b78>] __warn+0xd8/0x100 [<ffffffff8fc98bff>] warn_slowpath_fmt+0x5f/0x80 [<ffffffff8ff970c3>] __list_del_entry+0x63/0xd0 [<ffffffff8ff9713d>] list_del+0xd/0x30 [<ffffffff8fddda70>] kmem_cache_destroy+0x50/0x110 [<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1] [<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1] [<ffffffff8fe4519c>] __fput+0xec/0x260 [<ffffffff8fe453fe>] ____fput+0xe/0x10 [<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0 [<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0 [<ffffffff90379134>] int_signal+0x12/0x17 BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 PGD 2cdab19067 PUD 2f7bfdb067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.38.3.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 task: ffff88cc26db9040 ti: ffff88b5393a8000 task.ti: ffff88b5393a8000 RIP: 0010:[<ffffffff8fe1f93e>] [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 RSP: 0018:ffff88b5393abd60 EFLAGS: 00010287 RAX: 0000000000000000 RBX: ffff88cb2c6ac000 RCX: 0000000000000003 RDX: 0000000000000400 RSI: 0000000000000400 RDI: ffffffff9095b800 RBP: ffff88b5393abdb0 R08: ffffffff9095b808 R09: ffffffff8ff77c19 R10: ffff88b73ce1f160 R11: ffffddecddde9800 R12: ffff88cb2c6ac000 R13: 000000000000000c R14: ffff88cf3fdca780 R15: 0000000000000000 FS: 00002aaaaab52500(0000) GS:ffff88b73ce00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 0000002d27664000 CR4: 00000000007607e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: [<ffffffff8fe20d44>] __kmem_cache_shutdown+0x14/0x80 [<ffffffff8fddda78>] kmem_cache_destroy+0x58/0x110 [<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1] [<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1] [<ffffffff8fe4519c>] __fput+0xec/0x260 [<ffffffff8fe453fe>] ____fput+0xe/0x10 [<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0 [<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0 [<ffffffff90379134>] int_signal+0x12/0x17 Code: 00 00 ba 00 04 00 00 0f 4f c2 3d 00 04 00 00 89 45 bc 0f 84 e7 01 00 00 48 63 45 bc 49 8d 04 c4 48 89 45 b0 48 8b 80 c8 00 00 00 <48> 8b 78 10 48 89 45 c0 48 83 c0 10 48 89 45 d0 48 8b 17 48 39 RIP [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 RSP <ffff88b5393abd60> CR2: 0000000000000010 The panic is the result of slab entries being freed during the destruction of the pq slab. The code attempts to quiesce the pq, but looking for n_req == 0 doesn't account for new requests. Fix the issue by using SRCU to get a pq pointer and adjust the pq free logic to NULL the fd pq pointer prior to the quiesce. Fixes: `e87473bc1b` ("IB/hfi1: Only set fd pointer when base context is completely initialized") Link: https://lore.kernel.org/r/20200210131033.87408.81174.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-11 11:41:31 -04:00
Kaike Wan	a70ed0f2e6	IB/hfi1: Acquire lock to release TID entries when user file is closed Each user context is allocated a certain number of RcvArray (TID) entries and these entries are managed through TID groups. These groups are put into one of three lists in each user context: tid_group_list, tid_used_list, and tid_full_list, depending on the number of used TID entries within each group. When TID packets are expected, one or more TID groups will be allocated. After the packets are received, the TID groups will be freed. Since multiple user threads may access the TID groups simultaneously, a mutex exp_mutex is used to synchronize the access. However, when the user file is closed, it tries to release all TID groups without acquiring the mutex first, which risks a race condition with another thread that may be releasing its TID groups, leading to data corruption. This patch addresses the issue by acquiring the mutex first before releasing the TID groups when the file is closed. Fixes: `3abb33ac65` ("staging/hfi1: Add TID cache receive init and free funcs") Link: https://lore.kernel.org/r/20200210131026.87408.86853.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-11 11:41:31 -04:00
Kamal Heib	8a4f300b97	RDMA/hfi1: Fix memory leak in _dev_comp_vect_mappings_create Make sure to free the allocated cpumask_var_t's to avoid the following reported memory leak by kmemleak: $ cat /sys/kernel/debug/kmemleak unreferenced object 0xffff8897f812d6a8 (size 8): comm "kworker/1:1", pid 347, jiffies 4294751400 (age 101.703s) hex dump (first 8 bytes): 00 00 00 00 00 00 00 00 ........ backtrace: [<00000000bff49664>] alloc_cpumask_var_node+0x4c/0xb0 [<0000000075d3ca81>] hfi1_comp_vectors_set_up+0x20f/0x800 [hfi1] [<0000000098d420df>] hfi1_init_dd+0x3311/0x4960 [hfi1] [<0000000071be7e52>] init_one+0x25e/0xf10 [hfi1] [<000000005483d4c2>] local_pci_probe+0xd4/0x180 [<000000007c3cbc6e>] work_for_cpu_fn+0x51/0xa0 [<000000001d626905>] process_one_work+0x8f0/0x17b0 [<000000007e569e7e>] worker_thread+0x536/0xb50 [<00000000fd39a4a5>] kthread+0x30c/0x3d0 [<0000000056f2edb3>] ret_from_fork+0x3a/0x50 Fixes: `5d18ee67d4` ("IB/{hfi1, rdmavt, qib}: Implement CQ completion vector support") Link: https://lore.kernel.org/r/20200205110530.12129-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-02-11 11:35:45 -04:00
Linus Torvalds	8fdd4019bc	RDMA subsystem updates for 5.6 - Driver updates and cleanup for qedr, bnxt_re, hns, siw, mlx5, mlx4, rxe, i40iw - Larger series doing cleanup and rework for hns and hfi1. - Some general reworking of the CM code to make it a little more understandable - Unify the different code paths connected to the uverbs FD scheme - New UAPI ioctls conversions for get context and get async fd - Trace points for CQ and CM portions of the RDMA stack - mlx5 driver support for virtio-net formatted rings as RDMA raw ethernet QPs - verbs support for setting the PCI-E relaxed ordering bit on DMA traffic connected to a MR - A couple of bug fixes that came too late to make rc7 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl4zPwQACgkQOG33FX4g mxoURw//fuQmuJ7aTMH+0qrhaZUmzXOcI/WKvY0YMyYLvxolRcIO+uCL239wxezR 9iTHPO7HeYXUQ4W8Hi/fTyuQ9hzaPOP3wgOJfQhm4QT/XDpRW0H3Mb+hTLHTUAcA rgKc9suAn+5BbIDOz7hEfeOTssx1wYrLsaHDc11NZ42JuG6uvPR33lhXiKWG+5tH 2MpfeTU6BjL035dm3YZXCo+ouobpdMuvzJItYIsB2E5Nl0s91SMzsymIYiD0gb3t yUJ3wqPW3pchfAl8VEn+W5AHTUYYgGjmEblL8WdVq5JRrkQgQzj8QtCRT9NOPAT0 LivCvgBrm0kscaQS2TjtG56Ojbwz8z1QPE/4shf0pj/G2lZfacYDAeaUc/2VafxY y/KG+3dB1DxtYY3eXJUxbB7Vpk7kfr35p5b75NdMhd2t49oPgV7EKoZMLYGzfX4S PtyNyNSiwx8qsRTr4lznOMswmrDLfG4XiywWgYo6NGOWyKYlARWIYBAEQZ0DPTiE 9mqJ19gusdSdAgm8LGDInPmH6/AojGOVzYonJFWdlOtwCXGNXL4Gx02x4WYHykDG w+oy5NMJbU3b6+MWEagkuQNcrwqv02MT1mB/Lgv4GPm6rS0UXR7zUPDeccE50fSL X36k28UlftlPlaD7PeJdTOAhyBv5DxfpL5rbB2TfpUTpNxjayuU= =hepK -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "A very quiet cycle with few notable changes. Mostly the usual list of one or two patches to drivers changing something that isn't quite rc worthy. The subsystem seems to be seeing a larger number of rework and cleanup style patches right now, I feel that several vendors are prepping their drivers for new silicon. Summary: - Driver updates and cleanup for qedr, bnxt_re, hns, siw, mlx5, mlx4, rxe, i40iw - Larger series doing cleanup and rework for hns and hfi1. - Some general reworking of the CM code to make it a little more understandable - Unify the different code paths connected to the uverbs FD scheme - New UAPI ioctls conversions for get context and get async fd - Trace points for CQ and CM portions of the RDMA stack - mlx5 driver support for virtio-net formatted rings as RDMA raw ethernet QPs - verbs support for setting the PCI-E relaxed ordering bit on DMA traffic connected to a MR - A couple of bug fixes that came too late to make rc7" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (108 commits) RDMA/core: Make the entire API tree static RDMA/efa: Mask access flags with the correct optional range RDMA/cma: Fix unbalanced cm_id reference count during address resolve RDMA/umem: Fix ib_umem_find_best_pgsz() IB/mlx4: Fix leak in id_map_find_del IB/opa_vnic: Spelling correction of 'erorr' to 'error' IB/hfi1: Fix logical condition in msix_request_irq RDMA/cm: Remove CM message structs RDMA/cm: Use IBA functions for complex structure members RDMA/cm: Use IBA functions for simple structure members RDMA/cm: Use IBA functions for swapping get/set acessors RDMA/cm: Use IBA functions for simple get/set acessors RDMA/cm: Add SET/GET implementations to hide IBA wire format RDMA/cm: Add accessors for CM_REQ transport_type IB/mlx5: Return the administrative GUID if exists RDMA/core: Ensure that rdma_user_mmap_entry_remove() is a fence IB/mlx4: Fix memory leak in add_gid error flow IB/mlx5: Expose RoCE accelerator counters RDMA/mlx5: Set relaxed ordering when requested RDMA/core: Add the core support field to METHOD_GET_CONTEXT ...	2020-01-31 14:40:36 -08:00
John Hubbard	f1f6a7dd9b	mm, tree-wide: rename put_user_page() to unpin_user_page() In order to provide a clearer, more symmetric API for pinning and unpinning DMA pages. This way, pin_user_pages() calls match up with unpin_user_pages() calls, and the API is a lot closer to being self-explanatory. Link: http://lkml.kernel.org/r/20200107224558.2362728-23-jhubbard@nvidia.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-01-31 10:30:38 -08:00
John Hubbard	dfa0a4fff1	IB/{core,hw,umem}: set FOLL_PIN via pin_user_pages(), fix up ODP Convert infiniband to use the new pin_user_pages() calls. Also, revert earlier changes to Infiniband ODP that had it using put_user_page(). ODP is "Case 3" in Documentation/core-api/pin_user_pages.rst, which is to say, normal get_user_pages() and put_page() is the API to use there. The new pin_user_pages() calls replace corresponding get_user_pages() calls, and set the FOLL_PIN flag. The FOLL_PIN flag requires that the caller must return the pages via put_user_page*() calls, but infiniband was already doing that as part of an earlier commit. Link: http://lkml.kernel.org/r/20200107224558.2362728-14-jhubbard@nvidia.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-01-31 10:30:37 -08:00
Jason Gunthorpe	8889f6fa35	RDMA/core: Make the entire API tree static Compilation of mlx5 driver without CONFIG_INFINIBAND_USER_ACCESS generates the following error. on x86_64: ld: drivers/infiniband/hw/mlx5/main.o: in function `mlx5_ib_handler_MLX5_IB_METHOD_VAR_OBJ_ALLOC': main.c:(.text+0x186d): undefined reference to `ib_uverbs_get_ucontext_file' ld: drivers/infiniband/hw/mlx5/main.o:(.rodata+0x2480): undefined reference to `uverbs_idr_class' ld: drivers/infiniband/hw/mlx5/main.o:(.rodata+0x24d8): undefined reference to `uverbs_destroy_def_handler' This is happening because some parts of the UAPI description are not static. This is a hold over from earlier code that relied on struct pointers to refer to object types, now object types are referenced by number. Remove the unused globals and add statics to the remaining UAPI description elements. Remove the redundent #ifdefs around mlx5_ib_*defs and obsolete mlx5_ib_get_devx_tree(). The compiler now trims alot more unused code, including the above problematic definitions when !CONFIG_INFINIBAND_USER_ACCESS. Fixes: `7be76bef32` ("IB/mlx5: Introduce VAR object and its alloc/destroy methods") Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-01-30 16:28:52 -04:00
Gal Pressman	ba19e16651	RDMA/efa: Mask access flags with the correct optional range The uapi value IB_UVERBS_ACCESS_OPTIONAL_RANGE shouldn't be used inside the driver, use IB_ACCESS_OPTIONAL instead. Fixes: `86dd738cf2` ("RDMA/efa: Allow passing of optional access flags for MR registration") Link: https://lore.kernel.org/r/20200129071803.40117-1-galpress@amazon.com Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-01-29 16:41:05 -04:00
Linus Torvalds	bd2463ac7d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from David Miller: 1) Add WireGuard 2) Add HE and TWT support to ath11k driver, from John Crispin. 3) Add ESP in TCP encapsulation support, from Sabrina Dubroca. 4) Add variable window congestion control to TIPC, from Jon Maloy. 5) Add BCM84881 PHY driver, from Russell King. 6) Start adding netlink support for ethtool operations, from Michal Kubecek. 7) Add XDP drop and TX action support to ena driver, from Sameeh Jubran. 8) Add new ipv4 route notifications so that mlxsw driver does not have to handle identical routes itself. From Ido Schimmel. 9) Add BPF dynamic program extensions, from Alexei Starovoitov. 10) Support RX and TX timestamping in igc, from Vinicius Costa Gomes. 11) Add support for macsec HW offloading, from Antoine Tenart. 12) Add initial support for MPTCP protocol, from Christoph Paasch, Matthieu Baerts, Florian Westphal, Peter Krystad, and many others. 13) Add Octeontx2 PF support, from Sunil Goutham, Geetha sowjanya, Linu Cherian, and others. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1469 commits) net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROC udp: segment looped gso packets correctly netem: change mailing list qed: FW 8.42.2.0 debug features qed: rt init valid initialization changed qed: Debug feature: ilt and mdump qed: FW 8.42.2.0 Add fw overlay feature qed: FW 8.42.2.0 HSI changes qed: FW 8.42.2.0 iscsi/fcoe changes qed: Add abstraction for different hsi values per chip qed: FW 8.42.2.0 Additional ll2 type qed: Use dmae to write to widebus registers in fw_funcs qed: FW 8.42.2.0 Parser offsets modified qed: FW 8.42.2.0 Queue Manager changes qed: FW 8.42.2.0 Expose new registers and change windows qed: FW 8.42.2.0 Internal ram offsets modifications MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driver Documentation: net: octeontx2: Add RVU HW and drivers overview octeontx2-pf: ethtool RSS config support octeontx2-pf: Add basic ethtool support ...	2020-01-28 16:02:33 -08:00
Linus Torvalds	c0e809e244	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Kernel side changes: - Ftrace is one of the last W^X violators (after this only KLP is left). These patches move it over to the generic text_poke() interface and thereby get rid of this oddity. This requires a surprising amount of surgery, by Peter Zijlstra. - x86/AMD PMUs: add support for 'Large Increment per Cycle Events' to count certain types of events that have a special, quirky hw ABI (by Kim Phillips) - kprobes fixes by Masami Hiramatsu Lots of tooling updates as well, the following subcommands were updated: annotate/report/top, c2c, clang, record, report/top TUI, sched timehist, tests; plus updates were done to the gtk ui, libperf, headers and the parser" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits) perf/x86/amd: Add support for Large Increment per Cycle Events perf/x86/amd: Constrain Large Increment per Cycle events perf/x86/intel/rapl: Add Comet Lake support tracing: Initialize ret in syscall_enter_define_fields() perf header: Use last modification time for timestamp perf c2c: Fix return type for histogram sorting comparision functions perf beauty sockaddr: Fix augmented syscall format warning perf/ui/gtk: Fix gtk2 build perf ui gtk: Add missing zalloc object perf tools: Use %define api.pure full instead of %pure-parser libperf: Setup initial evlist::all_cpus value perf report: Fix no libunwind compiled warning break s390 issue perf tools: Support --prefix/--prefix-strip perf report: Clarify in help that --children is default tools build: Fix test-clang.cpp with Clang 8+ perf clang: Fix build with Clang 9 kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic tools lib: Fix builds when glibc contains strlcpy() perf report/top: Make 'e' visible in the help and make it toggle showing callchains perf report/top: Do not offer annotation for symbols without samples ...	2020-01-28 09:44:15 -08:00
Linus Torvalds	634cd4b6af	Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Ingo Molnar: "The main changes in this cycle were: - Cleanup of the GOP [graphics output] handling code in the EFI stub - Complete refactoring of the mixed mode handling in the x86 EFI stub - Overhaul of the x86 EFI boot/runtime code - Increase robustness for mixed mode code - Add the ability to disable DMA at the root port level in the EFI stub - Get rid of RWX mappings in the EFI memory map and page tables, where possible - Move the support code for the old EFI memory mapping style into its only user, the SGI UV1+ support code. - plus misc fixes, updates, smaller cleanups. ... and due to interactions with the RWX changes, another round of PAT cleanups make a guest appearance via the EFI tree - with no side effects intended" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits) efi/x86: Disable instrumentation in the EFI runtime handling code efi/libstub/x86: Fix EFI server boot failure efi/x86: Disallow efi=old_map in mixed mode x86/boot/compressed: Relax sed symbol type regex for LLVM ld.lld efi/x86: avoid KASAN false positives when accessing the 1: 1 mapping efi: Fix handling of multiple efi_fake_mem= entries efi: Fix efi_memmap_alloc() leaks efi: Add tracking for dynamically allocated memmaps efi: Add a flags parameter to efi_memory_map efi: Fix comment for efi_mem_type() wrt absent physical addresses efi/arm: Defer probe of PCIe backed efifb on DT systems efi/x86: Limit EFI old memory map to SGI UV machines efi/x86: Avoid RWX mappings for all of DRAM efi/x86: Don't map the entire kernel text RW for mixed mode x86/mm: Fix NX bit clearing issue in kernel_map_pages_in_pgd efi/libstub/x86: Fix unused-variable warning efi/libstub/x86: Use mandatory 16-byte stack alignment in mixed mode efi/libstub/x86: Use const attribute for efi_is_64bit() efi: Allow disabling PCI busmastering on bridges during boot efi/x86: Allow translating 64-bit arguments for mixed mode calls ...	2020-01-28 09:03:40 -08:00
Linus Torvalds	6a1000bd27	ioremap changes for 5.6 - remove ioremap_nocache given that is is equivalent to ioremap everywhere -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl4vKHwLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYMPGBAAuVNUZaZfWYHpiVP2oRcUQUguFiD3NTbknsyzV2oH J9P0GfeENSKwE9OOhZ7XIjnCZAJwQgTK/ppQY5yiQ/KAtYyyXjXEJ6jqqjiTDInr +3+I3t/LhkgrK7tMrb7ylTGa/d7KhaciljnOXC8+b75iddvM9I1z2pbHDbppZMS9 wT4RXL/cFtRb85AfOyPLybcka3f5P2gGvQz38qyimhJYEzHDXZu9VO1Bd20f8+Xf eLBKX0o6yWMhcaPLma8tm0M0zaXHEfLHUKLSOkiOk+eHTWBZ3b/w5nsOQZYZ7uQp 25yaClbameAn7k5dHajduLGEJv//ZjLRWcN3HJWJ5vzO111aHhswpE7JgTZJSVWI ggCVkytD3ESXapvswmACSeCIDMmiJMzvn6JvwuSMVB7a6e5mcqTuGo/FN+DrBF/R IP+/gY/T7zIIOaljhQVkiEIIwiD/akYo0V9fheHTBnqcKEDTHV4WjKbeF6aCwcO+ b8inHyXZSKSMG//UlDuN84/KH/o1l62oKaB1uDIYrrL8JVyjAxctWt3GOt5KgSFq wVz1lMw4kIvWtC/Sy2H4oB+RtODLp6yJDqmvmPkeJwKDUcd/1JKf0KsZ8j3FpGei /rEkBEss0KBKyFAgBSRO2jIpdj2epgcBcsdB/r5mlhcn8L77AS6mHbA173kY4pQ/ Kdg= =TUCJ -----END PGP SIGNATURE----- Merge tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap Pull ioremap updates from Christoph Hellwig: "Remove the ioremap_nocache API (plus wrappers) that are always identical to ioremap" * tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap: remove ioremap_nocache and devm_ioremap_nocache MIPS: define ioremap_nocache to ioremap	2020-01-27 13:03:00 -08:00
Håkon Bugge	ea660ad7c1	IB/mlx4: Fix leak in id_map_find_del Using CX-3 virtual functions, either from a bare-metal machine or pass-through from a VM, MAD packets are proxied through the PF driver. Since the VF drivers have separate name spaces for MAD Transaction Ids (TIDs), the PF driver has to re-map the TIDs and keep the book keeping in a cache. Following the RDMA Connection Manager (CM) protocol, it is clear when an entry has to evicted from the cache. When a DREP is sent from mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similar when a REJ is received by the mlx4_ib_demux_cm_handler(), id_map_find_del() is called. This function wipes out the TID in use from the IDR or XArray and removes the id_map_entry from the table. In short, it does everything except the topping of the cake, which is to remove the entry from the list and free it. In other words, for the REJ case enumerated above, one id_map_entry will be leaked. For the other case above, a DREQ has been received first. The reception of the DREQ will trigger queuing of a delayed work to delete the id_map_entry, for the case where the VM doesn't send back a DREP. In the normal case, the VM _will_ send back a DREP, and id_map_find_del() will be called. But this scenario introduces a secondary leak. First, when the DREQ is received, a delayed work is queued. The VM will then return a DREP, which will call id_map_find_del(). As stated above, this will free the TID used from the XArray or IDR. Now, there is window where that particular TID can be re-allocated, lets say by an outgoing REQ. This TID will later be wiped out by the delayed work, when the function id_map_ent_timeout() is called. But the id_map_entry allocated by the outgoing REQ will not be de-allocated, and we have a leak. Both leaks are fixed by removing the id_map_find_del() function and only using schedule_delayed(). Of course, a check in schedule_delayed() to see if the work already has been queued, has been added. Another benefit of always using the delayed version for deleting entries, is that we do get a TimeWait effect; a TID no longer in use, will occupy the XArray or IDR for CM_CLEANUP_CACHE_TIMEOUT time, without any ability of being re-used for that time period. Fixes: `3cf69cc8db` ("IB/mlx4: Add CM paravirtualization") Link: https://lore.kernel.org/r/20200123155521.1212288-1-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com> Reviewed-by: Rama Nichanamatlu <rama.nichanamatlu@oracle.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-01-27 16:46:53 -04:00
Nathan Chancellor	79ba4f9310	IB/hfi1: Fix logical condition in msix_request_irq Clang warns: drivers/infiniband/hw/hfi1/msix.c:136:22: warning: overlapping comparisons always evaluate to false [-Wtautological-overlap-compare] if (type < IRQ_SDMA && type >= IRQ_OTHER) ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~ 1 warning generated. It is impossible for something to be less than 0 (IRQ_SDMA) and greater than or equal to 3 (IRQ_OTHER) at the same time. A logical OR should have been used to keep the same logic as before. Link: https://lore.kernel.org/r/20200116222658.5285-1-natechancellor@gmail.com Link: https://github.com/ClangBuiltLinux/linux/issues/841 Fixes: `13d2a8384b` ("IB/hfi1: Decouple IRQ name from type") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-01-25 15:33:53 -04:00
Danit Goldberg	4bbd4923d1	IB/mlx5: Return the administrative GUID if exists A user can change the operational GUID (a.k.a affective GUID) through link/infiniband. Therefore it is preferred to return the currently set GUID if it exists instead of the operational. This way the PF can query which VF GUID will be set in the next bind. In order to align with MAC address, zero is returned if administrative GUID is not set. For example, before setting administrative GUID: $ ip link show ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 qdisc mq state UP mode DEFAULT group default qlen 256 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof checking off, NODE_GUID 00:00:00:00:00:00:00:00, PORT_GUID 00:00:00:00:00:00:00:00, link-state auto, trust off, query_rss off Then: $ ip link set ib0 vf 0 node_guid 11:00:af:21:cb:05:11:00 $ ip link set ib0 vf 0 port_guid 22:11:af:21:cb:05:11:00 After setting administrative GUID: $ ip link show ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 qdisc mq state UP mode DEFAULT group default qlen 256 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof checking off, NODE_GUID 11:00:af:21:cb:05:11:00, PORT_GUID 22:11:af:21:cb:05:11:00, link-state auto, trust off, query_rss off Fixes: `9c0015ef09` ("IB/mlx5: Implement callbacks for getting VFs GUID attributes") Link: https://lore.kernel.org/r/20200116120048.12744-1-leon@kernel.org Signed-off-by: Danit Goldberg <danitg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-01-25 14:54:39 -04:00
Jason Gunthorpe	e8b3a426fb	Use ODP MRs for kernel ULPs The following series extends MR creation routines to allow creation of user MRs through kernel ULPs as a proxy. The immediate use case is to allow RDS to work over FS-DAX, which requires ODP (on-demand-paging) MRs to be created and such MRs were not possible to create prior this series. The first part of this patchset extends RDMA to have special verb ib_reg_user_mr(). The common use case that uses this function is a userspace application that allocates memory for HCA access but the responsibility to register the memory at the HCA is on an kernel ULP. This ULP acts as an agent for the userspace application. The second part provides advise MR functionality for ULPs. This is integral part of ODP flows and used to trigger pagefaults in advance to prepare memory before running working set. The third part is actual user of those in-kernel APIs. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQT1m3YD37UfMCUQBNwp8NhrnBAZsQUCXiVO8AAKCRAp8NhrnBAZ scTrAP9gb0d3qv0IOtHw5aGI1DAgjTUn/SzUOnsjDEn7DIoh9gEA2+ZmaEyLXKrl +UcZb31auy5P8ueJYokRLhLAyRcOIAg= =yaHb -----END PGP SIGNATURE----- Merge tag 'rds-odp-for-5.5' into rdma.git for-next From https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma Leon Romanovsky says: ==================== Use ODP MRs for kernel ULPs The following series extends MR creation routines to allow creation of user MRs through kernel ULPs as a proxy. The immediate use case is to allow RDS to work over FS-DAX, which requires ODP (on-demand-paging) MRs to be created and such MRs were not possible to create prior this series. The first part of this patchset extends RDMA to have special verb ib_reg_user_mr(). The common use case that uses this function is a userspace application that allocates memory for HCA access but the responsibility to register the memory at the HCA is on an kernel ULP. This ULP acts as an agent for the userspace application. The second part provides advise MR functionality for ULPs. This is integral part of ODP flows and used to trigger pagefaults in advance to prepare memory before running working set. The third part is actual user of those in-kernel APIs. ==================== * tag 'rds-odp-for-5.5': net/rds: Use prefetch for On-Demand-Paging MR net/rds: Handle ODP mr registration/unregistration net/rds: Detect need of On-Demand-Paging memory registration RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs RDMA/mlx5: Don't fake udata for kernel path IB/mlx5: Add ODP WQE handlers for kernel QPs IB/core: Add interface to advise_mr for kernel users IB/core: Introduce ib_reg_user_mr IB: Allow calls to ib_umem_get from kernel ULPs Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-01-21 09:55:04 -04:00
David S. Miller	ad063075d4	Use ODP MRs for kernel ULPs The following series extends MR creation routines to allow creation of user MRs through kernel ULPs as a proxy. The immediate use case is to allow RDS to work over FS-DAX, which requires ODP (on-demand-paging) MRs to be created and such MRs were not possible to create prior this series. The first part of this patchset extends RDMA to have special verb ib_reg_user_mr(). The common use case that uses this function is a userspace application that allocates memory for HCA access but the responsibility to register the memory at the HCA is on an kernel ULP. This ULP acts as an agent for the userspace application. The second part provides advise MR functionality for ULPs. This is integral part of ODP flows and used to trigger pagefaults in advance to prepare memory before running working set. The third part is actual user of those in-kernel APIs. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQT1m3YD37UfMCUQBNwp8NhrnBAZsQUCXiVO8AAKCRAp8NhrnBAZ scTrAP9gb0d3qv0IOtHw5aGI1DAgjTUn/SzUOnsjDEn7DIoh9gEA2+ZmaEyLXKrl +UcZb31auy5P8ueJYokRLhLAyRcOIAg= =yaHb -----END PGP SIGNATURE----- Merge tag 'rds-odp-for-5.5' of https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma Leon Romanovsky says: ==================== Use ODP MRs for kernel ULPs The following series extends MR creation routines to allow creation of user MRs through kernel ULPs as a proxy. The immediate use case is to allow RDS to work over FS-DAX, which requires ODP (on-demand-paging) MRs to be created and such MRs were not possible to create prior this series. The first part of this patchset extends RDMA to have special verb ib_reg_user_mr(). The common use case that uses this function is a userspace application that allocates memory for HCA access but the responsibility to register the memory at the HCA is on an kernel ULP. This ULP acts as an agent for the userspace application. The second part provides advise MR functionality for ULPs. This is integral part of ODP flows and used to trigger pagefaults in advance to prepare memory before running working set. The third part is actual user of those in-kernel APIs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-21 10:22:51 +01:00
Ingo Molnar	cb6c82df68	Linux 5.5-rc7 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl4k7i8eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGvk0IAKRenVOdiudY77SQ VZjsteyrYTTQtPPv494ToIRjR0XQ+gYp8vyWzXTUC5Nm9Y9U3VzDqUPUjWszrSXE 6mU+tzcMc9qwuUxnIFn8zfg64ygw+37sn/w3xqeH4QmF9Z5Wl3EX3SdXTs7jp3RS VxiztkUNI5ZBV2GDtla5K/9qLPqCQnUYXIiyi5lAtBtiitZDVXFp7dy7hMgEiaEO +78K5Kh3xlt5ndDsBFOlwIb2Oof3KL7bBXntdbSBc/bjol6IRvAgln48HWCv59G2 jzAp2tj2KobX9GRAEPj+v4TQZEW0SXDNDi8MgQsM+3DYVCTmANsv57CBKRuf01+F nB1kAys= =zSnJ -----END PGP SIGNATURE----- Merge tag 'v5.5-rc7' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-01-20 08:43:44 +01:00
David S. Miller	b3f7e3f23a	Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net	2020-01-19 22:10:04 +01:00
Saeed Mahameed	12e9e0d0d9	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux This merge syncs with mlx5-next latest HW bits and layout updates for next features, in addition one patch that improves mlx5_create_auto_grouped_flow_table() API across all mlx5 users. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Refactor mlx5_create_auto_grouped_flow_table net/mlx5e: Add discard counters per priority net/mlx5e: Expose FEC feilds and related capability bit net/mlx5: Add mlx5_ifc definitions for connection tracking support net/mlx5: Add copy header action struct layout net/mlx5: Expose resource dump register mapping net/mlx5: Add structures and defines for MIRC register net/mlx5: Read MCAM register groups 1 and 2 net/mlx5: Add structures layout for new MCAM access reg groups net/mlx5: Expose vDPA emulation device capabilities net/mlx5: Add Virtio Emulation related device capabilities Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-01-16 15:48:24 -08:00
Paul Blakey	61dc7b0141	net/mlx5: Refactor mlx5_create_auto_grouped_flow_table Refactor mlx5_create_auto_grouped_flow_table() to use ft_attr param which already carries the max_fte, prio and flags memebers, and is used the same in similar mlx5_create_flow_table() function. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-01-16 15:41:59 -08:00
Jack Morgenstein	eaad647e5c	IB/mlx4: Fix memory leak in add_gid error flow In procedure mlx4_ib_add_gid(), if the driver is unable to update the FW gid table, there is a memory leak in the driver's copy of the gid table: the gid entry's context buffer is not freed. If such an error occurs, free the entry's context buffer, and mark the entry as available (by setting its context pointer to NULL). Fixes: `e26be1bfef` ("IB/mlx4: Implement ib_device callbacks") Link: https://lore.kernel.org/r/20200115085050.73746-1-leon@kernel.org Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-01-16 16:13:22 -04:00

... 3 4 5 6 7 ...

7064 Commits