linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-23 08:33:45 +07:00

Author	SHA1	Message	Date
Bart Van Assche	2d67017cc7	IB/srpt: Micro-optimize I/O context state manipulation Since all I/O context state changes are already serialized, it is not necessary to protect I/O context state changes with the I/O context spinlock. Hence remove that spinlock. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-08 16:05:12 -05:00
Bart Van Assche	dd3bec8655	IB/srpt: Inline srpt_get_cmd_state() It is not necessary to obtain ioctx->spinlock when reading the ioctx state. Since after removal of this locking only a single line remains, inline the srpt_get_cmd_state() function. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-08 16:05:12 -05:00
Bart Van Assche	b14cb74479	IB/srpt: Introduce srpt_format_guid() Introduce a function for converting a GUID into an ASCII string. This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-08 16:05:12 -05:00
Bart Van Assche	3fa7f0e994	IB/srpt: Reduce frequency of receive failure messages Disabling an SRP target port causes the state of all QPs associated with a port to be changed into IB_QPS_ERR. Avoid that this causes one error message per I/O context to be reported. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-08 16:05:12 -05:00
Bart Van Assche	ea51d2e12a	IB/srpt: Convert a warning into a debug message At least when running the ib_srpt driver on top of the rdma_rxe driver it is easy to trigger a zero-length write completion in the CH_DISCONNECTED state. Hence make the message that reports this less noisy. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-08 16:05:12 -05:00
Bart Van Assche	b185d3f895	IB/srpt: Use the IPv6 format for GIDs in log messages Make the ib_srpt driver use the IPv6 format for GIDs in log messages to improve consistency of this driver with other RDMA kernel drivers. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-08 16:05:12 -05:00
Bart Van Assche	5f1141b165	IB/srpt: Verify port numbers in srpt_event_handler() Verify whether port numbers are in the expected range before using these as an array index. Complain if a port number is out of range. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-08 16:05:12 -05:00
Bart Van Assche	d9f45ae69f	IB/srpt: Reduce the severity level of a log message Since the SRQ event message is only useful for debugging purposes, reduce its severity from "informational" to "debug". Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-08 16:05:12 -05:00
Bart Van Assche	ed262287e2	IB/srpt: Rename a local variable, a member variable and a constant Rename rsp_size into max_rsp_size and SRPT_RQ_SIZE into MAX_SRPT_RQ_SIZE. The new names better reflect the role of this member variable and constant. Since the prefix "srp_" is superfluous in the context of the function that creates an RDMA channel, rename srp_sq_size into sq_size. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-08 16:05:12 -05:00
Bart Van Assche	f8781a5300	IB/srpt: Document all structure members in ib_srpt.h This patch avoids that the following command reports any warnings: scripts/kernel-doc -none drivers/infiniband/ulp/srpt/ib_srpt.h Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-08 16:05:12 -05:00
Bart Van Assche	10eac19bb2	IB/srpt: Fix kernel-doc warnings in ib_srpt.c Avoid that warnings about missing parameter descriptions are reported when building with W=1. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-08 16:05:11 -05:00
Bart Van Assche	1b0bb73f72	IB/srpt: Remove an unused structure member Fixes: commit `a42d985bd5` ("ib_srpt: Initial SRP Target merge for v3.3-rc1") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-01-08 16:05:11 -05:00
Bart Van Assche	a1ffa4670c	IB/srpt: Fix ACL lookup during login Make sure that the initiator port GUID is stored in ch->ini_guid. Note: when initiating a connection sgid and dgid members in struct sa_path_rec represent the source and destination GIDs. When accepting a connection however sgid represents the destination GID and dgid the source GID. Fixes: commit `2bce1a6d22` ("IB/srpt: Accept GUIDs as port names") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: <stable@vger.kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-01-03 20:07:51 -07:00
Bart Van Assche	bec40c2604	IB/srpt: Disable RDMA access by the initiator With the SRP protocol all RDMA operations are initiated by the target. Since no RDMA operations are initiated by the initiator, do not grant the initiator permission to submit RDMA reads or writes to the target. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: <stable@vger.kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-01-03 20:07:21 -07:00
Leon Romanovsky	f8978bd95c	RDMA/netlink: Fix locking around __ib_get_device_by_index Holding locks is mandatory when calling __ib_device_get_by_index, otherwise there are races during the list iteration with device removal. Since the locks are static to device.c, __ib_device_get_by_index can never be called correctly by any user out side the file. Make the function static and provide a safe function that gets the correct locks and returns a kref'd pointer. Fix all callers. Fixes: `e5c9469efc` ("RDMA/netlink: Add nldev device doit implementation") Fixes: `c3f66f7b00` ("RDMA/netlink: Implement nldev port doit callback") Fixes: `7d02f605f0` ("RDMA/netlink: Add nldev port dumpit implementation") Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-01-02 14:11:40 -07:00
Erez Shitrit	16ba3defb8	IB/ipoib: Fix race condition in neigh creation When using enhanced mode for IPoIB, two threads may execute xmit in parallel to two different TX queues while the target is the same. In this case, both of them will add the same neighbor to the path's neigh link list and we might see the following message: list_add double add: new=ffff88024767a348, prev=ffff88024767a348... WARNING: lib/list_debug.c:31__list_add_valid+0x4e/0x70 ipoib_start_xmit+0x477/0x680 [ib_ipoib] dev_hard_start_xmit+0xb9/0x3e0 sch_direct_xmit+0xf9/0x250 __qdisc_run+0x176/0x5d0 __dev_queue_xmit+0x1f5/0xb10 __dev_queue_xmit+0x55/0xb10 Analysis: Two SKB are scheduled to be transmitted from two cores. In ipoib_start_xmit, both gets NULL when calling ipoib_neigh_get. Two calls to neigh_add_path are made. One thread takes the spin-lock and calls ipoib_neigh_alloc which creates the neigh structure, then (after the __path_find) the neigh is added to the path's neigh link list. When the second thread enters the critical section it also calls ipoib_neigh_alloc but in this case it gets the already allocated ipoib_neigh structure, which is already linked to the path's neigh link list and adds it again to the list. Which beside of triggering the list, it creates a loop in the linked list. This loop leads to endless loop inside path_rec_completion. Solution: Check list_empty(&neigh->list) before adding to the list. Add a similar fix in "ipoib_multicast.c::ipoib_mcast_send" Fixes: `b63b70d877` ('IPoIB: Use a private hash table for path lookup in xmit path') Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-01-02 11:09:05 -07:00
Leon Romanovsky	5a371cf87e	IB/mlx4: Fix mlx4_ib_alloc_mr error flow ibmr.device is being set only after ib_alloc_mr() is successfully complete. Therefore, in case imlx4_mr_enable() returns with error, the error flow unwinder calls to mlx4_free_priv_pages(), which uses ibmr.device. Such usage causes to NULL dereference oops and to fix it, the IB device should be set in the mr struct earlier stage (e.g. prior to calling mlx4_free_priv_pages()). Fixes: `1b2cd0fc67` ("IB/mlx4: Support the new memory registration API") Signed-off-by: Nitzan Carmi <nitzanc@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-01-02 11:09:05 -07:00
Nitzan Carmi	45e6ae7ef2	IB/mlx5: Fix mlx5_ib_alloc_mr error flow ibmr.device is being set only after ib_alloc_mr() is (successfully) complete. Therefore, in case mlx5_core_create_mkey() return with error, the error flow calls mlx5_free_priv_descs() which uses ibmr.device (which doesn't exist yet), causing a NULL dereference oops. To fix this, the IB device should be set in the mr struct earlier stage (e.g. prior to calling mlx5_core_create_mkey()). Fixes: `8a187ee52b` ("IB/mlx5: Support the new memory registration API") Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Nitzan Carmi <nitzanc@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-27 15:24:41 -07:00
Moni Shoua	4a50881bba	IB/core: Verify that QP is security enabled in create and destroy The XRC target QP create flow sets up qp_sec only if there is an IB link with LSM security enabled. However, several other related uAPI entry points blindly follow the qp_sec NULL pointer, resulting in a possible oops. Check for NULL before using qp_sec. Cc: <stable@vger.kernel.org> # v4.12 Fixes: `d291f1a652` ("IB/core: Enforce PKey security on QPs") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-27 15:24:41 -07:00
Moni Shoua	05d14e7b0c	IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp() If the input command length is larger than the kernel supports an error should be returned in case the unsupported bytes are not cleared, instead of the other way aroudn. This matches what all other callers of ib_is_udata_cleared do and will avoid user ABI problems in the future. Cc: <stable@vger.kernel.org> # v4.10 Fixes: `189aba99e7` ("IB/uverbs: Extend modify_qp and support packet pacing") Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-27 15:24:41 -07:00
Majd Dibbiny	ad9a3668a4	IB/mlx5: Serialize access to the VMA list User-space applications can do mmap and munmap directly at any time. Since the VMA list is not protected with a mutex, concurrent accesses to the VMA list from the mmap and munmap can cause data corruption. Add a mutex around the list. Cc: <stable@vger.kernel.org> # v4.7 Fixes: `7c2344c3bb` ("IB/mlx5: Implements disassociate_ucontext API") Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-27 15:24:40 -07:00
Michael J. Ruhl	4c009af473	IB/hfi: Only read capability registers if the capability exists During driver init, various registers are saved to allow restoration after an FLR or gen3 bump. Some of these registers are not available in some circumstances (i.e. Virtual machines). This bug makes the driver unusable when the PCI device is passed into a VM, it fails during probe. Delete unnecessary register read/write, and only access register if the capability exists. Cc: <stable@vger.kernel.org> # 4.14.x Fixes: `a618b7e40a` ("IB/hfi1: Move saving PCI values to a separate function") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-22 10:42:08 -07:00
Alex Vesker	1f80bd6a6c	IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush The locking order of vlan_rwsem (LOCK A) and then rtnl (LOCK B), contradicts other flows such as ipoib_open possibly causing a deadlock. To prevent this deadlock heavy flush is called with RTNL locked and only then tries to acquire vlan_rwsem. This deadlock is possible only when there are child interfaces. [ 140.941758] ====================================================== [ 140.946276] WARNING: possible circular locking dependency detected [ 140.950950] 4.15.0-rc1+ #9 Tainted: G O [ 140.954797] ------------------------------------------------------ [ 140.959424] kworker/u32:1/146 is trying to acquire lock: [ 140.963450] (rtnl_mutex){+.+.}, at: [<ffffffffc083516a>] __ipoib_ib_dev_flush+0x2da/0x4e0 [ib_ipoib] [ 140.970006] but task is already holding lock: [ 140.975141] (&priv->vlan_rwsem){++++}, at: [<ffffffffc0834ee1>] __ipoib_ib_dev_flush+0x51/0x4e0 [ib_ipoib] [ 140.982105] which lock already depends on the new lock. [ 140.990023] the existing dependency chain (in reverse order) is: [ 140.998650] -> #1 (&priv->vlan_rwsem){++++}: [ 141.005276] down_read+0x4d/0xb0 [ 141.009560] ipoib_open+0xad/0x120 [ib_ipoib] [ 141.014400] __dev_open+0xcb/0x140 [ 141.017919] __dev_change_flags+0x1a4/0x1e0 [ 141.022133] dev_change_flags+0x23/0x60 [ 141.025695] devinet_ioctl+0x704/0x7d0 [ 141.029156] sock_do_ioctl+0x20/0x50 [ 141.032526] sock_ioctl+0x221/0x300 [ 141.036079] do_vfs_ioctl+0xa6/0x6d0 [ 141.039656] SyS_ioctl+0x74/0x80 [ 141.042811] entry_SYSCALL_64_fastpath+0x1f/0x96 [ 141.046891] -> #0 (rtnl_mutex){+.+.}: [ 141.051701] lock_acquire+0xd4/0x220 [ 141.055212] __mutex_lock+0x88/0x970 [ 141.058631] __ipoib_ib_dev_flush+0x2da/0x4e0 [ib_ipoib] [ 141.063160] __ipoib_ib_dev_flush+0x71/0x4e0 [ib_ipoib] [ 141.067648] process_one_work+0x1f5/0x610 [ 141.071429] worker_thread+0x4a/0x3f0 [ 141.074890] kthread+0x141/0x180 [ 141.078085] ret_from_fork+0x24/0x30 [ 141.081559] other info that might help us debug this: [ 141.088967] Possible unsafe locking scenario: [ 141.094280] CPU0 CPU1 [ 141.097953] ---- ---- [ 141.101640] lock(&priv->vlan_rwsem); [ 141.104771] lock(rtnl_mutex); [ 141.109207] lock(&priv->vlan_rwsem); [ 141.114032] lock(rtnl_mutex); [ 141.116800] * DEADLOCK * Fixes: `b4b678b06f` ("IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-21 16:06:07 -07:00
Majd Dibbiny	71a0ff65a2	IB/mlx5: Fix congestion counters in LAG mode Congestion counters are counted and queried per physical function. When working in LAG mode, CNP packets can be sent or received on both of the functions, thus congestion counters should be aggregated from the two physical functions. Fixes: `e1f24a79f4` ("IB/mlx5: Support congestion related counters") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-21 16:06:07 -07:00
Bryan Tan	e3524b269e	RDMA/vmw_pvrdma: Avoid use after free due to QP/CQ/SRQ destroy The use of wait queues in vmw_pvrdma for handling concurrent access to a resource leaves a race condition which can cause a use after free bug. Fix this by using the pattern from other drivers, complete() protected by dec_and_test to ensure complete() is called only once. Fixes: `29c8d9eba5` ("IB: Add vmw_pvrdma driver") Signed-off-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-21 16:06:07 -07:00
Bryan Tan	30a366a9da	RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warning refcount_dec generates a warning when the operation causes the refcount to hit zero. Avoid this by using refcount_dec_and_test. Fixes: `8b10ba783c` ("RDMA/vmw_pvrdma: Add shared receive queue support") Reviewed-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Aditya Sarwade <asarwade@vmware.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Bryan Tan <bryantan@vmware.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-21 16:06:07 -07:00
Bryan Tan	17748056ce	RDMA/vmw_pvrdma: Call ib_umem_release on destroy QP path The QP cleanup did not previously call ib_umem_release, resulting in a user-triggerable kernel resource leak. Fixes: `29c8d9eba5` ("IB: Add vmw_pvrdma driver") Reviewed-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Aditya Sarwade <asarwade@vmware.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-21 16:06:06 -07:00
Steve Wise	d145873345	iw_cxgb4: when flushing, complete all wrs in a chain If a wr chain was posted and needed to be flushed, only the first wr in the chain was completed with FLUSHED status. The rest were never completed. This caused isert to hang on shutdown due to the missing completions which left iscsi IO commands referenced, stalling the shutdown. Fixes: `4fe7c2962e` ("iw_cxgb4: refactor sq/rq drain logic") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-21 16:06:06 -07:00
Steve Wise	96a236ed28	iw_cxgb4: reflect the original WR opcode in drain cqes The flush/drain logic was not retaining the original wr opcode in its completion. This can cause problems if the application uses the completion opcode to make decisions. Use bit 10 of the CQE header word to indicate the CQE is a special drain completion, and save the original WR opcode in the cqe header opcode field. Fixes: `4fe7c2962e` ("iw_cxgb4: refactor sq/rq drain logic") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-21 16:06:06 -07:00
Steve Wise	f55688c454	iw_cxgb4: Only validate the MSN for successful completions If the RECV CQE is in error, ignore the MSN check. This was causing recvs that were flushed into the sw cq to be completed with the wrong status (BAD_MSN instead of FLUSHED). Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-21 16:06:06 -07:00
Yuval Shaia	9d98e19ba0	IB/ipoib: Restore MM behavior in case of tx_ring allocation failure memalloc_noio_save modifies the behavior of MM, we must restore it after we are done. Fixes: `d83187dda9` ("IB/IPoIB: Convert IPoIB to memalloc_noio_* calls") Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-13 10:31:57 -07:00
Steve Wise	c058ecf6e4	iw_cxgb4: only insert drain cqes if wq is flushed Only insert our special drain CQEs to support ib_drain_sq/rq() after the wq is flushed. Otherwise, existing but not yet polled CQEs can be returned out of order to the user application. This can happen when the QP has exited RTS but not yet flushed the QP, which can happen during a normal close (vs abortive close). In addition never count the drain CQEs when determining how many CQEs need to be synthesized during the flush operation. This latter issue should never happen if the QP is properly flushed before inserting the drain CQE, but I wanted to avoid corrupting the CQ state. So we handle it and log a warning once. Fixes: `4fe7c2962e` ("iw_cxgb4: refactor sq/rq drain logic") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Cc: stable@vger.kernel.org Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-11 15:33:51 -07:00
Steve Wise	335ebf6fa3	iw_cxgb4: only clear the ARMED bit if a notification is needed In __flush_qp(), the CQ ARMED bit was being cleared regardless of whether any notification is actually needed. This resulted in the iser termination logic getting stuck in ib_drain_sq() because the CQ was not marked ARMED and thus the drain CQE notification wasn't triggered. This new bug was exposed when this commit was merged: commit `cbb40fadd3` ("iw_cxgb4: only call the cq comp_handler when the cq is armed") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-07 14:09:59 -07:00
Leon Romanovsky	d0e312fe3d	RDMA/netlink: Fix general protection fault The RDMA netlink core code checks validity of messages by ensuring that type and operand are in range. It works well for almost all clients except NLDEV, which has cb_table less than number of operands. Request to access such operand will trigger the following kernel panic. This patch updates all places where cb_table is declared for the consistency, but only NLDEV is actually need it. general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN Modules linked in: CPU: 0 PID: 522 Comm: syz-executor6 Not tainted 4.13.0+ #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 task: ffff8800657799c0 task.stack: ffff8800695d000 RIP: 0010:rdma_nl_rcv_msg+0x13a/0x4c0 RSP: 0018:ffff8800695d7838 EFLAGS: 00010207 RAX: dffffc0000000000 RBX: 1ffff1000d2baf0b RCX: 00000000704ff4d7 RDX: 0000000000000000 RSI: ffffffff81ddb03c RDI: 00000003827fa6bc RBP: ffff8800695d7900 R08: ffffffff82ec0578 R09: 0000000000000000 R10: ffff8800695d7900 R11: 0000000000000001 R12: 000000000000001c R13: ffff880069d31e00 R14: 00000000ffffffff R15: ffff880069d357c0 FS: 00007fee6acb8700(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000201a9000 CR3: 0000000059766000 CR4: 00000000000006b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? rdma_nl_multicast+0x80/0x80 rdma_nl_rcv+0x36b/0x4d0 ? ibnl_put_attr+0xc0/0xc0 netlink_unicast+0x4bd/0x6d0 ? netlink_sendskb+0x50/0x50 ? drop_futex_key_refs.isra.4+0x68/0xb0 netlink_sendmsg+0x9ab/0xbd0 ? nlmsg_notify+0x140/0x140 ? wake_up_q+0xa1/0xf0 ? drop_futex_key_refs.isra.4+0x68/0xb0 sock_sendmsg+0x88/0xd0 sock_write_iter+0x228/0x3c0 ? sock_sendmsg+0xd0/0xd0 ? do_futex+0x3e5/0xb20 ? iov_iter_init+0xaf/0x1d0 __vfs_write+0x46e/0x640 ? sched_clock_cpu+0x1b/0x190 ? __vfs_read+0x620/0x620 ? __fget+0x23a/0x390 ? rw_verify_area+0xca/0x290 vfs_write+0x192/0x490 SyS_write+0xde/0x1c0 ? SyS_read+0x1c0/0x1c0 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL_64_fastpath+0x18/0xad RIP: 0033:0x7fee6a74a219 RSP: 002b:00007fee6acb7d58 EFLAGS: 00000212 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000638000 RCX: 00007fee6a74a219 RDX: 0000000000000078 RSI: 0000000020141000 RDI: 0000000000000006 RBP: 0000000000000046 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000212 R12: ffff8800695d7f98 R13: 0000000020141000 R14: 0000000000000006 R15: 00000000ffffffff Code: d6 48 b8 00 00 00 00 00 fc ff df 66 41 81 e4 ff 03 44 8d 72 ff 4a 8d 3c b5 c0 a6 7f 82 44 89 b5 4c ff ff ff 48 89 f9 48 c1 e9 03 <0f> b6 0c 01 48 89 f8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f 85 RIP: rdma_nl_rcv_msg+0x13a/0x4c0 RSP: ffff8800695d7838 ---[ end trace ba085d123959c8ec ]--- Kernel panic - not syncing: Fatal exception Cc: syzkaller <syzkaller@googlegroups.com> Fixes: `b4c598a67e` ("RDMA/netlink: Implement nldev device dumpit calback") Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-12-07 15:28:07 -05:00
Guy Levi	4d02ebd9bb	IB/mlx4: Fix RSS hash fields restrictions Mistakenly the driver didn't allow RSS hash fields combinations which involve both IPv4 and IPv6 protocols. This bug caused to failures for user's use cases for RSS. Consequently, this patch fixes this bug and allows any combination that the HW can support. Additionally, the patch fixes the driver to return an error in case the user provides an unsupported mask for RSS hash fields. Fixes: `3078f5f1bd` ("IB/mlx4: Add support for RSS QP") Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-12-07 15:28:07 -05:00
Daniel Jurgens	0fbe8f575b	IB/core: Don't enforce PKey security on SMI MADs Per the infiniband spec an SMI MAD can have any PKey. Checking the pkey on SMI MADs is not necessary, and it seems that some older adapters using the mthca driver don't follow the convention of using the default PKey, resulting in false denials, or errors querying the PKey cache. SMI MAD security is still enforced, only agents allowed to manage the subnet are able to receive or send SMI MADs. Reported-by: Chris Blake <chrisrblake93@gmail.com> Cc: <stable@vger.kernel.org> # v4.12 Fixes: `47a2b338fe` ("IB/core: Enforce security on management datagrams") Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-12-07 15:28:06 -05:00
Daniel Jurgens	4cae8ff136	IB/core: Bound check alternate path port number The alternate port number is used as an array index in the IB security implementation, invalid values can result in a kernel panic. Cc: <stable@vger.kernel.org> # v4.12 Fixes: `d291f1a652` ("IB/core: Enforce PKey security on QPs") Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-12-07 15:28:06 -05:00
Daniel Jurgens	315d160c5a	IB/core: Only enforce security for InfiniBand For now the only LSM security enforcement mechanism available is specific to InfiniBand. Bypass enforcement for non-IB link types. This fixes a regression where modify_qp fails for iWARP because querying the PKEY returns -EINVAL. Cc: Paul Moore <paul@paul-moore.com> Cc: Don Dutile <ddutile@redhat.com> Cc: stable@vger.kernel.org Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Fixes: d291f1a65232("IB/core: Enforce PKey security on QPs") Fixes: 47a2b338fe63("IB/core: Enforce security on management datagrams") Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-01 12:21:28 -07:00
$Wei Hu$Xavier$$ Wei Hu$Xavier$	378efe798e	RDMA/hns: Get rid of page operation after dma_alloc_coherent In general, dma_alloc_coherent() returns a CPU virtual address and a DMA address, and we have no guarantee that the underlying memory even has an associated struct page at all. This patch gets rid of the page operation after dma_alloc_coherent, and records the VA returned form dma_alloc_coherent in the struct of hem in hns RoCE driver. Fixes: 9a44353("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-01 12:21:27 -07:00
$Wei Hu$Xavier$$ Wei Hu$Xavier$	b1c1583509	RDMA/hns: Get rid of virt_to_page and vmap calls after dma_alloc_coherent In general dma_alloc_coherent() returns a CPU virtual address and a DMA address, and we have no guarantee that the virtual address is either in the linear map or vmalloc. It could be in some other special place. We have no guarantee that the underlying memory even has an associated struct page at all. In current code, there are incorrect usage as below: dma_alloc_coherent + virt_to_page + vmap. There will probably introduce coherency problem. This patch fixes it to get rid of virt_to_page and vmap calls at Leon's suggestion. The related link: https://lkml.org/lkml/2017/11/7/34 Fixes: 9a44353("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-01 12:21:27 -07:00
$Wei Hu$Xavier$$ Wei Hu$Xavier$	db270c4190	RDMA/hns: Fix the issue of IOVA not page continuous in hip08 If the smmu is enabled, the length of sg obtained from __iommu_map_sg_attrs is not 4kB. When the IOVA is set with the sg dma address, the IOVA will not be page continuous. so, the current code has MTPT configuration error that probably cause dma operation failure. In order to fix this issue, the IOVA should be calculated based on the sg length. Fixes: 3958cc5("RDMA/hns: Configure the MTPT in hip08") Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Xiping Zhang (Francis) <zhangxiping3@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-01 12:21:27 -07:00
Dmitry Monakhov	a9cd1a6737	IB/core: Init subsys if compiled to vmlinuz-core Once infiniband is compiled as a core component its subsystem must be enabled before device initialization. Otherwise there is a NULL pointer dereference during mlx4_core init, calltrace: ->device_add if (dev->class) { deref dev->class->p =>NULLPTR #Config CONFIG_NET_DEVLINK=y CONFIG_MAY_USE_DEVLINK=y CONFIG_MLX4_EN=y Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-01 12:21:26 -07:00
Moni Shoua	23a9cd2ad9	RDMA/cma: Make sure that PSN is not over max allowed This patch limits the initial value for PSN to 24 bits as spec requires. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com> Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-01 12:21:26 -07:00
Henry Orosco	a7c6dfe215	i40iw: Notify user of established connection after QP in RTS Established CM event is sent prior to modifying QP to RTS state. This can result in application closing the connection before the QP is actually in RTS state. Move sending of established CM event to after modify QP to RTS. Fixes: `f27b4746f3` ("i40iw: add connection management code") Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-01 12:21:26 -07:00
Tatyana Nikolova	8bb45252bb	i40iw: Move MPA request event for loopback after connect For loopback, a MPA request event is generated when cm_node is initialized, which allows applications to act on the connect request before i40iw_connect() has completed. In some cases, the reject flow executes in parallel with the connect flow and doesn't delete an APBVT entry, because the apbvt_set variable is still not set by the connect flow. Move the MPA request event to the end of i40iw_connect() to notify application for a connect request, after connect has completed. Fixes: `f27b4746f3` ("i40iw: add connection management code") Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Henry Orosco <henry.orosco@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-01 12:21:25 -07:00
Mustafa Ismail	a283cdc4d3	i40iw: Correct ARP index mask The ARP table entry indexes are aliased to 12bits instead of the intended 16bits when uploaded to the QP Context. This will present an issue when the number of connections exceeds 4096 as ARP entries are reused. Fix this by adjusting the mask to account for the full 16bits. Fixes: `4e9042e647` ("i40iw: add hw and utils files") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-01 12:21:25 -07:00
Mustafa Ismail	10499986db	i40iw: Do not free sqbuf when event is I40IW_TIMER_TYPE_CLOSE When the event type is I40IW_TIMER_TYPE_CLOSE, there is no sqbuf and it should not be freed as one in i40iw_schedule_cm_timer(). Fixes: `f27b4746f3` ("i40iw: add connection management code") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-01 12:21:25 -07:00
Chien Tin Tung	100d6de2ce	i40iw: Allocate a sdbuf per CQP WQE Currently there is only one sdbuf per Control QP (CQP) for programming Segment Descriptor (SD). If multiple SD work requests are posted simultaneously, the sdbuf is reused by all WQEs and new WQEs can corrupt previous WQEs sdbuf leading to incorrect SD programming. Fix this by allocating one sdbuf per CQP SQ WQE. When an SD command is posted, it will use the corresponding sdbuf for the WQE. Fixes: `86dbcd0f12` ("i40iw: add file to handle cqp calls") Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-12-01 12:21:24 -07:00
Geert Uytterhoeven	db0acbc475	IB: INFINIBAND should depend on HAS_DMA If NO_DMA=y: ERROR: "bad_dma_ops" [net/sunrpc/xprtrdma/rpcrdma.ko] undefined! ERROR: "bad_dma_ops" [net/smc/smc.ko] undefined! ERROR: "bad_dma_ops" [net/rds/rds_rdma.ko] undefined! ERROR: "bad_dma_ops" [net/9p/9pnet_rdma.ko] undefined! ERROR: "bad_dma_ops" [drivers/nvme/target/nvmet-rdma.ko] undefined! ERROR: "bad_dma_ops" [drivers/nvme/host/nvme-rdma.ko] undefined! ERROR: "bad_dma_ops" [drivers/infiniband/ulp/srpt/ib_srpt.ko] undefined! ERROR: "bad_dma_ops" [drivers/infiniband/ulp/srp/ib_srp.ko] undefined! ERROR: "bad_dma_ops" [drivers/infiniband/ulp/isert/ib_isert.ko] undefined! ERROR: "bad_dma_ops" [drivers/infiniband/ulp/iser/ib_iser.ko] undefined! ERROR: "bad_dma_ops" [drivers/infiniband/ulp/ipoib/ib_ipoib.ko] undefined! ERROR: "bad_dma_ops" [drivers/infiniband/core/ib_core.ko] undefined! Before, this was handled implicitly by the dependency on PCI. Add an explicit dependency on HAS_DMA to fix this. Fixes: `931bc0d916` ("IB: Move PCI dependency from root KConfig to HW's KConfigs") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-11-30 16:01:28 -07:00
Dennis Dalessandro	8935780b9f	IB/hfi1: Initialize bth1 in 16B rc ack builder It is possible the bth1 variable could be used uninitialized so going ahead and giving it a default value. Otherwise we leak stack memory to the network. Fixes: `5b6cabb0db` ("IB/hfi1: Add 16B RC/UC support") Reviewed-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2017-11-30 16:01:28 -07:00

1 2 3 4 5 ...

721767 Commits