This organizes the write commands into objects and links them to the
uverbs_api data structure. The command path is reworked to use uapi
instead of its internal structures.
The command mask is moved from a runtime check to a registration time
check in the uapi.
Since the write interface does not have the object ID as part of the
command, the radix bins are converted into linear lists to support the
lookup.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Bringing all uapi entry points into one place lets us deal with them
consistently. For instance the write, write_ex and ioctl paths can be
disabled when an API is not supported by the driver.
This will replace the uverbs_cmd_table static arrays.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
If we can't destroy the object then we certainly shouldn't allow it be
created or used. Remove it from the uverbs_uapi in this case.
This also disables methods of other objects that have mandatory object
handle inputs - ie REG_DM_MR is now automatically removed if DM objects
cannot be created.
Typically drivers not supporting an interface will mark all of the
supporting functions as NULL, including destroy.
This is intended to automatically eliminate entire corner cases in the API
that are difficult to test.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Rely on UAPI_DEF_IS_OBJ_SUPPORTED instead of manipulating the contents of
the driver's definition list.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
We have many cases where parts of the uapi are not supported in a driver,
needs a certain protocol, or whatever. It is best to reflect this directly
into the struct uverbs_api when it is built so that everything is simply
blocked off, and future introspection can report a proper supported list.
This is done by adding some additional helpers to the definition list
language that disable objects based on a 'supported' call back, and a
helper that disables based on a NULL struct ib_device function pointer.
Disablement is global. For instance, if a driver disables an object then
everything connected to that object is removed, including core methods.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
The next patch needs another copy of this, provide a simple helper to
reduce the coding. uapi_add_get_elm() returns an existing entry or adds a
new one.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
The 'tree' data structure is very hard to build at compile time, and this
makes it very limited. The new radix tree based compiler can handle a more
complex input language that does not require the compiler to perfectly
group everything into a neat tree structure.
Instead use a simple list to describe to input, where the list elements
can be of various different 'opcodes' instructing the radix compiler what
to do. Start out with opcodes chaining to other definition lists and
chaining to the existing 'tree' definition.
Replace the very top level of the 'object tree' with this list type and
get rid of struct uverbs_object_tree_def and DECLARE_UVERBS_OBJECT_TREE.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
For DM there is no reason not to add the spec for the START_OFFSET, if DM
is not supported then ib_dev.alloc_dm is already set to NULL which ensures
we do not call the method.
For IPSEC, the core code should be setting ib_dev.create_flow_action_esp
to NULL to disable it, not relying on wonky manipulation of the specs.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
The mlx4 driver does not trigger an IB_EVENT_PORT_ACTIVE when the RoCE
network interface is activated. When SMC determines the RoCE device port
to be used, it checks the port states. This patch triggers IB events for
NETDEV_UP and NETDEV_DOWN.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Only retry connection setup with MPAv1 if the peer actually aborted the
connection upon receiving the MPAv2 start message. This avoids retrying
with MPAv1 in the case where the connection was aborted due to retransmit
timeouts.
Fixes: d2fe99e86b ("RDMA/cxgb4: Add support for MPAv2 Enhanced RDMA Negotiation")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Since the function always returns 0 make it void.
Reported-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
There is no need to have the 'struct se_portal_group *tpg' variable static
since new value always be assigned before use.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
mlx5 updates taken for dependencies on later ODP patches.
Conflict resolved by deleting mlx5_ib_get_vector_affinity()
* branch 'mlx5-next': (21 commits)
net/mlx5: EQ, Make EQE access methods inline
{net,IB}/mlx5: Move Page fault EQ and ODP logic to RDMA
net/mlx5: EQ, Generic EQ
net/mlx5: EQ, Different EQ types
net/mlx5: EQ, Privatize eq_table and friends
net/mlx5: EQ, irq_info and rmap belong to eq_table
net/mlx5: EQ, Create all EQs in one place
net/mlx5: EQ, Move all EQ logic to eq.c
net/mlx5: EQ, Remove redundant completion EQ list lock
net/mlx5: EQ, No need to store eq index as a field
net/mlx5: EQ, Remove unused fields and structures
net/mlx5: EQ, Use the right place to store/read IRQ affinity hint
IB/mlx5: Improve ODP debugging messages
net/mlx5: Use multi threaded workqueue for page fault handling
net/mlx5: Return success for PAGE_FAULT_RESUME in internal error state
IB/mlx5: Lock QP during page fault handling
net/mlx5: Enumerate page fault types
net/mlx5: Add interface to hold and release core resources
net/mlx5: Release resource on error flow
net/mlx5: Fix offsets of ifc reserved fields
...
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This is required so the user can set the SL on the DC QP.
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Reviewed-by: Yossi Itigin <yosefe@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Use the new generic EQ API to move all ODP RDMA data structures and logic
form mlx5 core driver into mlx5_ib driver.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Move unnecessary EQ table structures and declaration from the
public include/linux/mlx5/driver.h into the private area of mlx5_core
and into eq.c/eq.h.
Introduce new mlx5 EQ APIs:
mlx5_comp_vectors_count(dev);
mlx5_comp_irq_get_affinity_mask(dev, vector);
And use them from mlx5_ib or mlx5e netdevice instead of direct access to
mlx5_core internal structures.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Add and modify debug messages to ODP related error flows.
In that context, return code EAGAIN is considered less severe and print
level for it is set debug instead of warn.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
When page fault event for a WQE arrives, the event data contains the
resource (e.g. QP) number which will later be used by the page fault
handler to retrieve the resource. Meanwhile, another context can destroy
the resource and cause use-after-free. To avoid that, take a reference on the
resource when handler starts and release it when it ends.
Page fault events for RDMA operations don't need to be protected because
the driver doesn't need to access the QP in the page fault handler.
Fixes: d9aaed8387 ("{net,IB}/mlx5: Refactor page fault handling")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Since the function rxe_unregister_device always returns 0, it is changed
to void.
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The variable rxe is only used in the function rxe_xmit_packet, and the
caller functions do not use it. So move this variable into the function
rxe_xmit_packet.
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
link_down is self-explanatory.
rdma_sends and rdma_recvs count the number of RDMA Send and RDMA Receive
operations completed successfully. This is different from the existing
sent_pkts and rcvd_pkts counters because the existing counters measure
packets, not RDMA operations.
ack_deffered is renamed to ack_deferred to fix the spelling.
out_of_sequence is renamed to out_of_seq_request to make clear that it is
counting only requests and not other packets which can be out of sequence.
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In ib_query_port(), use the netdev's IFF_UP flag to determine phys_state
(flag set = down = POLLING, flag clear = disabled = DISABLED).
Callers can then use the phys_state field to distinguish between links
which have a dead partner, cable missing, etc., from links which are
turned off on the local node. This is useful for HA and supportability.
Signed-off-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Devices that does not use managed affinity can not export a vector
affinity as the consumer relies on having a static mapping it can map to
upper layer affinity (e.g. sw queues). If the driver allows the user to
set the device irq affinity, then the affinitization of a long term
existing entites is not relevant.
For example, nvme-rdma controllers queue-irq affinitization is determined
at init time so if the irq affinity changes over time, we are no longer
aligned.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Devices that does not use managed affinity can not export a vector
affinity as the consumer relies on having a static mapping it can map to
upper layer affinity (e.g. sw queues). If the driver allows the user to
set the device irq affinity, then the affinitization of a long term
existing entites is not relevant.
For example, nvme-rdma controllers queue-irq affinitization is determined
at init time so if the irq affinity changes over time, we are no longer
aligned.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Error completions must still contain a valid wr_id and
qp_num such that the consumer can rely on. Correctly
fill these fields in receive error completions.
Reported-by: Walker Benjamin <benjamin.walker@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Tested-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When resp is in error state, the queued SKBs will not be handled.
The function get_req cleans up the skb queue directly.
CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This is a mechanical transformation, no change in logic.
Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Revert 5ff7091f5a ("mm, mmu_notifier: annotate mmu notifiers with
blockable invalidate callbacks").
MMU_INVALIDATE_DOES_NOT_BLOCK flags was the only one used and it is no
longer needed since 93065ac753 ("mm, oom: distinguish blockable mode for
mmu notifiers"). We now have a full support for per range !blocking
behavior so we can drop the stop gap workaround which the per notifier
flag was used for.
Link: http://lkml.kernel.org/r/20180827112623.8992-4-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This has been a smaller cycle with many of the commits being smallish code
fixes and improvements across the drivers.
- Driver updates for bnxt_re, cxgb4, hfi1, hns, mlx5, nes, qedr, and rxe
- Memory window support in hns
- mlx5 user API 'flow mutate/steering' allows accessing the full packet
mangling and matching machinery from user space
- Support inter-working with verbs API calls in the 'devx' mlx5 user API, and
provide options to use devx with less privilege
- Modernize the use of syfs and the device interface to use attribute groups
and cdev properly for uverbs, and clean up some of the core code's device list
management
- More progress on net namespaces for RDMA devices
- Consolidate driver BAR mmapping support into core code helpers and rework
how RDMA holds poitners to mm_struct for get_user_pages cases
- First pass to use 'dev_name' instead of ib_device->name
- Device renaming for RDMA devices
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlvR7dUACgkQOG33FX4g
mxojiw//a9GU5kq4IZ3LNAEio/3Ql/NHRF0uie5tSzJgipRJA1Ln9zW0Cm1S/ms1
VCmaSJ3l3q3GC4i3tIlsZSIIkN5qtjv/FsT/i+TZwSJYx9BDpPbzWtG6Mp4PSDj0
v3xzklFCN5HMOmEcjkNmyZw3VjHOt2Iw2mKjqvGbI9imCPLOYnw+WQaZLmMWMH6p
GL0HDbAopN5Lv8ireWd8pOhPLVbSb12cWM1crx+yHOS3q8YNWjIXGiZr/QkOPtPr
cymSXB8yuITJ7gnjbs/GxZHg6rxU0knC/Ck8hE7FqqYYHgytTklOXDE2ef1J2lFe
1VmotD+nTsCir0mZWSdcRrszEk7tzaZT7n1oWggKvWySDB6qaH0II8vWumJchQnN
pElIQn/WDgpekIqplamNqXJnKnDXZJpEVA01OHHDN4MNSc+Ad08hQy4FyFzpB6/G
jv9TnDMfGC6ma9pr1ipOXyCgCa2pHYEUCaYxUqRA0O/4ATVl7/PplqT0rqtJ6hKg
o/hmaVCawIFOUKD87/bo7Em2HBs3xNwE/c5ggbsQElLYeydrgPrZfrPfjkshv5K3
eIKDb+HPyis0is1aiF7m/bz1hSIYZp0bQhuKCdzLRjZobwCm5WDPhtuuAWb7vYVw
GSLCJWyet+bLyZxynNOt67gKm9je9lt8YTr5nilz49KeDytspK0=
=pacJ
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"This has been a smaller cycle with many of the commits being smallish
code fixes and improvements across the drivers.
- Driver updates for bnxt_re, cxgb4, hfi1, hns, mlx5, nes, qedr, and
rxe
- Memory window support in hns
- mlx5 user API 'flow mutate/steering' allows accessing the full
packet mangling and matching machinery from user space
- Support inter-working with verbs API calls in the 'devx' mlx5 user
API, and provide options to use devx with less privilege
- Modernize the use of syfs and the device interface to use attribute
groups and cdev properly for uverbs, and clean up some of the core
code's device list management
- More progress on net namespaces for RDMA devices
- Consolidate driver BAR mmapping support into core code helpers and
rework how RDMA holds poitners to mm_struct for get_user_pages
cases
- First pass to use 'dev_name' instead of ib_device->name
- Device renaming for RDMA devices"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (242 commits)
IB/mlx5: Add support for extended atomic operations
RDMA/core: Fix comment for hw stats init for port == 0
RDMA/core: Refactor ib_register_device() function
RDMA/core: Fix unwinding flow in case of error to register device
ib_srp: Remove WARN_ON in srp_terminate_io()
IB/mlx5: Allow scatter to CQE without global signaled WRs
IB/mlx5: Verify that driver supports user flags
IB/mlx5: Support scatter to CQE for DC transport type
RDMA/drivers: Use core provided API for registering device attributes
RDMA/core: Allow existing drivers to set one sysfs group per device
IB/rxe: Remove unnecessary enum values
RDMA/umad: Use kernel API to allocate umad indexes
RDMA/uverbs: Use kernel API to allocate uverbs indexes
RDMA/core: Increase total number of RDMA ports across all devices
IB/mlx4: Add port and TID to MAD debug print
IB/mlx4: Enable debug print of SMPs
RDMA/core: Rename ports_parent to ports_kobj
RDMA/core: Do not expose unsupported counters
IB/mlx4: Refer to the device kobject instead of ports_parent
RDMA/nldev: Allow IB device rename through RDMA netlink
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlvPV7IUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vyaUg//WnCaRIu2oKOp8c/bplZJDW5eT10d
oYAN9qeyptU9RYrg4KBNbZL9UKGFTk3AoN5AUjrk8njxc/dY2ra/79esOvZyyYQy
qLXBvrXKg3yZnlNlnyBneGSnUVwv/kl2hZS+kmYby2YOa8AH/mhU0FIFvsnfRK2I
XvwABFm2ZYvXCqh3e5HXaHhOsR88NQ9In0AXVC7zHGqv1r/bMVn2YzPZHL/zzMrF
mS79tdBTH+shSvchH9zvfgIs+UEKvvjEJsG2liwMkcQaV41i5dZjSKTdJ3EaD/Y2
BreLxXRnRYGUkBqfcon16Yx+P6VCefDRLa+RhwYO3dxFF2N4ZpblbkIdBATwKLjL
npiGc6R8yFjTmZU0/7olMyMCm7igIBmDvWPcsKEE8R4PezwoQv6YKHBMwEaflIbl
Rv4IUqjJzmQPaA0KkRoAVgAKHxldaNqno/6G1FR2gwz+fr68p5WSYFlQ3axhvTjc
bBMJpB/fbp9WmpGJieTt6iMOI6V1pnCVjibM5ZON59WCFfytHGGpbYW05gtZEod4
d/3yRuU53JRSj3jQAQuF1B6qYhyxvv5YEtAQqIFeHaPZ67nL6agw09hE+TlXjWbE
rTQRShflQ+ydnzIfKicFgy6/53D5hq7iH2l7HwJVXbXRQ104T5DB/XHUUTr+UWQn
/Nkhov32/n6GjxQ=
=58I4
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- Fix ASPM link_state teardown on removal (Lukas Wunner)
- Fix misleading _OSC ASPM message (Sinan Kaya)
- Make _OSC optional for PCI (Sinan Kaya)
- Don't initialize ASPM link state when ACPI_FADT_NO_ASPM is set
(Patrick Talbert)
- Remove x86 and arm64 node-local allocation for host bridge structures
(Punit Agrawal)
- Pay attention to device-specific _PXM node values (Jonathan Cameron)
- Support new Immediate Readiness bit (Felipe Balbi)
- Differentiate between pciehp surprise and safe removal (Lukas Wunner)
- Remove unnecessary pciehp includes (Lukas Wunner)
- Drop pciehp hotplug_slot_ops wrappers (Lukas Wunner)
- Tolerate PCIe Slot Presence Detect being hardwired to zero to
workaround broken hardware, e.g., the Wilocity switch/wireless device
(Lukas Wunner)
- Unify pciehp controller & slot structs (Lukas Wunner)
- Constify hotplug_slot_ops (Lukas Wunner)
- Drop hotplug_slot_info (Lukas Wunner)
- Embed hotplug_slot struct into users instead of allocating it
separately (Lukas Wunner)
- Initialize PCIe port service drivers directly instead of relying on
initcall ordering (Keith Busch)
- Restore PCI config state after a slot reset (Keith Busch)
- Save/restore DPC config state along with other PCI config state
(Keith Busch)
- Reference count devices during AER handling to avoid race issue with
concurrent hot removal (Keith Busch)
- If an Upstream Port reports ERR_FATAL, don't try to read the Port's
config space because it is probably unreachable (Keith Busch)
- During error handling, use slot-specific reset instead of secondary
bus reset to avoid link up/down issues on hotplug ports (Keith Busch)
- Restore previous AER/DPC handling that does not remove and
re-enumerate devices on ERR_FATAL (Keith Busch)
- Notify all drivers that may be affected by error recovery resets
(Keith Busch)
- Always generate error recovery uevents, even if a driver doesn't have
error callbacks (Keith Busch)
- Make PCIe link active reporting detection generic (Keith Busch)
- Support D3cold in PCIe hierarchies during system sleep and runtime,
including hotplug and Thunderbolt ports (Mika Westerberg)
- Handle hpmemsize/hpiosize kernel parameters uniformly, whether slots
are empty or occupied (Jon Derrick)
- Remove duplicated include from pci/pcie/err.c and unused variable
from cpqphp (YueHaibing)
- Remove driver pci_cleanup_aer_uncorrect_error_status() calls (Oza
Pawandeep)
- Uninline PCI bus accessors for better ftracing (Keith Busch)
- Remove unused AER Root Port .error_resume method (Keith Busch)
- Use kfifo in AER instead of a local version (Keith Busch)
- Use threaded IRQ in AER bottom half (Keith Busch)
- Use managed resources in AER core (Keith Busch)
- Reuse pcie_port_find_device() for AER injection (Keith Busch)
- Abstract AER interrupt handling to disconnect error injection (Keith
Busch)
- Refactor AER injection callbacks to simplify future improvments
(Keith Busch)
- Remove unused Netronome NFP32xx Device IDs (Jakub Kicinski)
- Use bitmap_zalloc() for dma_alias_mask (Andy Shevchenko)
- Add switch fall-through annotations (Gustavo A. R. Silva)
- Remove unused Switchtec quirk variable (Joshua Abraham)
- Fix pci.c kernel-doc warning (Randy Dunlap)
- Remove trivial PCI wrappers for DMA APIs (Christoph Hellwig)
- Add Intel GPU device IDs to spurious interrupt quirk (Bin Meng)
- Run Switchtec DMA aliasing quirk only on NTB endpoints to avoid
useless dmesg errors (Logan Gunthorpe)
- Update Switchtec NTB documentation (Wesley Yung)
- Remove redundant "default n" from Kconfig (Bartlomiej Zolnierkiewicz)
- Avoid panic when drivers enable MSI/MSI-X twice (Tonghao Zhang)
- Add PCI support for peer-to-peer DMA (Logan Gunthorpe)
- Add sysfs group for PCI peer-to-peer memory statistics (Logan
Gunthorpe)
- Add PCI peer-to-peer DMA scatterlist mapping interface (Logan
Gunthorpe)
- Add PCI configfs/sysfs helpers for use by peer-to-peer users (Logan
Gunthorpe)
- Add PCI peer-to-peer DMA driver writer's documentation (Logan
Gunthorpe)
- Add block layer flag to indicate driver support for PCI peer-to-peer
DMA (Logan Gunthorpe)
- Map Infiniband scatterlists for peer-to-peer DMA if they contain P2P
memory (Logan Gunthorpe)
- Register nvme-pci CMB buffer as PCI peer-to-peer memory (Logan
Gunthorpe)
- Add nvme-pci support for PCI peer-to-peer memory in requests (Logan
Gunthorpe)
- Use PCI peer-to-peer memory in nvme (Stephen Bates, Steve Wise,
Christoph Hellwig, Logan Gunthorpe)
- Cache VF config space size to optimize enumeration of many VFs
(KarimAllah Ahmed)
- Remove unnecessary <linux/pci-ats.h> include (Bjorn Helgaas)
- Fix VMD AERSID quirk Device ID matching (Jon Derrick)
- Fix Cadence PHY handling during probe (Alan Douglas)
- Signal Cadence Endpoint interrupts via AXI region 0 instead of last
region (Alan Douglas)
- Write Cadence Endpoint MSI interrupts with 32 bits of data (Alan
Douglas)
- Remove redundant controller tests for "device_type == pci" (Rob
Herring)
- Document R-Car E3 (R8A77990) bindings (Tho Vu)
- Add device tree support for R-Car r8a7744 (Biju Das)
- Drop unused mvebu PCIe capability code (Thomas Petazzoni)
- Add shared PCI bridge emulation code (Thomas Petazzoni)
- Convert mvebu to use shared PCI bridge emulation (Thomas Petazzoni)
- Add aardvark Root Port emulation (Thomas Petazzoni)
- Support 100MHz/200MHz refclocks for i.MX6 (Lucas Stach)
- Add initial power management for i.MX7 (Leonard Crestez)
- Add PME_Turn_Off support for i.MX7 (Leonard Crestez)
- Fix qcom runtime power management error handling (Bjorn Andersson)
- Update TI dra7xx unaligned access errata workaround for host mode as
well as endpoint mode (Vignesh R)
- Fix kirin section mismatch warning (Nathan Chancellor)
- Remove iproc PAXC slot check to allow VF support (Jitendra Bhivare)
- Quirk Keystone K2G to limit MRRS to 256 (Kishon Vijay Abraham I)
- Update Keystone to use MRRS quirk for host bridge instead of open
coding (Kishon Vijay Abraham I)
- Refactor Keystone link establishment (Kishon Vijay Abraham I)
- Simplify and speed up Keystone link training (Kishon Vijay Abraham I)
- Remove unused Keystone host_init argument (Kishon Vijay Abraham I)
- Merge Keystone driver files into one (Kishon Vijay Abraham I)
- Remove redundant Keystone platform_set_drvdata() (Kishon Vijay
Abraham I)
- Rename Keystone functions for uniformity (Kishon Vijay Abraham I)
- Add Keystone device control module DT binding (Kishon Vijay Abraham
I)
- Use SYSCON API to get Keystone control module device IDs (Kishon
Vijay Abraham I)
- Clean up Keystone PHY handling (Kishon Vijay Abraham I)
- Use runtime PM APIs to enable Keystone clock (Kishon Vijay Abraham I)
- Clean up Keystone config space access checks (Kishon Vijay Abraham I)
- Get Keystone outbound window count from DT (Kishon Vijay Abraham I)
- Clean up Keystone outbound window configuration (Kishon Vijay Abraham
I)
- Clean up Keystone DBI setup (Kishon Vijay Abraham I)
- Clean up Keystone ks_pcie_link_up() (Kishon Vijay Abraham I)
- Fix Keystone IRQ status checking (Kishon Vijay Abraham I)
- Add debug messages for all Keystone errors (Kishon Vijay Abraham I)
- Clean up Keystone includes and macros (Kishon Vijay Abraham I)
- Fix Mediatek unchecked return value from devm_pci_remap_iospace()
(Gustavo A. R. Silva)
- Fix Mediatek endpoint/port matching logic (Honghui Zhang)
- Change Mediatek Root Port Class Code to PCI_CLASS_BRIDGE_PCI (Honghui
Zhang)
- Remove redundant Mediatek PM domain check (Honghui Zhang)
- Convert Mediatek to pci_host_probe() (Honghui Zhang)
- Fix Mediatek MSI enablement (Honghui Zhang)
- Add Mediatek system PM support for MT2712 and MT7622 (Honghui Zhang)
- Add Mediatek loadable module support (Honghui Zhang)
- Detach VMD resources after stopping root bus to prevent orphan
resources (Jon Derrick)
- Convert pcitest build process to that used by other tools (iio, perf,
etc) (Gustavo Pimentel)
* tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits)
PCI/AER: Refactor error injection fallbacks
PCI/AER: Abstract AER interrupt handling
PCI/AER: Reuse existing pcie_port_find_device() interface
PCI/AER: Use managed resource allocations
PCI: pcie: Remove redundant 'default n' from Kconfig
PCI: aardvark: Implement emulated root PCI bridge config space
PCI: mvebu: Convert to PCI emulated bridge config space
PCI: mvebu: Drop unused PCI express capability code
PCI: Introduce PCI bridge emulated config space common logic
PCI: vmd: Detach resources after stopping root bus
nvmet: Optionally use PCI P2P memory
nvmet: Introduce helper functions to allocate and free request SGLs
nvme-pci: Add support for P2P memory in requests
nvme-pci: Use PCI p2pmem subsystem to manage the CMB
IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]()
block: Add PCI P2P flag for request queue
PCI/P2PDMA: Add P2P DMA driver writer's documentation
docs-rst: Add a new directory for PCI documentation
PCI/P2PDMA: Introduce configfs/sysfs enable attribute helpers
PCI/P2PDMA: Add PCI p2pmem DMA mappings to adjust the bus offset
...
- Add PCI support for peer-to-peer DMA (Logan Gunthorpe)
- Add sysfs group for PCI peer-to-peer memory statistics (Logan
Gunthorpe)
- Add PCI peer-to-peer DMA scatterlist mapping interface (Logan
Gunthorpe)
- Add PCI configfs/sysfs helpers for use by peer-to-peer users (Logan
Gunthorpe)
- Add PCI peer-to-peer DMA driver writer's documentation (Logan
Gunthorpe)
- Add block layer flag to indicate driver support for PCI peer-to-peer
DMA (Logan Gunthorpe)
- Map Infiniband scatterlists for peer-to-peer DMA if they contain P2P
memory (Logan Gunthorpe)
- Register nvme-pci CMB buffer as PCI peer-to-peer memory (Logan
Gunthorpe)
- Add nvme-pci support for PCI peer-to-peer memory in requests (Logan
Gunthorpe)
- Use PCI peer-to-peer memory in nvme (Stephen Bates, Steve Wise,
Christoph Hellwig, Logan Gunthorpe)
* pci/peer-to-peer:
nvmet: Optionally use PCI P2P memory
nvmet: Introduce helper functions to allocate and free request SGLs
nvme-pci: Add support for P2P memory in requests
nvme-pci: Use PCI p2pmem subsystem to manage the CMB
IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]()
block: Add PCI P2P flag for request queue
PCI/P2PDMA: Add P2P DMA driver writer's documentation
docs-rst: Add a new directory for PCI documentation
PCI/P2PDMA: Introduce configfs/sysfs enable attribute helpers
PCI/P2PDMA: Add PCI p2pmem DMA mappings to adjust the bus offset
PCI/P2PDMA: Add sysfs group to display p2pmem stats
PCI/P2PDMA: Support peer-to-peer memory
net/sched/cls_api.c has overlapping changes to a call to
nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL
to the 5th argument, and another (from 'net-next') added cb->extack
instead of NULL to the 6th argument.
net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to
code which moved (to mr_table_dump)) in 'net-next'. Thanks to David
Ahern for the heads up.
Signed-off-by: David S. Miller <davem@davemloft.net>
Ok, so last week I thought we had sent our final pull request for 4.19.
Well, wouldn't ya know someone went and found a couple Spectre v1 fixes
were needed :-/. So, a couple *very* small specter patches for this
(hopefully) final -rc week.
Signed-off-by: Doug Ledford <dledford@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbyR/5AAoJELgmozMOVy/djDQP/2XYToI8u4n35IpylwLnd9OC
p+GogThE7MoqxxizZbSdXqVmkPvhMBiSsv1cKXEhuyt2KnF8xwvJwjgv6X6bJv3E
yesuG2odbiyNQ5K0avgWQHey20HIbH8My+u2S5OyphjI36bY1oiKiKmb+rJ9AT6B
2fumq7jiEB732lBuMnPniwKZQO+LUMSD9EVRKU4mprN7MlRWt9mLJzdu1GZ8eQJZ
bE1ncPCrjkJTr9osGrDbkHhuBkFVI1kU6duZki8wkkmwwxzyuibbQ6SQsWBDvMRL
qIdbRz4V17cG0YTjaaD6zb226o80zFT79e1MY9dk1/yJ2msnE1N7qSf25a0xEiWj
XQaETabfNBCVclldSL2EenM9S4NkKpnsiUnxH0TxAW3tOY/Hi37Ouv8EtM0Iq/U6
Ns60U4Cy1TDNY+L9sE4yjrfL0mnwBYf5N3qhktFpo7lk76s+4KmiiQjeTuwXQleV
cYXzkJCTV9d2YJLUCVJieZIZx5xMtPAHkrokjjgztVLPmm9peM+xet1JugoGG6ss
DFhVD7vHt23a1u/R2ETmvAxUHDm8acKaaJTYIhh8889N6pH+vnVvZL0WQUPoiQrr
wOTrT/R7Dm0LZERM69T/PhOaFbhRkBHppaaZaQE6Lype2H6A7i+hLDTZCg9IWV9f
gHlBSApUr8vutAyiieXE
=ZHZX
-----END PGP SIGNATURE-----
Merge tag 'for-gkh' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Doug writes:
"Really final for-rc pull request for 4.19
Ok, so last week I thought we had sent our final pull request for
4.19. Well, wouldn't ya know someone went and found a couple Spectre
v1 fixes were needed :-/. So, a couple *very* small specter patches
for this (hopefully) final -rc week."
* tag 'for-gkh' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/ucma: Fix Spectre v1 vulnerability
IB/ucm: Fix Spectre v1 vulnerability
Take struct mlx5_frag_buf out of mlx5_frag_buf_ctrl, as it is not
needed to manage and control the datapath of the fragmented buffers API.
struct mlx5_frag_buf contains control info to manage the allocation
and de-allocation of the fragmented buffer.
Its fields are not relevant for datapath, so here I take them out of the
struct mlx5_frag_buf_ctrl, except for the fragments array itself.
In addition, modified mlx5_fill_fbc to initialise the frags pointers
as well. This implies that the buffer must be allocated before the
function is called.
A set of type-specific *_get_byte_size() functions are replaced by
a generic one.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
If no-append flag is set, we will add a new FTE, instead of appending
the actions of the inserted rule when the same match already exists.
While here, move the has_flow_tag boolean indicator to be a flag too.
This patch doesn't change any functionality.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanmox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently, when a flow rule is created using the FS core layer, the caller
has to pass the entire flow counter object and not just the counter HW
handle (ID). This requires both the FS core and the caller to have
knowledge about the inner implementation of the FS layer flow counters
cache and limits the possible users.
Move to use the counter ID across the place when dealing with flows.
Doing this decoupling, now can we privatize the inner implementation
of the flow counters.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
mlx5 updates for both net-next and rdma-next
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: (21 commits)
net/mlx5: Expose DC scatter to CQE capability bit
net/mlx5: Update mlx5_ifc with DEVX UID bits
net/mlx5: Set uid as part of DCT commands
net/mlx5: Set uid as part of SRQ commands
net/mlx5: Set uid as part of SQ commands
net/mlx5: Set uid as part of RQ commands
net/mlx5: Set uid as part of QP commands
net/mlx5: Set uid as part of CQ commands
net/mlx5: Rename incorrect naming in IFC file
net/mlx5: Export packet reformat alloc/dealloc functions
net/mlx5: Pass a namespace for packet reformat ID allocation
net/mlx5: Expose new packet reformat capabilities
{net, RDMA}/mlx5: Rename encap to reformat packet
net/mlx5: Move header encap type to IFC header file
net/mlx5: Break encap/decap into two separated flow table creation flags
net/mlx5: Add support for more namespaces when allocating modify header
net/mlx5: Export modify header alloc/dealloc functions
net/mlx5: Add proper NIC TX steering flow tables support
net/mlx5: Cleanup flow namespace getter switch logic
net/mlx5: Add memic command opcode to command checker
...
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In order to use PCI P2P memory the pci_p2pmem_map_sg() function must be
called to map the correct PCI bus address.
To do this, check the first page in the scatter list to see if it is P2P
memory or not. At the moment, scatter lists that contain P2P memory must
be homogeneous so if the first page is P2P the entire SGL should be P2P.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Extended atomic operations cmp&swp and fetch&add is a Mellanox
feature extending the standard atomic operation to use, varied
operand sizes, as apposed to normal atomic operation that use
an 8 byte operand only.
Extended atomics allows masking the results and arguments.
This patch configures QP to support extended atomic operation
with the maximum size possible, as exposed by HCA capabilities.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When add_port() is done for port == 0, it indicates that ports hardware
counters initialization should be skipped. Reflect so in the comment.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
ib_register_device() does several allocation and initialization
steps. Split it into smaller more readable functions for easy
review and maintenance.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
If port pkey list initialization fails, free the port_immutable memory
during cleanup path. Currently it is missed out.
If cache setup fails, free the pkey list during cleanup path.
Fixes: d291f1a65 ("IB/core: Enforce PKey security on QPs")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The WARN_ON() is pointless as the rport is placed in SDEV_TRANSPORT_OFFLINE
at that time, so no new commands can be submitted via srp_queuecommand()
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com>
Acked-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Requester scatter to CQE is restricted to QPs configured to signal
all WRs.
This patch adds ability to enable scatter to cqe (force enable)
in the requester without sig_all, for users who do not want all WRs
signaled but rather just the ones whose data found in the CQE.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Flags sent down from user might not be supported by
running driver.
This might lead to unwanted bugs.
To solve this, added macro to test for unsupported flags.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Scatter to CQE is a HW offload that saves PCI writes by scattering the
payload to the CQE.
This patch extends already existing functionality to support DC
transport type.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Use rdma_set_device_sysfs_group() to register device attributes and
simplify the driver.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>