linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 07:35:20 +07:00

Author	SHA1	Message	Date
Petr Machata	419476624c	mlxsw: reg: Add MLXSW_REG_MPAT_SPAN_TYPE_REMOTE_ETH Add MLXSW_REG_MPAT_SPAN_TYPE_REMOTE_ETH to support VLAN-encapsulated port mirroring. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-10 17:50:52 -04:00
Tariq Toukan	e57328384b	net/mlx4_core: Use msi_x module param to limit num of MSI-X irqs Extend the boolean interpretation of msi_x module parameter to numerical, as follows: 0 - Don't use MSI-X. 1 - Use MSI-X, driver decides the num of MSI-X irqs. >=2 - Use MSI-X, limit number of MSI-X irqs to msi_x. In SRIOV, this limits the number of MSI-X irqs per VF. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com> Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-10 16:08:25 -04:00
Yishai Hadas	86a3e5d02c	net/mlx4_core: Add PCI calls for suspend/resume Implement suspend/resume callbacks in struct pci_driver. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-10 16:08:25 -04:00
Eran Ben Elisha	e5c9a70545	net/mlx4_core: Report driver version to FW If supported, write a driver version string to FW as part of the INIT_HCA command. Example of driver version: "Linux,mlx4_core,4.0-0" Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-10 16:08:25 -04:00
David S. Miller	db1617a11a	mlx5-updates-2018-05-07 mlx5 core driver misc cleanups and updates: - fix spelling mistake: "modfiy" -> "modify" - Cleanup unused field in Work Queue parameters - dump_command mailbox length printed - Refactor num of blocks in mailbox calculation - Decrease level of prints about non-existent MKEY - remove some extraneous spaces in indentations -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJa8Lh4AAoJEEg/ir3gV/o+0gwH/1nJwrnIfFkFgjgN6TzNFhIW P2P1FwkuwiAiOnrJOr+r1VX17BSSsjZ9cBxd2jRZOHqVbrLfhS5vkY5Lul9wWOdu u8h8JvL1qXCAqilxH+SLeKyZYjDVTIDreUIOFXuuthcz+HcF41NbvzM/9WsVj9kQ cx8ineozMrx85PxSxJIKTvsNpaJc0A0GR1+tKKn3R8djgID8BpCAgXjW3XCJ/Ou9 CoPscduksdLJ6enjqI0JZPZljgi+l4D33819Mqz1yeGpCLFATRaEXe6rLQO0U+zD n1vSZytgBvdpmADUdnCLL9BssWISIjZQzzCIjQxoSERTKFFKMtTCGRMqJXgOS7Y= =2Lzw -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2018-05-07' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-updates-2018-05-07 mlx5 core driver misc cleanups and updates: - fix spelling mistake: "modfiy" -> "modify" - Cleanup unused field in Work Queue parameters - dump_command mailbox length printed - Refactor num of blocks in mailbox calculation - Decrease level of prints about non-existent MKEY - remove some extraneous spaces in indentations ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-10 08:17:55 -04:00
Colin Ian King	a8408f4e6d	net/mlx5: fix spelling mistake: "modfiy" -> "modify" Trivial fix to spelling mistake in netdev_warn warning message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-05-04 12:11:51 -07:00
Tariq Toukan	6fa242af73	net/mlx5: Cleanup unused field in Work Queue parameters Remove the 'linear' field from struct mlx5_wq_param. It is redundant, set but never read. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-05-04 12:11:51 -07:00
Moshe Shemesh	12b996d16b	net/mlx5: Fix dump_command mailbox length printed Dump command mailbox length printed was correct only if data_only flag was set. For the case that data_only flag was clear the offset to stop printing at was wrong and so the buffer printed was too short. Changed the print loop to stop according to number of buffers in mailbox. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-05-04 12:11:51 -07:00
Moshe Shemesh	ed644fab19	net/mlx5: Refactor num of blocks in mailbox calculation Get the logic that calculates the number of blocks in a command mailbox into a dedicated function. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-05-04 12:11:51 -07:00
Leon Romanovsky	d034294e3c	net/mlx5: Decrease level of prints about non-existent MKEY User-controlled application can cause multiple prints as below to flood dmesg. Since knowledge of failed MKey release is important for debug, let's decrease its level to debug. mlx5_core 0000:00:04.0: mlx5_core_destroy_mkey:127:(pid 2352): failed radix tree delete of mkey 0x1ed700 Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-05-04 12:11:51 -07:00
Eric Dumazet	2d943adf86	net/mlx4_en: optimizes get_fixed_ipv6_csum() While trying to support CHECKSUM_COMPLETE for IPV6 fragments, I had to experiments various hacks in get_fixed_ipv6_csum(). I must admit I could not find how to implement this :/ However, get_fixed_ipv6_csum() does a lot of redundant operations, calling csum_partial() twice. First csum_partial() computes the checksum of saddr and daddr, put in @csum_pseudo_hdr. Undone later in the second csum_partial() computed on whole ipv6 header. Then nexthdr is added once, added a second time, then substracted. payload_len is added once, then substracted. Really all this can be reduced to two add_csum(), to add back 6 bytes that were removed by mlx4 when providing hw_checksum in RX descriptor. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-04 11:59:19 -04:00
David S. Miller	a7b15ab887	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Overlapping changes in selftests Makefile. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-04 09:58:56 -04:00
Petr Machata	816a3bed95	switchdev: Add fdb.added_by_user to switchdev notifications The following patch enables sending notifications also for events on FDB entries that weren't added by the user. Give the drivers the information necessary to distinguish between the two origins of FDB entries. To maintain the current behavior, have switchdev-implementing drivers bail out on notifications about non-user-added FDB entries. In case of mlxsw driver, allow a call to mlxsw_sp_span_respin() so that SPAN over bridge catches up with the changed FDB. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-03 13:46:47 -04:00
Jiri Pirko	41107685b9	mlxsw: pci: Check number of CQEs for CQE version 2 Check number of CQEs for CQE version 2 reported by QUERY_AQ_CAP command. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-03 13:44:43 -04:00
Jiri Pirko	8404f6f2e8	mlxsw: pci: Allow to use CQEs of version 1 and version 2 Use previously added resources to query FW support for multiple versions of CQEs. Use the biggest version supported. For SDQs, it has no sense to use version 2 as it does not introduce any new features, but it is twice the size of CQE version 1. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-03 13:44:43 -04:00
Jiri Pirko	b76550bbed	mlxsw: pci: Introduce helpers to work with multiple CQE versions Introduce definitions of fields in CQE version 1 and 2. Also, introduce common helpers that would call appropriate version-specific helpers according to the version enum passed. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-03 13:44:42 -04:00
Jiri Pirko	9b934a3bdc	mlxsw: resources: Add CQE versions resources Add resources that FW uses to report supported CQE versions. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-03 13:44:42 -04:00
Colin Ian King	4e11581c27	net/mlx5e: fix spelling mistake: "loobpack" -> "loopback" Trivial fix to spelling mistake in netdev_err error message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-03 12:58:17 -04:00
Ido Schimmel	50d10711cf	mlxsw: spectrum_router: Return an error for routes added after abort We currently do not perform accounting in the driver and thus can't reject routes before resources are exceeded. However, in order to make users aware of the fact that routes are no longer offloaded we can return an error for routes configured after the abort mechanism was triggered. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-02 13:15:17 -04:00
Ido Schimmel	6290182b2b	mlxsw: spectrum_router: Return an error for non-default FIB rules Since commit `9776d32537` ("net: Move call_fib_rule_notifiers up in fib_nl_newrule") it is possible to forbid the installation of unsupported FIB rules. Have mlxsw return an error for non-default FIB rules in addition to the existing extack message. Example: # ip rule add from 198.51.100.1 table 10 Error: mlxsw_spectrum: FIB rules not supported. Note that offload is only aborted when non-default FIB rules are already installed and merely replayed during module initialization. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-02 13:15:17 -04:00
Colin Ian King	26ff75857e	net/mlx4: fix spelling mistake: "failedi" -> "failed" trivial fix to spelling mistake in mlx4_warn message. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-01 14:17:29 -04:00
Ilya Lesokhin	43585a41bd	net/mlx5e: TLS, Add error statistics Add statistics for rare TLS related errors. Since the errors are rare we have a counter per netdev rather then per SQ. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-01 09:42:48 -04:00
Ilya Lesokhin	bf23974104	net/mlx5e: TLS, Add Innova TLS TX offload data path Implement the TLS tx offload data path according to the requirements of the TLS generic NIC offload infrastructure. Special metadata ethertype is used to pass information to the hardware. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-01 09:42:47 -04:00
Ilya Lesokhin	c83294b9ef	net/mlx5e: TLS, Add Innova TLS TX support Add NETIF_F_HW_TLS_TX capability and expose tlsdev_ops to work with the TLS generic NIC offload infrastructure. The NETIF_F_HW_TLS_TX capability will be added in the next patch. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-01 09:42:47 -04:00
Ilya Lesokhin	1ae1732284	net/mlx5: Accel, Add TLS tx offload interface Add routines for manipulating TLS TX offload contexts. In Innova TLS, TLS contexts are added or deleted via a command message over the SBU connection. The HW then sends a response message over the same connection. Add implementation for Innova TLS (FPGA-based) hardware. These routines will be used by the TLS offload support in a later patch mlx5/accel is a middle acceleration layer to allow mlx5e and other ULPs to work directly with mlx5_core rather than Innova FPGA or other mlx5 acceleration providers. In the future, when IPSec/TLS or any other acceleration gets integrated into ConnectX chip, mlx5/accel layer will provide the integrated acceleration, rather than the Innova one. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-01 09:42:47 -04:00
Ilya Lesokhin	bb9094161b	net/mlx5e: Move defines out of ipsec code The defines are not IPSEC specific. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-01 09:42:47 -04:00
Petr Machata	946a11e740	mlxsw: spectrum_span: Allow bridge for gretap mirror When handling mirroring to a gretap or ip6gretap netdevice in mlxsw, the underlay address (i.e. the remote address of the tunnel) may be routed to a bridge. In that case, look up the resolved neighbor Ethernet address in that bridge's FDB. Then configure the offload to direct the mirrored traffic to that port, possibly with tagging. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-30 12:42:40 -04:00
Petr Machata	c520bc6986	mlxsw: Respin SPAN on switchdev events Changes to switchdev artifact can make a SPAN entry offloadable or unoffloadable. To that end: - Listen to SWITCHDEV_FDB__TO_BRIDGE notifications in addition to the _TO_DEVICE ones, to catch whatever activity is sent to the bridge (likely by mlxsw itself). On each FDB notification, respin SPAN to reconcile it with the FDB changes. - Also respin on switchdev port attribute changes (which currently covers changes to STP state of ports) and port object additions and removals. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-30 12:42:40 -04:00
Petr Machata	cda880de93	mlxsw: spectrum: Register SPAN before switchdev Since switchdev events can trigger SPAN respin, it is necessary that the data structures are available. Register SPAN first, with a commentary on what the dependencies are. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-30 12:42:40 -04:00
Petr Machata	ea93c7b608	mlxsw: spectrum_switchdev: Publish two functions Publish the existing function mlxsw_sp_bridge_port_find(), and add another service accessor mlxsw_sp_bridge_port_stp_state(). Publish both in a new file spectrum_switchdev.h. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-30 12:42:40 -04:00
Petr Machata	541e11595c	mlxsw: spectrum: Extract mlxsw_sp_stp_spms_state() Instead of duplicating the decision regarding port forwarding state made by mlxsw_sp_port_vid_stp_set(), extract the decision-making into a new function and reuse. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-30 12:42:40 -04:00
Alexander Duyck	b86629ebd5	mlx4: Don't bother using skb_tx_hash in mlx4_en_select_queue The code in the fallback path has supported XDP in conjunction with the Tx traffic classification for TCs for over a year now. So instead of just calling skb_tx_hash for every packet we are better off using the fallback since that will record the Tx queue to the socket and then that can be used instead of having to recompute the hash every time. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-29 22:01:32 -04:00
David S. Miller	e8e9608116	mlx5-fixes-2018-04-25 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJa4ixLAAoJEEg/ir3gV/o+PB0IAKlexiYEyID3N3cvG/I0m4Zh yxIN4H8sUu3kz7dG6dkvpL5Oo5c43/huE+tspOYfFmydMWvOV+DakwqAhkE+KUfe VIpF0cM7+T8Els8e7OzuT0Zu5ggeN1wU0uPRhAn1F592BH0ppXGBn8WIKTffYb8c 5XeB2JSZyw4yMgk1zurm4tVtFvHbYO7SAkLZZG5E0m7EeIujVWi1lTnhXNi9zdm/ 48LoLZ/1Rmx0e/Qpey2fm9HEPRPTgCNSBLEsx2hIDiJG56YyWPH6+N7U9Acf2PaI lKu2JMqLYe3du8hhPtCbPYH0i74af/LbNCQgXXgPstAI49v+MydlBcv2NX9J6NY= =nw/Q -----END PGP SIGNATURE----- Merge tag 'mlx5-fixes-2018-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2018-04-26 This pull request includes fixes for mlx5 core and netdev driver. Please pull and let me know if there's any problems. For -stable v4.12 net/mlx5e: TX, Use correct counter in dma_map error flow For -stable v4.13 net/mlx5: Avoid cleaning flow steering table twice during error flow For -stable v4.14 net/mlx5e: Allow offloading ipv4 header re-write for icmp For -stable v4.15 net/mlx5e: DCBNL fix min inline header size for dscp For -stable v4.16 net/mlx5: Fix mlx5_get_vector_affinity function ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-27 14:30:38 -04:00
Ido Schimmel	c7f46cca8c	mlxsw: spectrum_switchdev: Do not remove mrouter port from MDB's ports list When IGMP snooping is enabled on a bridge, traffic forwarded by an MDB entry should be sent to both ports member in the MDB's ports list and mrouter ports. In case a port needs to be removed from an MDB's ports list, but this port is also configured as an mrouter port, then do not update the device so that it will continue to forward traffic through that port. Fix a copy-paste error that checked that IGMP snooping is enabled twice instead of checking the port's mrouter state. Fixes: `ded711c87a` ("mlxsw: spectrum_switchdev: Consider mrouter status for mdb changes") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Colin King <colin.king@canonical.com> Reviewed-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-27 13:44:50 -04:00
Chris Mi	202854e9f4	net/mlx5: Properly deal with flow counters when deleting rules When deleting a flow counter, the modify mask should be the action and the flow counter. Otherwise the flow counter is not deleted and we'll get a firmware warning when deleting the remaining destinations on the same FTE. It only happens in the presence of flow counter and multiple vport destinations. If there is only one vport destination, there is no need to update the FTE when deleting the only vport destination, we just delete the FTE. Fixes: `ae05831424` ("net/mlx5: Add option to add fwd rule with counter") Signed-off-by: Chris Mi <chrism@mellanox.com> Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-04-26 12:43:21 -07:00
Shahar Klein	99beaa22f1	net/mlx5e: Fix traffic between VF and representor After the cited commit, WQE RQ size is calculated based on sw_mtu but it was not set for representors. This commit fixes that. Fixes: `472a1e44b3` ("net/mlx5e: Save MTU in channels params") Signed-off-by: Shahar Klein <shahark@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-04-26 12:43:21 -07:00
Talat Batheesh	9c26f5f89d	net/mlx5: Avoid cleaning flow steering table twice during error flow When we fail to initialize the RX root namespace, we need to clean only that and not the entire flow steering. Currently the code may try to clean the flow steering twice on error witch leads to null pointer deference. Make sure we clean correctly. Fixes: `fba53f7b57` ("net/mlx5: Introduce mlx5_flow_steering structure") Signed-off-by: Talat Batheesh <talatb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-04-26 12:43:20 -07:00
Tariq Toukan	d9a96ec362	net/mlx5e: TX, Use correct counter in dma_map error flow In case of a dma_mapping_error, do not use wi->num_dma as a parameter for dma unmap function because it's yet to be set, and holds an out-of-date value. Use actual value (local variable num_dma) instead. Fixes: `34802a42b3` ("net/mlx5e: Do not modify the TX SKB") Fixes: `e586b3b0ba` ("net/mlx5: Ethernet Datapath files") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-04-26 12:43:20 -07:00
Huy Nguyen	35f80acb24	net/mlx5e: DCBNL fix min inline header size for dscp When the trust state is set to dscp and the netdev is down, the inline header size is not updated. When netdev is up, the inline header size stays at L2 instead of IP. Fix this issue by updating the private parameter when the netdev is in down so that when netdev is up, it picks up the right header size. Fixes: `fbcb127e89` ("net/mlx5e: Support DSCP trust state ...") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-04-26 12:43:18 -07:00
Jianbo Liu	1ccef350db	net/mlx5e: Allow offloading ipv4 header re-write for icmp For ICMPv4, the checksum is calculated from the ICMP headers and data. Since the ICMPv4 checksum doesn't cover the IP header, we can allow to do L3 header re-write for this protocol. Fixes: `bdd66ac0ae` ('net/mlx5e: Disallow TC offloading of unsupported match/action combinations') Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-04-26 12:43:18 -07:00
Tal Gilboa	cbce4f4447	net/mlx5e: Enable adaptive-TX moderation Add support for adaptive TX moderation. This greatly reduces TX interrupt rate and increases bandwidth, mostly for TCP bandwidth over ARM architecture (below). There is a slight single stream TCP with very large message sizes degradation (x86). In this case if there's any moderation on transmitted packets the bandwidth would reduce due to hitting TCP output limit. Since this is a synthetic case, this is still worth doing. Performance improvement (ConnectX-4Lx 40GbE, ARM) TCP 64B bandwidth with 1-50 streams increased 6-35%. TCP 64B bandwidth with 100-500 streams increased 20-70%. Performance improvement (ConnectX-5 100GbE, x86) Bandwidth: increased up to 40% (1024B with 10s of streams). Interrupt rate: reduced up to 50% (1024B with 1000s of streams). Performance degradation (ConnectX-5 100GbE, x86) Bandwidth: up to 10% decrease single stream TCP (1MB message size from 51Gb/s to 47Gb/s). Signed-off-by: Tal Gilboa <talgi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-24 10:15:08 -04:00
Tal Gilboa	026a807c2d	net/dim: Rename _get_profile() functions to _get_rx_moderation() Preparation for introducing adaptive TX to net DIM. Signed-off-by: Tal Gilboa <talgi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-24 10:15:07 -04:00
David S. Miller	1b80f86ed6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-04-21 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Initial work on BPF Type Format (BTF) is added, which is a meta data format which describes the data types of BPF programs / maps. BTF has its roots from CTF (Compact C-Type format) with a number of changes to it. First use case is to provide a generic pretty print capability for BPF maps inspection, later work will also add BTF to bpftool. pahole support to convert dwarf to BTF will be upstreamed as well (https://github.com/iamkafai/pahole/tree/btf), from Martin. 2) Add a new xdp_bpf_adjust_tail() BPF helper for XDP that allows for changing the data_end pointer. Only shrinking is currently supported which helps for crafting ICMP control messages. Minor changes in drivers have been added where needed so they recalc the packet's length also when data_end was adjusted, from Nikita. 3) Improve bpftool to make it easier to feed hex bytes via cmdline for map operations, from Quentin. 4) Add support for various missing BPF prog types and attach types that have been added to kernel recently but neither to bpftool nor libbpf yet. Doc and bash completion updates have been added as well for bpftool, from Andrey. 5) Proper fix for avoiding to leak info stored in frame data on page reuse for the two bpf_xdp_adjust_{head,meta} helpers by disallowing to move the pointers into struct xdp_frame area, from Jesper. 6) Follow-up compile fix from BTF in order to include stdbool.h in libbpf, from Björn. 7) Few fixes in BPF sample code, that is, a typo on the netdevice in a comment and fixup proper dump of XDP action code in the tracepoint exception, from Wang and Jesper. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-21 15:56:15 -04:00
David Ahern	93c2fb253d	net/ipv6: Rename fib6_info struct elements Change the prefix for fib6_info struct elements from rt6i_ to fib6_. rt6i_pcpu and rt6i_exception_bucket are left as is given that they point to rt6_info entries. Rename only; not functional change intended. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-19 15:40:12 -04:00
Nikita V. Shirokov	e5e0a59b7c	bpf: make mlx4 compatible w/ bpf_xdp_adjust_tail w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as well (only "decrease" of pointer's location is going to be supported). changing of this pointer will change packet's size. for mlx4 driver we will just calculate packet's length unconditionally (the same way as it's already being done in mlx5) Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-18 23:34:16 +02:00
David Ahern	8d1c802b28	net/ipv6: Flip FIB entries to fib6_info Convert all code paths referencing a FIB entry from rt6_info to fib6_info. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-17 23:41:18 -04:00
David Ahern	5e670d844b	net/ipv6: Move nexthop data to fib6_nh Introduce fib6_nh structure and move nexthop related data from rt6_info and rt6_info.dst to fib6_nh. References to dev, gateway or lwtstate from a FIB lookup perspective are converted to use fib6_nh; datapath references to dst version are left as is. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-17 23:41:16 -04:00
Jesper Dangaard Brouer	039930945a	xdp: transition into using xdp_frame for return API Changing API xdp_return_frame() to take struct xdp_frame as argument, seems like a natural choice. But there are some subtle performance details here that needs extra care, which is a deliberate choice. When de-referencing xdp_frame on a remote CPU during DMA-TX completion, result in the cache-line is change to "Shared" state. Later when the page is reused for RX, then this xdp_frame cache-line is written, which change the state to "Modified". This situation already happens (naturally) for, virtio_net, tun and cpumap as the xdp_frame pointer is the queued object. In tun and cpumap, the ptr_ring is used for efficiently transferring cache-lines (with pointers) between CPUs. Thus, the only option is to de-referencing xdp_frame. It is only the ixgbe driver that had an optimization, in which it can avoid doing the de-reference of xdp_frame. The driver already have TX-ring queue, which (in case of remote DMA-TX completion) have to be transferred between CPUs anyhow. In this data area, we stored a struct xdp_mem_info and a data pointer, which allowed us to avoid de-referencing xdp_frame. To compensate for this, a prefetchw is used for telling the cache coherency protocol about our access pattern. My benchmarks show that this prefetchw is enough to compensate the ixgbe driver. V7: Adjust for commit `d9314c474d` ("i40e: add support for XDP_REDIRECT") V8: Adjust for commit `bd658dda42` ("net/mlx5e: Separate dma base address and offset in dma_sync call") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-17 10:50:29 -04:00
Jesper Dangaard Brouer	60bbf7eeef	mlx5: use page_pool for xdp_return_frame call This patch shows how it is possible to have both the driver local page cache, which uses elevated refcnt for "catching"/avoiding SKB put_page returns the page through the page allocator. And at the same time, have pages getting returned to the page_pool from ndp_xdp_xmit DMA completion. The performance improvement for XDP_REDIRECT in this patch is really good. Especially considering that (currently) the xdp_return_frame API and page_pool_put_page() does per frame operations of both rhashtable ID-lookup and locked return into (page_pool) ptr_ring. (It is the plan to remove these per frame operation in a followup patchset). The benchmark performed was RX on mlx5 and XDP_REDIRECT out ixgbe, with xdp_redirect_map (using devmap) . And the target/maximum capability of ixgbe is 13Mpps (on this HW setup). Before this patch for mlx5, XDP redirected frames were returned via the page allocator. The single flow performance was 6Mpps, and if I started two flows the collective performance drop to 4Mpps, because we hit the page allocator lock (further negative scaling occurs). Two test scenarios need to be covered, for xdp_return_frame API, which is DMA-TX completion running on same-CPU or cross-CPU free/return. Results were same-CPU=10Mpps, and cross-CPU=12Mpps. This is very close to our 13Mpps max target. The reason max target isn't reached in cross-CPU test, is likely due to RX-ring DMA unmap/map overhead (which doesn't occur in ixgbe to ixgbe testing). It is also planned to remove this unnecessary DMA unmap in a later patchset V2: Adjustments requested by Tariq - Changed page_pool_create return codes not return NULL, only ERR_PTR, as this simplifies err handling in drivers. - Save a branch in mlx5e_page_release - Correct page_pool size calc for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ V5: Updated patch desc V8: Adjust for `b0cedc844c` ("net/mlx5e: Remove rq_headroom field from params") V9: - Adjust for `121e892754` ("net/mlx5e: Refactor RQ XDP_TX indication") - Adjust for `73281b78a3` ("net/mlx5e: Derive Striding RQ size from MTU") - Correct handling if page_pool_create fail for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ V10: Req from Tariq - Change pool_size calc for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-17 10:50:29 -04:00
Jesper Dangaard Brouer	ff7d6b27f8	page_pool: refurbish version of page_pool code Need a fast page recycle mechanism for ndo_xdp_xmit API for returning pages on DMA-TX completion time, which have good cross CPU performance, given DMA-TX completion time can happen on a remote CPU. Refurbish my page_pool code, that was presented[1] at MM-summit 2016. Adapted page_pool code to not depend the page allocator and integration into struct page. The DMA mapping feature is kept, even-though it will not be activated/used in this patchset. [1] http://people.netfilter.org/hawk/presentations/MM-summit2016/generic_page_pool_mm_summit2016.pdf V2: Adjustments requested by Tariq - Changed page_pool_create return codes, don't return NULL, only ERR_PTR, as this simplifies err handling in drivers. V4: many small improvements and cleanups - Add DOC comment section, that can be used by kernel-doc - Improve fallback mode, to work better with refcnt based recycling e.g. remove a WARN as pointed out by Tariq e.g. quicker fallback if ptr_ring is empty. V5: Fixed SPDX license as pointed out by Alexei V6: Adjustments requested by Eric Dumazet - Adjust ____cacheline_aligned_in_smp usage/placement - Move rcu_head in struct page_pool - Free pages quicker on destroy, minimize resources delayed an RCU period - Remove code for forward/backward compat ABI interface V8: Issues found by kbuild test robot - Address sparse should be static warnings - Only compile+link when a driver use/select page_pool, mlx5 selects CONFIG_PAGE_POOL, although its first used in two patches Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-17 10:50:29 -04:00

1 2 3 4 5 ...

3695 Commits