Commit 5b4c4d3686 "mlx4_en: Allow communication between functions on
same host" introduced a regression under which a bridge acting as vSwitch
whose uplink is an mlx4 Ethernet device becomes non-operative in native
(non-SRIOV) mode. This happens because broadcast ARP requests sent by VMs
were looped back by the HW, and hence the bridge learned VM source MACs
on both the VM and the uplink ports.
The fix is to place the DMAC in the send WQE only under SRIOV/eSwitch
configuration or when the device is in selftest.
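For illustration, a minimal sketch of that gating in mlx4_en_xmit(),
assuming the mlx4_is_mfunc() helper and the priv->validate_loopback flag
from the mlx4 tree (the descriptor fields shown follow the driver's
conventions):

	/* Place the DMAC in the WQE only when the eSwitch (multi-function/
	 * SRIOV) may loop the packet back, or during selftest; in plain
	 * native mode leave it out so broadcasts aren't looped back.
	 */
	if (mlx4_is_mfunc(priv->mdev->dev) || priv->validate_loopback) {
		struct ethhdr *ethh = (struct ethhdr *)skb->data;

		tx_desc->ctrl.srcrb_flags16[0] =
			get_unaligned((__be16 *)ethh->h_dest);
		tx_desc->ctrl.imm =
			get_unaligned((__be32 *)(ethh->h_dest + 2));
	}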
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband update from Roland Dreier:
"First batch of InfiniBand/RDMA changes for the 3.8 merge window:
- A good chunk of Bart Van Assche's SRP fixes
- UAPI disintegration from David Howells
- mlx4 support for "64-byte CQE" hardware feature from Or Gerlitz
- Other miscellaneous fixes"
Fix up trivial conflict in mellanox/mlx4 driver.
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (33 commits)
RDMA/nes: Fix for crash when registering zero length MR for CQ
RDMA/nes: Fix for terminate timer crash
RDMA/nes: Fix for BUG_ON due to adding already-pending timer
IB/srp: Allow SRP disconnect through sysfs
srp_transport: Document sysfs attributes
srp_transport: Simplify attribute initialization code
srp_transport: Fix attribute registration
IB/srp: Document sysfs attributes
IB/srp: send disconnect request without waiting for CM timewait exit
IB/srp: destroy and recreate QP and CQs when reconnecting
IB/srp: Eliminate state SRP_TARGET_DEAD
IB/srp: Introduce the helper function srp_remove_target()
IB/srp: Suppress superfluous error messages
IB/srp: Process all error completions
IB/srp: Introduce srp_handle_qp_err()
IB/srp: Simplify SCSI error handling
IB/srp: Keep processing commands during host removal
IB/srp: Eliminate state SRP_TARGET_CONNECTING
IB/srp: Increase block layer timeout
RDMA/cm: Change return value from find_gid_port()
...
Add support for changing the number of rx/tx channels using
ethtool ('ethtool -[lL]'). Note that the number of tx channels specified
in ethtool is the number of rings per user priority - not the total number
of tx rings.
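As a sketch, the reporting side of the new hooks could look like this
(ring-count field names such as tx_ring_num and MLX4_EN_NUM_UP follow the
driver's style but are illustrative here):

	static void mlx4_en_get_channels(struct net_device *dev,
					 struct ethtool_channels *channel)
	{
		struct mlx4_en_priv *priv = netdev_priv(dev);

		memset(channel, 0, sizeof(*channel));
		channel->max_rx = MAX_RX_RINGS;
		/* tx is reported/set per user priority, not in total */
		channel->max_tx = MLX4_EN_MAX_TX_RING_P_UP;
		channel->rx_count = priv->rx_ring_num;
		channel->tx_count = priv->tx_ring_num / MLX4_EN_NUM_UP;
	}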
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ConnectX-3 devices can use either 64- or 32-byte completion queue
entries (CQEs) and event queue entries (EQEs). Using 64-byte
EQEs/CQEs performs better because each entry is aligned to a complete
cacheline. This patch queries the HCA's capabilities, and if it
supports 64-byte CQEs and EQEs, the driver will configure the HW to
work in 64-byte mode.
The 32-byte vs 64-byte mode is global per HCA and not per CQ or EQ.
Since this mode is global, userspace (libmlx4) must be updated to work
with the configured CQE size, and guests using SR-IOV virtual
functions need to know both EQE and CQE size.
In case one of the 64-byte CQE/EQE capabilities is activated, the
patch makes sure that older guest drivers that use the QUERY_DEV_FUNC
command (e.g. as done in mlx4_core of Linux 3.3..3.6) will notice that
they need an update to be able to work with the PPF. This is done by
changing the returned pf_context_behaviour so that it is no longer
zero. In case none of these capabilities is activated, that value
remains zero and older guest drivers can run OK.
The SRIOV-related flow is as follows:
1. The PPF does the detection of the new capabilities using the
QUERY_DEV_CAP command.
2. The PPF activates the new capabilities using INIT_HCA.
3. The VF detects if the PPF activated the capabilities using
QUERY_HCA, and if this is the case, activates them for itself too.
Note that the VF detects that it must be aware of the new PF behaviour
using QUERY_FUNC_CAP. Steps 1 and 2 apply also for native mode.
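A rough sketch of that flow in mlx4 capability-flag style (the exact flag
and field spellings below are illustrative, not the precise upstream
names):

	/* Steps 1+2, PPF: detect via QUERY_DEV_CAP, activate via INIT_HCA */
	if (enable_64b_cqe_eqe && (dev_cap->flags & MLX4_DEV_CAP_FLAG_64B_CQE))
		init_hca->flags |= INIT_HCA_64B_CQE;	/* illustrative */

	/* Step 3, VF: detect what the PPF activated via QUERY_HCA */
	if (hca_param.dev_cap_enabled & MLX4_DEV_CAP_64B_CQE_ENABLED)
		dev->caps.cqe_size = 64;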
User space notification is done through a new field introduced in
struct mlx4_ib_ucontext which holds device capabilities for which user
space must take action. This changes the binary interface, so the ABI
towards libmlx4 exposed through uverbs is bumped from 3 to 4, but only
when **needed**, i.e. only when the driver does use 64-byte CQEs or
future device capabilities with which user space must be in sync. This
practice allows working with unmodified libmlx4 on older devices (e.g.
A0, B0) which don't support 64-byte CQEs.
In order to keep existing systems functional when they update to a
newer kernel that contains these changes in VF and userspace ABI, a
module parameter enable_64b_cqe_eqe must be set to enable 64-byte
mode; the default is currently false.
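The parameter itself is the standard moduleparam pattern; a sketch (the
description string is illustrative):

	static bool enable_64b_cqe_eqe;
	module_param(enable_64b_cqe_eqe, bool, 0444);
	MODULE_PARM_DESC(enable_64b_cqe_eqe,
			 "Enable 64 byte CQEs/EQEs when FW supports this (default: off)");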
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
The vlan tag can be zero. This is why it can't serve as an indication
that a packet requires a VLAN header in the TX flow.
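A sketch of the corrected test, keying on vlan_tx_tag_present() (the
accessor of that era) instead of on a nonzero tag:

	/* wrong: a valid tag of 0 would skip VLAN header insertion
	 *	if (vlan_tag) ...
	 * right: */
	if (vlan_tx_tag_present(skb)) {
		vlan_tag = vlan_tx_tag_get(skb);
		tx_desc->ctrl.ins_vlan = MLX4_WQE_CTRL_INS_VLAN;
	}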
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The QP range is reserved as a single block. However, when freeing the
mlx4_en resources, the tx-ring QPs are released both in mlx4_en_destroy_tx_ring
(one at a time) and in mlx4_en_free_resources (as a block release), i.e.
the same QPs are released twice.
Fix by eliminating the one-at-a-time release in mlx4_en_destroy_tx_ring.
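That leaves a single block release, e.g. in mlx4_en_free_resources(); a
sketch (the base-QPN field name is illustrative):

	/* release the whole reserved TX QP range in one call */
	mlx4_qp_release_range(priv->mdev->dev, priv->base_tx_qpn,
			      priv->tx_ring_num);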
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit e22979d96a (mlx4_en: Moving to Interrupts for TX
completions) we no longer need to orphan skbs in mlx4_en_xmit(),
since skbs won't stay long in the TX ring before their release.
Orphaning skbs in ndo_start_xmit() should be avoided as much as
possible, since it breaks TCP Small Queues and other flow control
mechanisms (per-socket limits).
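The change itself reduces to deleting the early orphan in mlx4_en_xmit();
roughly:

	/* removed: detaching the skb from its socket here released the
	 * socket's wmem accounting before the TX completion, defeating
	 * TSQ's per-socket in-flight limit:
	 *
	 *	skb_orphan(skb);
	 */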
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removing the ring->blocked flag; it is redundant and leads to a race:
we close the TX queue and then set the "blocked" flag.
Between those 2 operations the completion function can check the "blocked"
flag, see that it is 0, and not reopen the TX queue.
Use netif_tx_queue_stopped to check the state of the queue and avoid this race.
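A sketch of the race-free wake-up in the TX completion handler (the ring
fields and the ring-full predicate are illustrative):

	struct netdev_queue *txq = netdev_get_tx_queue(dev, ring->queue_index);

	/* query the real queue state instead of a shadow flag */
	if (netif_tx_queue_stopped(txq) && !mlx4_en_tx_ring_full(ring))
		netif_tx_wake_queue(txq);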
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the TX ring scheme such that the number of rings for untagged packets
and for tagged packets (per each of the vlan priorities) is the same, unlike
the current situation where for tagged traffic there's one ring per priority
and for untagged traffic there are as many rings as cores.
Queue selection is done as follows:
If the mqprio qdisc is operating on the interface, such that the core
networking code invoked the device setup_tc ndo callback, a mapping of
skb->priority => queue set is forced - for both tagged and untagged traffic.
Else, the egress map skb->priority => user priority is used for tagged
traffic, and all untagged traffic is sent through the tx rings of UP 0.
The patch follows the conclusions of discussing that issue with John Fastabend
over this thread http://comments.gmane.org/gmane.linux.network/229877
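A sketch of the queue selection described above (the per-UP ring count
field is illustrative):

	u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb)
	{
		struct mlx4_en_priv *priv = netdev_priv(dev);
		u16 rings_p_up = priv->num_tx_rings_p_up;
		u8 up = 0;

		/* mqprio active: the stack's skb->priority => TC map rules */
		if (dev->num_tc)
			return skb_tx_hash(dev, skb);

		/* tagged: UP from the tag's priority bits (egress map result) */
		if (vlan_tx_tag_present(skb))
			up = vlan_tx_tag_get(skb) >> VLAN_PRIO_SHIFT;

		/* untagged: up stays 0 - hash within that UP's ring set */
		return __skb_tx_hash(dev, skb, rings_p_up) + up * rings_p_up;
	}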
Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Liran Liss <liranl@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Moving to interrupts instead of polling for TX completions,
avoiding situations where an skb can be held by the driver for
a long time (till a timer expires).
The change is also necessary for supporting BQL.
Removing comp_lock, which was required because we could handle TX
completions from several contexts: interrupts, timer, polling.
Now there are only interrupts.
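The interrupt-driven path then reduces to re-arming the CQ after each
round of completion processing (helper name as used in mlx4_en):

	/* in the TX CQ handler, after reaping completions: ask the HW to
	 * interrupt again on the next completion - no timer, no polling */
	mlx4_en_arm_cq(priv, cq);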
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the vlan egress map is only good for tagged traffic, another
mapping is needed for untagged traffic.
For that, the driver uses the sch_mqprio mapping. This mapping can be set
using the tc tool from the iproute2 package.
The mapped UP will be used by the HW for QoS purposes, but won't go out on
the wire.
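The mapping can be installed with, e.g., 'tc qdisc add dev eth0 root
mqprio', and consumed in the driver roughly like this:

	/* UP for untagged traffic from the mqprio priority => TC map */
	up = netdev_get_prio_tc_map(dev, skb->priority);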
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of relying on the HW to change the schedule queue by UP, the
schedule queue is fixed for a tx_ring, and the UP in the WQE is ignored in
this aspect. This resolves two issues with untagged traffic:
1. Untagged traffic has no UP in the packet, which is needed for QoS. The
change above allows setting the schedule queue (and by that the UP) of such
a stream.
2. BlueFlame uses the same field used by the vlan tag, so forcing the UP
from the QPC allows using BF for untagged but prioritized traffic.
With old firmware where forcing the UP is not supported, untagged traffic
will not be subject to QoS.
Because the UP is set by the QP, we need to always have a tx ring per UP,
even if the pfcrx module parameter is false.
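A rough sketch of pinning the schedule queue in the QP context (the bit
layout shown is illustrative, not the exact firmware format):

	/* one TX ring (QP) per UP: fix its schedule queue to that UP so
	 * the WQE's UP field no longer matters for scheduling */
	context->pri_path.sched_queue = 0x83 | ((up & 0x7) << 3);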
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The blue flame buffer is defined to be of type void __iomem *,
but was passed to mlx4_bf_copy(), which takes unsigned long *.
This triggered a sparse warning about different address spaces;
fix that by changing the first param of mlx4_bf_copy() to be of type
void __iomem *.
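After the fix the prototype becomes (per the text above; the other
parameters are unchanged):

	void mlx4_bf_copy(void __iomem *dst, unsigned long *src,
			  unsigned int bytecnt);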
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Localize the pdev->dev pointer, and use dma_map instead of pci_map.
There are multiple map/unmap operations on the data path; optimize
those by saving redundant pointer accesses.
Those places were identified as hot-spots when running kernel profiling
during some benchmarks.
The fixes had the most impact when testing packet rate with small packets,
reducing several % from the CPU load, and in some cases being the difference
between reaching wire speed or being CPU bound.
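The pattern, as a sketch (the cached ddev pointer is illustrative):

	struct device *ddev = priv->ddev;	/* cached &mdev->pdev->dev */
	dma_addr_t dma;

	/* dma_* API directly - no per-packet pci_dev dereference chain */
	dma = dma_map_single(ddev, data, size, DMA_TO_DEVICE);
	/* ... post the buffer; then, on completion: */
	dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);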
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix new sparse errors introduced in commit 6221217199 (mlx4_en: dont
change mac_header on xmit)
Reported-by: Or Gerlitz <or.gerlitz@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
A driver xmit function is not allowed to change the skb without special
care.
mlx4_en_xmit() should not call skb_reset_mac_header() and instead should
use skb->data to access the ethernet header.
This removes a dumb test: if (ethh && ethh->h_dest)
Also remove the slow mlx4_en_mac_to_u64() call; we can use
get_unaligned() to get faster code.
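A sketch of the faster access, reading the DMAC straight from skb->data
with unaligned loads (descriptor fields per the driver's conventions):

	struct ethhdr *ethh = (struct ethhdr *)skb->data;

	/* 2 + 4 bytes of the destination MAC; no skb_reset_mac_header(),
	 * no byte-by-byte mlx4_en_mac_to_u64() conversion */
	tx_desc->ctrl.srcrb_flags16[0] = get_unaligned((__be16 *)ethh->h_dest);
	tx_desc->ctrl.imm = get_unaligned((__be32 *)(ethh->h_dest + 2));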
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alloc failures use dump_stack(), so emitting an additional
out-of-memory message is an unnecessary duplication.
Remove the allocation failure messages.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To enable internal loopback, always fill the DMAC in the control segment
when transmitting the packet; once this is done, the packet is subject
to loopback if the DMAC matches one of the multicast/unicast addresses
registered on the physical port.
In the receive path, if the source MAC is our own MAC and we are neither
in selftest nor in force-LB mode, drop the packet.
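A sketch of the receive-path check (the flag names are illustrative; the
driver compares the MAC bytes as a u64 in practice):

	/* drop our own looped-back frames unless in selftest or forced-LB */
	if (ether_addr_equal_64bits(ethh->h_source, dev->dev_addr) &&
	    !priv->validate_loopback && !priv->force_loopback)
		goto drop;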
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using vlan 0 and UP 0, the vlan header wasn't placed.
Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device must be in promiscuous mode, or the DMAC must be the same as
the host MAC, or else the packet will be dropped by the HW rx filtering.
Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Moving to a regular Completion Queue implementation (not collapsed).
The completion for each transmitted packet is written to a new entry.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
These files were using the moduleparam infrastructure but were not
including anything for it -- which is fine while module.h is being
implicitly included in all files, but that is going away.
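The fix is just the explicit include, e.g.:

	#include <linux/moduleparam.h>	/* for module_param() and friends */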
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Not updating common counters from the data path.
The checksum counters are per ring; they are summed when collecting
statistics.
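A sketch of the collection-time summation (the per-ring field name is
illustrative):

	/* fold per-ring checksum counters only when stats are read */
	for (i = 0; i < priv->rx_ring_num; i++)
		stats->csum_ok += priv->rx_ring[i].csum_ok;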
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
To ease skb->truesize sanitization, it's better to be able to localize
all references to skb frags size.
Define accessors: skb_frag_size() to fetch the frag size, and
skb_frag_size_{set|add|sub}() to manipulate it.
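Usage of the new accessors:

	skb_frag_t *frag = &skb_shinfo(skb)->frags[0];

	unsigned int sz = skb_frag_size(frag);	/* read */
	skb_frag_size_set(frag, len);		/* write */
	skb_frag_size_add(frag, delta);		/* += */
	skb_frag_size_sub(frag, delta);		/* -= */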
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the Mellanox driver into drivers/net/ethernet/mellanox/ and
make the necessary Kconfig and Makefile changes.
CC: Roland Dreier <roland@kernel.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>