linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-24 04:50:53 +07:00

Author	SHA1	Message	Date
David Ahern	ec81053528	selftests: Add redirect tests Add test for ICMP redirects and exception processing. Test is setup for later addition of tests using nexthop objects for routing. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-24 13:26:44 -07:00
David Ahern	0fa6efc547	ipv6: Refactor ip6_route_del for cached routes Move the removal of cached routes to a helper, ip6_del_cached_rt, that can be invoked per nexthop. Rename the existing ip6_del_cached_rt to __ip6_del_cached_rt since it is called by ip6_del_cached_rt. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-24 13:26:44 -07:00
David Ahern	1cf844c747	ipv6: Make fib6_nh optional at the end of fib6_info Move fib6_nh to the end of fib6_info and make it an array of size 0. Pass a flag to fib6_info_alloc indicating if the allocation needs to add space for a fib6_nh. The current code path always has a fib6_nh allocated with a fib6_info; with nexthop objects they will be separate. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-24 13:26:44 -07:00
David Ahern	cc5c073a69	ipv6: Move exception bucket to fib6_nh Similar to the pcpu routes exceptions are really per nexthop, so move rt6i_exception_bucket from fib6_info to fib6_nh. To avoid additional increases to the size of fib6_nh for a 1-bit flag, use the lowest bit in the allocated memory pointer for the flushed flag. Add helpers for retrieving the bucket pointer to mask off the flag. The cleanup of the exception bucket is moved to fib6_nh_release. fib6_nh_flush_exceptions can now be called from 2 contexts: 1. deleting a fib entry 2. deleting a fib6_nh For 1., fib6_nh_flush_exceptions is called for a specific fib6_info that is getting deleted. All exceptions in the cache using the entry are deleted. For 2, the fib6_nh itself is getting destroyed so fib6_nh_flush_exceptions is called for a NULL fib6_info which means flush all entries. The pmtu.sh selftest exercises the affected code paths - from creating exceptions to cleaning them up on device delete. All tests pass without any rcu locking or memleak warnings. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-24 13:26:44 -07:00
David Ahern	c0b220cf7d	ipv6: Refactor exception functions Before moving exception bucket from fib6_info to fib6_nh, refactor rt6_flush_exceptions, rt6_remove_exception_rt, rt6_mtu_change_route, and rt6_update_exception_stamp_rt. In all 3 cases, move the primary logic into a new helper that starts with fib6_nh_. The latter 3 functions still take a fib6_info; this will be changed to fib6_nh in the next patch. In the case of rt6_mtu_change_route, move the fib6_metric_locked out as a standalone check - no need to call the new function if the fib entry has the mtu locked. Also, add fib6_info to rt6_mtu_change_arg as a way of passing the fib entry to the new helper. No functional change intended. The goal here is to make the next patch easier to review by moving existing lookup logic for each to new helpers. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-24 13:26:44 -07:00
David Ahern	7d88d8b557	ipv6: Refactor fib6_drop_pcpu_from Move the existing pcpu walk in fib6_drop_pcpu_from to a new helper, __fib6_drop_pcpu_from, that can be invoked per fib6_nh with a reference to the from entries that need to be evicted. If the passed in 'from' is non-NULL then only entries associated with that fib6_info are removed (e.g., case where fib entry is deleted); if the 'from' is NULL all entries are flushed (e.g., fib6_nh is deleted). For fib6_info entries with builtin fib6_nh (ie., current code) there is no change in behavior. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-24 13:26:44 -07:00
David Ahern	f40b6ae2b6	ipv6: Move pcpu cached routes to fib6_nh rt6_info are specific instances of a fib entry and are tied to a device and gateway - ie., a nexthop. Before nexthop objects, IPv6 fib entries have separate fib6_info for each nexthop in a multipath route, so the location of the pcpu cache in the fib6_info struct worked. However, with nexthop objects a fib6_info can point to a set of nexthops (yet another alignment of ipv6 with ipv4). Accordingly, the pcpu cache needs to be moved to the fib6_nh struct so the cached entries are local to the nexthop specification used to create the rt6_info. Initialization and free of the pcpu entries moved to fib6_nh_init and fib6_nh_release. Change in location only, from fib6_info down to fib6_nh; no other functional change intended. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-24 13:26:44 -07:00
David S. Miller	daeceb2df3	Merge branch 'ENETC-support-hardware-timestamping' Y.b. Lu says: ==================== ENETC: support hardware timestamping This patch-set is to support hardware timestamping for ENETC and also to add ENETC 1588 timer device tree node for ls1028a. Because the ENETC RX BD ring dynamic allocation has not been supported and it is too expensive to use extended RX BDs if timestamping is not used, a Kconfig option is used to enable extended RX BDs in order to support hardware timestamping. This option will be removed once RX BD ring dynamic allocation is implemented. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-24 13:16:33 -07:00
Y.b. Lu	49401003e2	arm64: dts: fsl: ls1028a: add ENETC 1588 timer node Add ENETC 1588 timer node which is ENETC PF 4 (Physical Function 4). Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-24 13:16:32 -07:00
Y.b. Lu	ad8288b89d	dt-binding: ptp_qoriq: support ENETC PTP compatible Add a new compatible for ENETC PTP. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-24 13:16:32 -07:00
Y.b. Lu	41514737ec	enetc: add get_ts_info interface for ethtool This patch is to add get_ts_info interface for ethtool to support getting timestamping capability. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-24 13:16:32 -07:00
Y.b. Lu	d398231219	enetc: add hardware timestamping support This patch is to add hardware timestamping support for ENETC. On Rx, timestamping is enabled for all frames. On Tx, we only instruct the hardware to timestamp the frames marked accordingly by the stack. Because the RX BD ring dynamic allocation has not been supported and it is too expensive to use extended RX BDs if timestamping is not used, a Kconfig option is used to enable extended RX BDs in order to support hardware timestamping. This option will be removed once RX BD ring dynamic allocation is implemented. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-24 13:16:32 -07:00
Esben Haabendal	dfb569f2b9	net: ll_temac: Fix compile error Fixes: `1b3fa5cf85` ("net: ll_temac: Cleanup multicast filter on change") Signed-off-by: Esben Haabendal <esben@geanix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-23 22:27:52 -07:00
David S. Miller	884714ce16	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2019-05-23 This series contains updates to ice driver only. Anirudh cleans up white space issues and other code formatting issues in the driver. Also implemented LLDP persistence across reboots and start/stop of the LLDP agent. Updated print statements for driver capabilities to include if it is a device or function capability. Bruce cleaned up variable declarations by removing unneeded assignment. Dave fixes a potential hang due to a couple of flows that recursively acquire the RTNL lock which results in a deadlock. Tony updates the driver to advertise what link modes we are capable of when the user does not request a specific link mode. Usha fixes up the LLDP MIB change event handling by cleaning up workarounds and print the DCB configuration changes detected. Brett fixes the driver to handle failures in the VF reset path, which was failing to free resources upon an error. Richard fixed the reporting of stats via ethtool to align with our other Intel drivers. Jesse optimizes the transmit buffer and ring structures to have more efficient ordering to get hot cache lines to have packed data. Also optimized the VF structure to use less memory, since it is used hundreds of times throughout the driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-23 17:52:43 -07:00
Bruce Allan	feee3cb306	ice: Silence semantic parser warnings Recent versions of sparse warn about casting pointers to/from restricted endian types in the Linux driver. Silence those with the compiler attribute __force macro from the Linux kernel to force casts to/from restricted endian types. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Brett Creeley	aa6ccf3f2d	ice: Fix couple of issues in ice_vsi_release Currently the driver is calling ice_napi_del() and then unregister_netdev(). The call to unregister_netdev() will result in a call to ice_stop() and then ice_vsi_close(). This is where we call napi_disable() for all the MSI-X vectors. This flow is reversed so make the changes to ensure napi_disable() happens prior to napi_del(). Before calling napi_del() and free_netdev() make sure unregister_netdev() was called. This is done by making sure the __ICE_DOWN bit is set in the vsi->state for the interested VSI. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Jesse Brandeburg	8d5fce1903	ice: Reorganize ice_vf struct The ice_vf struct can be used hundreds of times in our driver so it pays to use less memory per struct. ice_vf prior to this commit: /* size: 112, cachelines: 2, members: 25 */ /* sum members: 101, holes: 4, sum holes: 8 */ /* bit holes: 2, sum bit holes: 11 bits */ /* padding: 3 */ /* last cacheline: 48 bytes */ ice_vf after this commit: /* size: 104, cachelines: 2, members: 25 */ /* sum members: 100, holes: 3, sum holes: 4 */ /* bit holes: 1, sum bit holes: 3 bits */ /* last cacheline: 40 bytes */ Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Jesse Brandeburg	0ab54c5f2f	ice: Use bitfields when possible We can use bit fields to store boolean values and when the bit fields are next to each other, the compiler will combine them (as long as the size holds enough). Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Jesse Brandeburg	65124bbf98	ice: Reorganize tx_buf and ring structs Use more efficient structure ordering by using the pahole tool and a lot of code inspection to get hot cache lines to have packed data (no holes if possible) and adjacent warm data. ice_ring prior to this change: /* size: 192, cachelines: 3, members: 23 */ /* sum members: 158, holes: 4, sum holes: 12 */ /* padding: 22 */ ice_ring after this change: /* size: 192, cachelines: 3, members: 25 */ /* sum members: 162, holes: 1, sum holes: 1 */ /* padding: 29 */ ice_tx_buf prior to this change: /* size: 48, cachelines: 1, members: 7 */ /* sum members: 38, holes: 2, sum holes: 6 */ /* padding: 4 */ /* last cacheline: 48 bytes */ ice_tx_buf after this change: /* size: 40, cachelines: 1, members: 7 */ /* sum members: 38, holes: 1, sum holes: 2 */ /* last cacheline: 40 bytes */ Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Richard Rodriguez	55e062ba77	ice: Format ethtool reported stats Fixes ethtool -S reported stats in ice driver to match format and nomenclature of the ixgbe driver. Signed-off-by: Richard Rodriguez <richard.rodriguez@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Brett Creeley	72f9c20398	ice: Gracefully handle reset failure in ice_alloc_vfs() Currently if ice_reset_all_vfs() fails in ice_alloc_vfs() we fail to free some resources, reset variables, and return an error value. Fix this by adding another unroll case to free the pf->vf array, set the pf->num_alloc_vfs to 0, and return an error code. Without this, if ice_reset_all_vfs() fails in ice_alloc_vfs() we will not be able to do SRIOV without hard rebooting the system because rmmod'ing the driver does not work. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:54 -07:00
Usha Ketineni	a17a5ff681	ice: Refactor the LLDP MIB change event handling This patch fixes the LLDP MIB change event handling code by removing the workarounds in the current code. Added ice_dcb_need_recfg() to print the DCB configuration changes detected via MIB change event. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Tony Nguyen	9ccb062c14	ice: Advertise supported link modes if none requested User requested link modes affect what is returned as an advertised link mode. If no modes have been requested, we are not advertising any link modes. Advertise what we are capable of supporting if no link modes have been requested. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Dave Ertman	e223eaec67	ice: Fix hang when ethtool disables FW LLDP When disabling and enabling VSIs, there are a couple of flows that recursively acquire the RTNL lock which causes a deadlock. Fix that. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Anirudh Venkataramanan	a84db52569	ice: Call out dev/func caps when printing ice_parse_caps is used to parse both device and function capabilities. Currently, capabilities are printed with a cryptic "HW caps" prefix, which makes it difficult to distinguish whether the capabilities being printed are device or function capabilities. This patch makes a change to add a "func cap" prefix when printing function capabilities, and a "dev cap" prefix when printing device capabilities. This patch also changes some of the capability print strings for consistency. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Anirudh Venkataramanan	f24e35d88b	ice: Remove braces for single statement blocks Fix checkpatch warning "WARNING:BRACES: braces {} are not necessary for single statement blocks" Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Bruce Allan	173e23c0cb	ice: Cleanup an unnecessary variable initialization Commit 3463688e6ced ("ice: Add more validation in ice_vc_cfg_irq_map_msg") added an assignment of vsi making the assignment during declaration unnecessary. Also, cleanup the declaration and assignment of irqmap_info to not use two lines in the variable declaration section. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Anirudh Venkataramanan	31eafa403b	ice: Implement LLDP persistence Implement LLDP persistence across reboots, start and stop of LLDP agent. Add additional parameter to ice_aq_start_lldp and ice_aq_stop_lldp. Also change the ethtool private flag from "disable-fw-lldp" to "enable-fw-lldp". This change will flip the boolean logic of the functionality of the flag (on = enable, off = disable). The change in name and functionality is to differentiate between the pre-persistence and post-persistence states. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Anirudh Venkataramanan	b4603dbf1e	ice: Fix double spacing Fix double spacing in ice_napi_disable_all Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-05-23 10:51:53 -07:00
Subash Abhinov Kasiviswanathan	9395da4efb	net: qualcomm: rmnet: Move common struct definitions to include Create if_rmnet.h and move the rmnet MAP packet structs to this common include file. To account for portability, add little and big endian bitfield definitions similar to the ip & tcp headers. The definitions in the headers can now be re-used by the upcoming ipa driver series as well as qmi_wwan. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-23 09:48:07 -07:00
Ioana Radulescu	16fa1cf1ed	Revert "dpaa2-eth: configure the cache stashing amount on a queue" This reverts commit `f8b9958534`. The reverted change instructed the QMan hardware block to fetch RX frame annotation and beginning of frame data to cache before the core would read them. It turns out that in rare cases, it's possible that a QMan stashing transaction is delayed long enough such that, by the time it gets executed, the frame in question had already been dequeued by the core and software processing began on it. If the core manages to unmap the frame buffer _before_ the stashing transaction is executed, an SMMU exception will be raised. Unfortunately there is no easy way to work around this while keeping the performance advantages brought by QMan stashing, so disable it altogether. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-23 09:37:22 -07:00
Raju Rangoju	dcf10ec772	cxgb4: use firmware API for validating filter spec Adds support for validating hardware filter spec configured in firmware before offloading exact match flows. Use the new fw api FW_PARAM_DEV_FILTER_MODE_MASK to read the filter mode and mask from firmware. If the api isn't supported, then fall-back to older way of reading just the mode from indirect register. Signed-off-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-23 09:36:14 -07:00
David S. Miller	00e31a0961	Merge branch 'net-ll_temac-Fix-and-enable-multicast-support' Esben Haabendal says: ==================== net: ll_temac: Fix and enable multicast support This patch series makes the necessary fixes to ll_temac driver to make multicast work, and enables support for it. The main change is the change from mutex to spinlock of the lock used to synchronize indirect register access. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-23 09:33:57 -07:00
Esben Haabendal	0127cd5440	net: ll_temac: Enable multicast support Multicast support has been tested and is working now. Signed-off-by: Esben Haabendal <esben@geanix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-23 09:33:57 -07:00
Esben Haabendal	1b3fa5cf85	net: ll_temac: Cleanup multicast filter on change Avoid leaving old address table entries when using multicast. If more than one multicast address were removed, only the first removed address would actually be cleared. Signed-off-by: Esben Haabendal <esben@geanix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-23 09:33:57 -07:00
Esben Haabendal	1bd33bf0fe	net: ll_temac: Prepare indirect register access for multicast support With .ndo_set_rx_mode/temac_set_multicast_list() being called in atomic context (holding addr_list_lock), and temac_set_multicast_list() needing to access temac indirect registers, the mutex used to synchronize indirect register is a no-no. Replace it with a spinlock, and avoid sleeping in temac_indirect_busywait(). To avoid excessive holding of the lock, which is now a spinlock, the temac_device_reset() function is changed to only hold the lock for short periods. With timeouts, it could be holding the spinlock for more than 2 seconds. Signed-off-by: Esben Haabendal <esben@geanix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-23 09:33:57 -07:00
Esben Haabendal	ddc0bf34f9	net: ll_temac: Do not make promiscuous mode sticky on multicast When the user has requested IFF_ALLMULTI or has set more than 4 multicast addresses, we should just use promiscuous mode, but not set it in flags, as it causes the interface to stay in promiscuous mode even when the non-IFF_PROMISC condition that caused promiscuous mode to be enabled has gone away. Signed-off-by: Esben Haabendal <esben@geanix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-23 09:33:57 -07:00
Christophe Leroy	5556fdb0c2	net: phy: lxt: Add suspend/resume support to LXT971 and LXT973. All LXT PHYs implement the standard "power down" bit 11 of BMCR, so this patch adds support using the generic genphy_{suspend,resume} functions added by commit `0f0ca340e5` ("phy: power management support"). LXT970 is left aside because all registers get cleared upon "power down" exit. Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-23 09:19:21 -07:00
Jiri Pirko	136bf27fc0	devlink: add warning in case driver does not set port type Prevent misbehavior of drivers that do not set the port type for a longer period of time. Drivers should always set port type. Do WARN if that happens. Note that it is perfectly fine to temporarily not have the type set, during initialization and port type change. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-23 09:18:43 -07:00
Sunil Muthuswamy	14a1eaa882	hv_sock: perf: loop in send() to maximize bandwidth Currently, the hv_sock send() iterates once over the buffer, puts data into the VMBUS channel and returns. It doesn't maximize on the case when there is a simultaneous reader draining data from the channel. In such a case, the send() can maximize the bandwidth (and consequently minimize the cpu cycles) by iterating until the channel is found to be full. Perf data: Total Data Transfer: 10GB/iteration Single threaded reader/writer, Linux hvsocket writer with Windows hvsocket reader Packet size: 64KB CPU sys time was captured using the 'time' command for the writer to send 10GB of data. 'Send Buffer Loop' is with the patch applied. The values below are over 10 iterations.
|--------------------------------------------------------|
|        |        Current        |   Send Buffer Loop    |
|--------------------------------------------------------|
|        | Throughput | CPU sys  | Throughput | CPU sys  |
|        | (MB/s)     | time (s) | (MB/s)     | time (s) |
|--------------------------------------------------------|
| Min    | 407        | 7.048    | 401        | 5.958    |
| Max    | 455        | 7.563    | 542        | 6.993    |
| Avg    | 440        | 7.411    | 451        | 6.639    |
| Median | 446        | 7.417    | 447        | 6.761    |
|--------------------------------------------------------|
Observation: 1. The avg throughput doesn't really change much with this change for this scenario. This is most probably because the bottleneck on throughput is somewhere else. 2. The average system (or kernel) cpu time goes down by 10%+ with this change, for the same amount of data transfer. Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-22 18:00:36 -07:00
Sunil Muthuswamy	ac383f58f3	hv_sock: perf: Allow the socket buffer size options to influence the actual socket buffers Currently, the hv_sock buffer size is static and can't scale to the bandwidth requirements of the application. This change allows the applications to influence the socket buffer sizes using the SO_SNDBUF and the SO_RCVBUF socket options. Few interesting points to note: 1. Since the VMBUS does not allow a resize operation of the ring size, the socket buffer size option should be set prior to establishing the connection for it to take effect. 2. Setting the socket option comes with the cost of that much memory being reserved/allocated by the kernel, for the lifetime of the connection. Perf data: Total Data Transfer: 1GB Single threaded reader/writer Results below are summarized over 10 iterations. Each cell is throughput in MB/s (min/max/avg/median); columns are packet sizes.
Linux hvsocket writer + Windows hvsocket reader:
|----------------|---------------------|---------------------|---------------------|---------------------|
| SO_SNDBUF size | 128B                | 1KB                 | 4KB                 | 64KB                |
|----------------|---------------------|---------------------|---------------------|---------------------|
| Default        | 109/118/114/116     | 636/774/701/700     | 435/507/480/476     | 410/491/462/470     |
| 16KB           | 110/116/112/111     | 575/705/662/671     | 749/900/854/869     | 592/824/692/676     |
| 32KB           | 108/120/115/115     | 703/823/767/772     | 718/878/850/866     | 1593/2124/2000/2085 |
| 64KB           | 108/119/114/114     | 592/732/683/688     | 805/934/903/911     | 1784/1943/1862/1843 |
|----------------|---------------------|---------------------|---------------------|---------------------|
Windows hvsocket writer + Linux hvsocket reader:
|----------------|---------------------|---------------------|---------------------|---------------------|
| SO_RCVBUF size | 128B                | 1KB                 | 4KB                 | 64KB                |
|----------------|---------------------|---------------------|---------------------|---------------------|
| Default        | 69/82/75/73         | 313/343/333/336     | 418/477/446/445     | 659/701/676/678     |
| 16KB           | 69/83/76/77         | 350/401/375/382     | 506/548/517/516     | 602/624/615/615     |
| 32KB           | 62/83/73/73         | 471/529/496/494     | 830/1046/935/939    | 944/1180/1070/1100  |
| 64KB           | 64/70/68/69         | 467/533/501/497     | 1260/1590/1430/1431 | 1605/1819/1670/1660 |
|----------------|---------------------|---------------------|---------------------|---------------------|
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-22 18:00:14 -07:00
Eric Dumazet	0db355d499	ipv4/igmp: shrink struct ip_sf_list Removing two 4-byte holes allows the kmalloc-32 kmem cache to be used instead of kmalloc-64 on 64-bit kernels. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-22 17:57:37 -07:00
David Ahern	fc651001d2	neighbor: Add tracepoint to __neigh_create Add tracepoint to __neigh_create to enable debugging of new entries. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-22 17:50:24 -07:00
David Ahern	a92a0a7b8e	selftests: pmtu: Simplify cleanup and namespace names The point of the pause-on-fail argument is to leave the setup as is after a test fails to allow a user to debug why it failed. Move the cleanup after posting the result to the user to make it so. Random names for the namespaces are not user friendly when trying to debug a failure. Make them simpler and more direct for the tests. Run cleanup at the beginning to ensure they are cleaned up if they already exist. Remove cleanup_done. There is no harm in doing cleanup twice; just ignore any errors related to not existing - which is already done. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-22 17:50:24 -07:00
David Ahern	9b7e94e6e8	selftests: fib-onlink: Make quiet by default Add VERBOSE argument to fib-onlink-tests.sh and make output quiet by default. Add getopt parsing of inputs and support for -v (verbose) and -p (pause on fail). Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-22 17:50:24 -07:00
David Ahern	75425657fe	net: Set strict_start_type for routes and rules New userspace on an older kernel can send unknown and unsupported attributes resulting in an incomplete config which is almost always wrong for routing (few exceptions are passthrough settings like the protocol that installed the route). Set strict_start_type in the policies for IPv4 and IPv6 routes and rules to detect new, unsupported attributes and fail the route add. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-22 17:50:24 -07:00
David S. Miller	e38f7cbd36	Merge branch 'net-Export-functions-for-nexthop-code' David Ahern says: ==================== net: Export functions for nexthop code This set exports ipv4 and ipv6 fib functions for use by the nexthop code. It also adds new ones to send route notifications if a nexthop configuration changes. v2: repost of patches dropped at the end of the last dev window; added patch 8, which exports nh_update_mtu, since it is in line with the other patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-22 17:48:44 -07:00
David Ahern	06c77c3e67	ipv4: Rename and export nh_update_mtu Rename nh_update_mtu to fib_nhc_update_mtu and export for use by the nexthop code. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-22 17:48:44 -07:00
David Ahern	c3669486b5	ipv4: export fib_info_update_nh_saddr Add scope as input argument versus relying on fib_info reference in fib_nh, and export fib_info_update_nh_saddr. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-22 17:48:44 -07:00
David Ahern	9bd8366792	ipv4: export fib_flush As nexthops are deleted, fib entries referencing them are marked dead. Export fib_flush so those entries can be removed in a timely manner. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-22 17:48:44 -07:00
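
The ice struct-reorganization commits in this listing (8d5fce1903 and 65124bbf98) quote pahole summaries: total size, sum of members, holes, and padding. As a rough illustration of what those numbers mean, the sketch below uses Python's ctypes, which lays out fields with the platform's native C alignment rules, to compute a similar summary for two structs. The struct names and members (`Padded`, `Reordered`, `flag_a`, `counter`, and so on) are hypothetical and are not taken from the ice driver; this is not how pahole itself works, only a small model of the same accounting.

```python
import ctypes


class Padded(ctypes.Structure):
    """Hypothetical layout: small fields interleaved with larger ones."""
    _fields_ = [
        ("flag_a", ctypes.c_uint8),
        ("counter", ctypes.c_uint64),
        ("flag_b", ctypes.c_uint8),
        ("length", ctypes.c_uint32),
    ]


class Reordered(ctypes.Structure):
    """Same members, ordered from largest to smallest alignment."""
    _fields_ = [
        ("counter", ctypes.c_uint64),
        ("length", ctypes.c_uint32),
        ("flag_a", ctypes.c_uint8),
        ("flag_b", ctypes.c_uint8),
    ]


def layout_summary(struct_type):
    """Return a pahole-like summary for a ctypes.Structure subclass."""
    end = 0          # offset of the first byte after the previous member
    sum_members = 0  # bytes occupied by the members themselves
    holes = 0        # number of gaps between consecutive members
    sum_holes = 0    # total bytes lost to those gaps
    for name, _ctype in struct_type._fields_:
        field = getattr(struct_type, name)  # descriptor with .offset and .size
        gap = field.offset - end
        if gap > 0:
            holes += 1
            sum_holes += gap
        sum_members += field.size
        end = field.offset + field.size
    size = ctypes.sizeof(struct_type)
    return {
        "size": size,
        "sum_members": sum_members,
        "holes": holes,
        "sum_holes": sum_holes,
        "padding": size - end,  # tail padding after the last member
    }


if __name__ == "__main__":
    for struct_type in (Padded, Reordered):
        print(struct_type.__name__, layout_summary(struct_type))
```

Both structs hold the same 14 bytes of members; the reordered one should report no holes and a smaller total size, which is the kind of saving the commit messages describe. Exact sizes depend on the platform ABI, so none are stated here.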

839830 Commits