linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Jesper Dangaard Brouer	99c07c43c4	xdp: tracking page_pool resources and safe removal This patch is needed before we can allow drivers to use page_pool for DMA-mappings. Today with page_pool and XDP return API, it is possible to remove the page_pool object (from rhashtable), while there are still in-flight packet-pages. This is safely handled via RCU and failed lookups in __xdp_return() fallback to call put_page(), when page_pool object is gone. In-case page is still DMA mapped, this will result in page note getting correctly DMA unmapped. To solve this, the page_pool is extended with tracking in-flight pages. And XDP disconnect system queries page_pool and waits, via workqueue, for all in-flight pages to be returned. To avoid killing performance when tracking in-flight pages, the implement use two (unsigned) counters, that in placed on different cache-lines, and can be used to deduct in-flight packets. This is done by mapping the unsigned "sequence" counters onto signed Two's complement arithmetic operations. This is e.g. used by kernel's time_after macros, described in kernel commit `1ba3aab303` and `5a581b367b`, and also explained in RFC1982. The trick is these two incrementing counters only need to be read and compared, when checking if it's safe to free the page_pool structure. Which will only happen when driver have disconnected RX/alloc side. Thus, on a non-fast-path. It is chosen that page_pool tracking is also enabled for the non-DMA use-case, as this can be used for statistics later. After this patch, using page_pool requires more strict resource "release", e.g. via page_pool_release_page() that was introduced in this patchset, and previous patches implement/fix this more strict requirement. Drivers no-longer call page_pool_destroy(). Drivers already call xdp_rxq_info_unreg() which call xdp_rxq_info_unreg_mem_model(), which will attempt to disconnect the mem id, and if attempt fails schedule the disconnect for later via delayed workqueue. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 11:23:13 -04:00
Jesper Dangaard Brouer	29b006a676	mlx5: more strict use of page_pool API The mlx5 driver is using page_pool, but not for DMA-mapping (currently), and is a little too relaxed about returning or releasing page resources, as it is not strictly necessary, when not using DMA-mappings. As this patchset is working towards tracking page_pool resources, to know about in-flight frames on shutdown. Then fix places where mlx5 leak page_pool resource. In case of dma_mapping_error, then recycle into page_pool. In mlx5e_free_rq() moved the page_pool_destroy() call to after the mlx5e_page_release() calls, as it is more correct. In mlx5e_page_release() when no recycle was requested, then release page from the page_pool, via page_pool_release_page(). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 11:23:13 -04:00
Jesper Dangaard Brouer	e54cfd7e17	page_pool: introduce page_pool_free and use in mlx5 In case driver fails to register the page_pool with XDP return API (via xdp_rxq_info_reg_mem_model()), then the driver can free the page_pool resources more directly than calling page_pool_destroy(), which does a unnecessarily RCU free procedure. This patch is preparing for removing page_pool_destroy(), from driver invocation. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 11:23:13 -04:00
Jesper Dangaard Brouer	cbf3351067	veth: use xdp_release_frame for XDP_PASS Like cpumap use xdp_release_frame() when an xdp_frame got converted into an SKB and send towars the network stack. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 11:23:13 -04:00
Jesper Dangaard Brouer	6bf071bf09	xdp: page_pool related fix to cpumap When converting an xdp_frame into an SKB, and sending this into the network stack, then the underlying XDP memory model need to release associated resources, because the network stack don't have callbacks for XDP memory models. The only memory model that needs this is page_pool, when a driver use the DMA-mapping feature. Introduce page_pool_release_page(), which basically does the same as page_pool_unmap_page(). Add xdp_release_frame() as the XDP memory model interface for calling it, if the memory model match MEM_TYPE_PAGE_POOL, to save the function call overhead for others. Have cpumap call xdp_release_frame() before xdp_scrub_frame(). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 11:23:13 -04:00
Jesper Dangaard Brouer	516a7593fd	xdp: fix leak of IDA cyclic id if rhashtable_insert_slow fails Fix error handling case, where inserting ID with rhashtable_insert_slow fails in xdp_rxq_info_reg_mem_model, which leads to never releasing the IDA ID, as the lookup in xdp_rxq_info_unreg_mem_model fails and thus ida_simple_remove() is never called. Fix by releasing ID via ida_simple_remove(), and mark xdp_rxq->mem.id with zero, which is already checked in xdp_rxq_info_unreg_mem_model(). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 11:23:13 -04:00
Ilias Apalodimas	a25d50bfe6	net: page_pool: add helper function to unmap dma addresses On a previous patch dma addr was stored in 'struct page'. Use that to unmap DMA addresses used by network drivers Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 11:23:13 -04:00
Ilias Apalodimas	0afdeeed08	net: page_pool: add helper function to retrieve dma addresses On a previous patch dma addr was stored in 'struct page'. Use that to retrieve DMA addresses used by network drivers Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 11:23:13 -04:00
Ilias Apalodimas	9371a56f71	net: netsec: remove loops in napi Rx process netsec_process_rx was running in a loop trying to process as many packets as possible before re-enabling interrupts. With the recent DMA changes this is not needed anymore as we manage to consume all the budget without looping over the function. Since it has no performance penalty let's remove that and simplify the Rx path a bit Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:28:20 -04:00
Ilias Apalodimas	39e3622ede	net: netsec: initialize tx ring on ndo_open Since we changed the Tx ring handling and now depends on bit31 to figure out the owner of the descriptor, we should initialize this every time the device goes down-up instead of doing it once on driver init. If the value is not correctly initialized the device won't have any available descriptors Changes since v1: - Typo fixes Fixes: `35e07d2347` ("net: socionext: remove mmio reads on Tx") Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:28:20 -04:00
Rasmus Villemoes	e41d4bc554	net: dsa: mv88e6xxx: fix shift of FID bits in mv88e6250_g1_vtu_loadpurge() The comment is correct, but the code ends up moving the bits four places too far, into the VTUOp field. Fixes: `bec8e57252` (net: dsa: mv88e6xxx: implement vtu_getnext and vtu_loadpurge for mv88e6250) Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:25:59 -04:00
David S. Miller	23cdf8752b	act_ctinfo: Don't use BIT() in UAPI headers. Use _BITUL() instead. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:12:58 -04:00
David S. Miller	cfecf0d001	Merge branch 'mlxsw-Implement-flower-ingress-device-matching-offload' Ido Schimmel says: ==================== mlxsw: Implement flower ingress device matching offload Jiri says: In case of using shared block, user might find it handy to be able to insert filters to match on particular ingress device. This patchset exposes the ingress ifindex through flow_dissector and flow_offload so mlxsw can use it to push down to HW. See the selftests for examples of usage. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:09:22 -04:00
Jiri Pirko	dcc5e1f9ca	selftests: tc: add ingress device matching support Extend tc_flower to test plain ingress device matching and also tc_shblock to test ingress device matching on shared block. Add new tc_flower_router.sh where ingress device matching on egress (after routing) is done. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:09:22 -04:00
Jiri Pirko	0c1f391d19	mlxsw: spectrum_flower: Implement support for ingress device matching Benefit from the previously extended flow_dissector infrastructure and offload matching on ingress port. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:09:22 -04:00
Jiri Pirko	d8e9461446	mlxsw: spectrum_acl: Fix SRC_SYS_PORT element size Fix the size of the SRC_SYS_PORT element to be 16. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:09:22 -04:00
Jiri Pirko	ff5405f690	mlxsw: spectrum_acl: Avoid size check for RX_ACL_SYSTEM_PORT element RX_ACL_SYSTEM_PORT is 8 bit but SRC_SYS_PORT is 16 bits. Internally, SRC_SYS_PORT is used to carry the value. Relax the checker in case of RX_ACL_SYSTEM_PORT and allow different size. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:09:22 -04:00
Jiri Pirko	511a5adcaa	mlxsw: spectrum_acl: Write RX_ACL_SYSTEM_PORT acl element correctly RX_ACL_SYSTEM_PORT is equal to SRC_SYS_PORT - 1. So before write to block we need to adjust the key value. Introduce new "EXT" helper to implement this. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:09:22 -04:00
Jiri Pirko	9558a83aee	net: flow_offload: implement support for meta key Implement support for previously added flow dissector meta key. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:09:22 -04:00
Jiri Pirko	8212ed777f	net: sched: cls_flower: use flow_dissector for ingress ifindex Use previously introduced infra to obtain and store ingress ifindex instead doing it locally. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:09:22 -04:00
Jiri Pirko	82828b88f0	flow_dissector: add support for ingress ifindex dissection Add new key meta that contains ingress ifindex value and add a function to dissect this from skb. The key and function is prepared to cover other potential skb metadata values dissection. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-19 10:09:21 -04:00
Colin Ian King	39f5886032	net/mlx5: add missing void argument to function mlx5_devlink_alloc Function mlx5_devlink_alloc is missing a void argument, add it to clean up the non-ANSI function declaration. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 22:28:50 -04:00
David S. Miller	da21ad276a	Merge branch 'net-mvpp2-cls-Allow-steering-based-on-vlan-tag' Maxime Chevallier says: ==================== net: mvpp2: cls: Allow steering based on vlan tag The PPv2 classifier can perform flow steering based on keys extracted from the VLAN tag. This series adds support for using the vlan id and the vlan prio as keys, using the ethtool interface. Patch 1 is a preparatory patch that prevent false-positive matches, using a dedicated lookup id for the RSS C2 lookup. Patch 2 allows to separate the flows based on the header fields they contain. The main goal is to be able to separate tagged traffic from untagged traffic for flow steering, just as we already do for RSS. Patch 3 solves an issue we have when extracting fields that aren't full bytes, such as the vlan tag which is 12 bits wide, or the priority which is 3 bits wide. Finally, patch 4 adds support for steering based on both vlan id and priority, extracted from the outermost tag. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 22:26:05 -04:00
Maxime Chevallier	1274daede3	net: mvpp2: cls: Add steering based on vlan Id and priority. This commit allows using the vlan Id and priority as parts of the key for classification offload. These fields are extracted from the outermost tag, if multiple tags are present. Vlan Id and priority are considered as 2 different fields by the classifier, however the fields are both appended in the Header Extracted Key in the same layout as they are found in the tags. This means that when steering only based on the prio, a 16-bit slot is still taken in the HEK. The classifier doesn't allow extracting the DEI bit from the tag, so we explicitly prevent user from using this bit in the key. This commit adds the vlan priotity as a compatible HEK field for tagged traffic, meaning that we limit the possibility of extracting this field only to the flows that contain tagged traffic. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 22:26:05 -04:00
Maxime Chevallier	12b8e2dd01	net: mvpp2: cls: right-justify the C2 TCAM keys The C2 TCAM used for classification uses a key (Header Extracted Key) built by concatenating several fields extracted from the packet header. After a lot of trial-and-error and some guess work, it seems the HEK is right justified, with the first fields being stored in the MSB, then concatenated up until the LSB. Until now, this doesn't cause any issue since all HEK fields we use are full bytes. However this is an issue for the upcoming VLAN id and pri extraction, which aren't full bytes. Rework the way we built that TCAM key, by changing the order in which we append the fields. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 22:26:05 -04:00
Maxime Chevallier	834df6ea95	net: mvpp2: cls: Only select applicable flows of classification offload The way we currently handle classification offload and RSS is by having dedicated lookup sequences in the flow table, each being selected depending on several fields being present in the packet header. We need to make sure the classification operation we want to perform can be done in each flow we want to insert it into. As an example, classifying on VLAN tag can only be done on flows used for tagged traffic. This commit makes sure we don't insert rules in flows we aren't compatible with. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 22:26:04 -04:00
Maxime Chevallier	c641af4f6f	net: mvpp2: cls: Use a dedicated lu_type for the RSS lookup When performing a TCAM lookup in the C2 engine, it's possible that multiple entries match the packet. To make sure the correct entry match when performing a lookup, the Flow Table can set a lookup type, which will be used in the TCAM lookup, thus preventing such false-positives. We need to make sure the RSS match doesn't interfere with other classification lookups, hence we use a dedicated lookup_type for it. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 22:26:04 -04:00
David S. Miller	9368b8e24b	Merge branch 'macb-SiFive-FU540-C000' Yash Shah says: ==================== Add macb support for SiFive FU540-C000 On FU540, the management IP block is tightly coupled with the Cadence MACB IP block. It manages many of the boundary signals from the MACB IP This patchset controls the tx_clk input signal to the MACB IP. It switches between the local TX clock (125MHz) and PHY TX clocks. This is necessary to toggle between 1Gb and 100/10Mb speeds. Future patches may add support for monitoring or controlling other IP boundary signals. This patchset is mostly based on work done by Wesley Terpstra <wesley@sifive.com> This patchset is based on Linux v5.2-rc1 and tested on HiFive Unleashed board with additional board related patches needed for testing can be found at dev/yashs/ethernet_v3 branch of: https://github.com/yashshah7/riscv-linux.git Change History: V3: - Revert "MACB_SIFIVE_FU540" config changes in Kconfig and driver code. The driver does not depend on SiFive GPIO driver. V2: - Change compatible string from "cdns,fu540-macb" to "sifive,fu540-macb" - Add "MACB_SIFIVE_FU540" in Kconfig to support SiFive FU540 in macb driver. This is needed because on FU540, the macb driver depends on SiFive GPIO driver. - Avoid writing the result of a comparison to a register. - Fix the issue of probe fail on reloading the module reported by: Andreas Schwab <schwab@suse.de> ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 22:02:27 -04:00
Yash Shah	c218ad5590	macb: Add support for SiFive FU540-C000 The management IP block is tightly coupled with the Cadence MACB IP block on the FU540, and manages many of the boundary signals from the MACB IP. This patch only controls the tx_clk input signal to the MACB IP. Future patches may add support for monitoring or controlling other IP boundary signals. Signed-off-by: Yash Shah <yash.shah@sifive.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 22:02:27 -04:00
Yash Shah	d4993e19da	macb: bindings doc: add sifive fu540-c000 binding Add the compatibility string documentation for SiFive FU540-C0000 interface. On the FU540, this driver also needs to read and write registers in a management IP block that monitors or drives boundary signals for the GEMGXL IP block that are not directly mapped to GEMGXL registers. Therefore, add additional range to "reg" property for SiFive GEMGXL management IP registers. Signed-off-by: Yash Shah <yash.shah@sifive.com> Reviewed-by: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 22:02:27 -04:00
David S. Miller	d75d5f9764	Merge branch 'hinic-add-rss-support-and-rss-parameters-configuration' Xue Chaojing says: ==================== hinic: add rss support and rss parameters configuration This series add rss support for HINIC driver and implement the ethtool interface related to rss parameter configuration. user can use ethtool configure rss parameters or show rss parameters. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 21:52:27 -04:00
Xue Chaojing	4fdc51bb4e	hinic: add support for rss parameters with ethtool This patch adds support rss parameters with ethtool, user can change hash key, hash indirection table, hash function by ethtool -X, and show rss parameters by ethtool -x. Signed-off-by: Xue Chaojing <xuechaojing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 21:52:27 -04:00
Xue Chaojing	eb8ce9ac16	hinic: move ethtool code into hinic_ethtool This patch moves ethtool code from hinic_main.c to hinic_ethtool.c Signed-off-by: Xue Chaojing <xuechaojing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 21:52:27 -04:00
Xue Chaojing	421e952628	hinic: add rss support This patch adds rss support for the HINIC driver. Signed-off-by: Xue Chaojing <xuechaojing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 21:52:27 -04:00
Colin Ian King	760f1dc295	net: stmmac: add sanity check to device_property_read_u32_array call Currently the call to device_property_read_u32_array is not error checked leading to potential garbage values in the delays array that are then used in msleep delays. Add a sanity check to the property fetching. Addresses-Coverity: ("Uninitialized scalar variable") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 21:03:38 -04:00
Geert Uytterhoeven	cf29a49879	net: hns3: Add missing newline at end of file "git diff" says: \ No newline at end of file after modifying the file. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 20:56:42 -04:00
Arnd Bergmann	815deee0e3	qed: Fix -Wmaybe-uninitialized false positive A previous attempt to shut up the uninitialized variable use warning was apparently insufficient. When CONFIG_PROFILE_ANNOTATED_BRANCHES is set, gcc-8 still warns, because the unlikely() check in DP_NOTICE() causes it to no longer track the state of all variables correctly: drivers/net/ethernet/qlogic/qed/qed_dev.c: In function 'qed_llh_set_ppfid_affinity': drivers/net/ethernet/qlogic/qed/qed_dev.c:798:47: error: 'abs_ppfid' may be used uninitialized in this function [-Werror=maybe-uninitialized] addr = NIG_REG_PPF_TO_ENGINE_SEL + abs_ppfid * 0x4; ~~~~~~~~~~^~~~~ This is not a nice workaround, but always initializing the output from qed_llh_abs_ppfid() at least shuts up the false positive reliably. Fixes: `79284adeb9` ("qed: Add llh ppfid interface and 100g support for offload protocols") Fixes: `8e2ea3ea96` ("qed: Fix static checker warning") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 10:44:21 -07:00
Geert Uytterhoeven	b594850e65	ps3_gelic: Use [] to denote a flexible array member Flexible array members should be denoted using [] instead of [0], else gcc will not warn when they are no longer at the end of a struct. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 10:43:20 -07:00
Denis Kirjanov	75345f888f	ipoib: show VF broadcast address in IPoIB case we can't see a VF broadcast address for but can see for PF Before: 11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP mode DEFAULT group default qlen 256 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 MAC 14:80:00:00:66:fe, spoof checking off, link-state disable, trust off, query_rss off ... After: 11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP mode DEFAULT group default qlen 256 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off v1->v2: add the IFLA_VF_BROADCAST constant v2->v3: put IFLA_VF_BROADCAST at the end to avoid KABI breakage and set NLA_REJECT dev_setlink Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 10:41:28 -07:00
Denis Kirjanov	64d701c608	ipoib: correcly show a VF hardware address in the case of IPoIB with SRIOV enabled hardware ip link show command incorrecly prints 0 instead of a VF hardware address. Before: 11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP mode DEFAULT group default qlen 256 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state disable, trust off, query_rss off ... After: 11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP mode DEFAULT group default qlen 256 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off v1->v2: just copy an address without modifing ifla_vf_mac v2->v3: update the changelog Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 10:41:27 -07:00
David S. Miller	2ae6b594fb	Merge branch 'mlxsw-Improve-IPv6-route-insertion-rate' Ido Schimmel says: ==================== mlxsw: Improve IPv6 route insertion rate Unlike IPv4, an IPv6 multipath route in the kernel is composed from multiple sibling routes, each representing a single nexthop. Therefore, an addition of a multipath route with N nexthops translates to N in-kernel notifications. This is inefficient for device drivers that need to program the route to the underlying device. Each time a new nexthop is appended, a new nexthop group needs to be constructed and the old one deleted. This patchset improves the situation by sending a single notification for a multipath route addition / deletion instead of one per-nexthop. When adding thousands of multipath routes with 16 nexthops, I measured an improvement of about x10 in the insertion rate. Patches #1-#3 add a flag that indicates that in-kernel notifications need to be suppressed and extend the IPv6 FIB notification info with information about the number of sibling routes that are being notified. Patches #4-#5 adjust the two current listeners to these notifications to ignore notifications about IPv6 multipath routes. Patches #6-#7 adds add / delete notifications for IPv6 multipath routes. Patches #8-#14 do the same for mlxsw. Patch #15 finally removes the limitations added in patches #4-#5 and stops the kernel from sending a notification for each added / deleted nexthop. Patch #16 adds test cases. v2 (David Ahern): * Remove patch adjusting netdevsim to consume resources for each fib6_info. Instead, consume one resource for the entire multipath route * Remove 'multipath_rt' usage in patch #10 * Remove 'multipath_rt' from 'struct fib6_entry_notifier_info' in patch #15. The member is only removed in this patch to prevent drivers from processing multipath routes twice during the series ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	12ee822039	selftests: mlxsw: Add a test for FIB offload indication Test that the offload indication for unicast routes is correctly set in different scenarios. IPv4 support will be added in the future. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	d5382fef70	ipv6: Stop sending in-kernel notifications for each nexthop Both listeners - mlxsw and netdevsim - of IPv6 FIB notifications are now ready to handle IPv6 multipath notifications. Therefore, stop ignoring such notifications in both drivers and stop sending notification for each added / deleted nexthop. v2: * Remove 'multipath_rt' from 'struct fib6_entry_notifier_info' Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	2d9dd7ec79	mlxsw: spectrum_router: Create IPv6 multipath routes in one go Allow the driver to create an IPv6 multipath route in one go by passing an array of sibling routes and iterating over them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	d21afd3029	mlxsw: spectrum_router: Add / delete multiple IPv6 nexthops Currently, the functions that take care of populating IPv6 nexthop groups only add / delete a single nexthop. Prepare them to handle multiple routes in one notification by passing an array of routes and adding / deleting all of them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	921bc539cb	mlxsw: spectrum_router: Pass array of routes to route handling functions Prepare the driver to handle multiple routes in a single notification by passing an array of routes to the functions that actually add / delete a route. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	94d628d1f9	mlxsw: spectrum_router: Adjust IPv6 replace logic to new notifications Previously, IPv6 replace notifications were only sent from fib6_add_rt2node(). The function only emitted such notifications if a route actually replaced another route. A previous patch added another call site in ip6_route_multipath_add() from which such notification can be emitted even if a route was merely added and did not replace another route. Adjust the driver to take this into account and potentially set the 'replace' flag to 'false' if the notified route did not replace an existing route. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	928c0b534f	mlxsw: spectrum_router: Pass multiple routes to work item Prepare the driver to process IPv6 multipath notifications by passing an array of 'struct fib6_info' instead of just one route. A reference is taken on each sibling route in order to prevent them from being freed until they are processed by the workqueue. v2: * Remove 'multipath_rt' usage Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	ccd56a5f50	mlxsw: spectrum_router: Prepare function to return errors The function mlxsw_sp_router_fib6_event() takes care of preparing the needed information for the work item that actually inserts the route into the device. When processing an IPv6 multipath route, the function will need to allocate an array to store pointers to all the sibling routes. Change the function's signature to return an error code and adjust the single call site. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:36 -07:00
Ido Schimmel	20247fcab3	mlxsw: spectrum_router: Remove processing of IPv6 append notifications No such notifications are sent by the IPv6 code, so remove them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:36 -07:00

1 2 3 4 5 ...

842385 Commits