linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-18 09:46:50 +07:00

Author	SHA1	Message	Date
YueHaibing	a0bc653b1d	net: dsa: bcm_sf2: Remove set but not used variables 'v6_spec, v6_m_spec' Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/dsa/bcm_sf2_cfp.c: In function 'bcm_sf2_cfp_ipv6_rule_set': drivers/net/dsa/bcm_sf2_cfp.c:606:40: warning: variable 'v6_m_spec' set but not used [-Wunused-but-set-variable] drivers/net/dsa/bcm_sf2_cfp.c:606:30: warning: variable 'v6_spec' set but not used [-Wunused-but-set-variable] It not used any more after commit `e4f7ef54cb` ("dsa: bcm_sf2: use flow_rule infrastructure") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 15:31:47 -08:00
Pieter Jansen van Vuuren	0496743b20	nfp: flower: fix masks for tcp and ip flags fields Check mask fields of tcp and ip flags when setting the corresponding mask flag used in hardware. Fixes: `8f2566225a` ("flow_offload: add flow_rule and flow_match") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 15:28:50 -08:00
David S. Miller	eaec2efbe4	Merge branch 'devlink-add-the-ability-to-update-device-flash' Jakub Kicinski says: ==================== devlink: add the ability to update device flash This series is the second step to allow trouble shooting and recovering devices in bad state without the use of netdevs as handles. We can already query FW versions over devlink, now we add the ability to update the FW. This will allow drivers to implement some from of "limp-mode" where the device can't really be used for networking and hence has no netdev, but we can interrogate it over devlink and fix the broken FW. Small but nice advantage of devlink is that it only holds the devlink instance lock during flashing, unlike ethtool which holds rtnl_lock(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 15:27:39 -08:00
Jakub Kicinski	5c5696f3df	nfp: devlink: allow flashing the device via devlink Devlink now allows updating device flash. Implement this callback. Compared to ethtool update we no longer have to release the networking locks - devlink doesn't take them. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 15:27:38 -08:00
Jakub Kicinski	4eceba1720	ethtool: add compat for flash update If driver does not support ethtool flash update operation call into devlink. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 15:27:38 -08:00
Jakub Kicinski	76726ccb7f	devlink: add flash update command Add devlink flash update command. Advanced NICs have firmware stored in flash and often cryptographically secured. Updating that flash is handled by management firmware. Ethtool has a flash update command which served us well, however, it has two shortcomings: - it takes rtnl_lock unnecessarily - really flash update has nothing to do with networking, so using a networking device as a handle is suboptimal, which leads us to the second one: - it requires a functioning netdev - in case device enters an error state and can't spawn a netdev (e.g. communication with the device fails) there is no netdev to use as a handle for flashing. Devlink already has the ability to report the firmware versions, now with the ability to update the firmware/flash we will be able to recover devices in bad state. To enable updates of sub-components of the FW allow passing component name. This name should correspond to one of the versions reported in devlink info. v1: - replace target id with component name (Jiri). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 15:27:38 -08:00
David S. Miller	8e31c47424	Merge branch 'net-phy-improve-and-use-phy_resolve_aneg_linkmode' Heiner Kallweit says: ==================== net: phy: improve and use phy_resolve_aneg_linkmode Improve phy_resolve_aneg_linkmode and use it in genphy_read_status. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 15:21:39 -08:00
Heiner Kallweit	5502b218e0	net: phy: use phy_resolve_aneg_linkmode in genphy_read_status Now that we have phy_resolve_aneg_linkmode() we can make genphy_read_status() much simpler. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 15:21:38 -08:00
Heiner Kallweit	a2703de709	net: phy: improve phy_resolve_aneg_linkmode We have the settings array of modes which is sorted based on aneg priority. Instead of checking each mode manually let's simply iterate over the sorted settings. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 15:21:38 -08:00
Vlad Buslov	8b58d12f4a	net: sched: cgroup: verify that filter is not NULL during walk Check that filter is not NULL before passing it to tcf_walker->fn() callback in cls_cgroup_walk(). This can happen when cls_cgroup_change() failed to set first filter. Fixes: `ed76f5edcc` ("net: sched: protect filter_chain list with filter_chain_lock mutex") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 13:26:57 -08:00
Vlad Buslov	d66022cd16	net: sched: matchall: verify that filter is not NULL in mall_walk() Check that filter is not NULL before passing it to tcf_walker->fn() callback. This can happen when mall_change() failed to offload filter to hardware. Fixes: `ed76f5edcc` ("net: sched: protect filter_chain list with filter_chain_lock mutex") Reported-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 13:26:39 -08:00
Vlad Buslov	3027ff41f6	net: sched: route: don't set arg->stop in route4_walk() when empty Some classifiers set arg->stop in their implementation of tp->walk() API when empty. Most of classifiers do not adhere to that convention. Do not set arg->stop in route4_walk() to unify tp->walk() behavior among classifier implementations. Fixes: `ed76f5edcc` ("net: sched: protect filter_chain list with filter_chain_lock mutex") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 13:24:39 -08:00
Vlad Buslov	31a9984876	net: sched: fw: don't set arg->stop in fw_walk() when empty Some classifiers set arg->stop in their implementation of tp->walk() API when empty. Most of classifiers do not adhere to that convention. Do not set arg->stop in fw_walk() to unify tp->walk() behavior among classifier implementations. Fixes: `ed76f5edcc` ("net: sched: protect filter_chain list with filter_chain_lock mutex") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 13:24:18 -08:00
Jann Horn	1eb00162f8	net: caif: use skb helpers instead of open-coding them Use existing skb_put_data() and skb_trim() instead of open-coding them, with the skb_put_data() first so that logically, `skb` still contains the data to be copied in its data..tail area when skb_put_data() reads it. This change on its own is a cleanup, and it is also necessary for potential future integration of skbuffs with things like KASAN. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 11:01:17 -08:00
Vadim Pasternak	6a79507cfe	mlxsw: core: Extend thermal module with per QSFP module thermal zones Add a dedicated thermal zone for each QSFP/SFP module. The current temperature is obtained from the module's temperature sensor and the trip points are set based on the warning and critical thresholds read from the module. A cooling device (fan) is bound to all the thermal zones. The thermal zone governor is set to user space in order to avoid collisions between thermal zones. For example, one thermal zone might want to increase the speed of the fan, whereas another one would like to decrease it. Deferring this decision to user space allows the user to the take the most suitable decision. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:57:49 -08:00
David S. Miller	3c136c542a	Merge branch 'neigh-tracepoints' Roopa Prabhu says: ==================== tracepoints in neighbor subsystem Roopa Prabhu (2): trace: events: add a few neigh tracepoints neigh: hook tracepoints in neigh update code ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:33:39 -08:00
Roopa Prabhu	56dd18a49f	neigh: hook tracepoints in neigh update code hook tracepoints at the end of functions that update a neigh entry. neigh_update gets an additional tracepoint to trace the update flags and old and new neigh states. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:33:39 -08:00
Roopa Prabhu	9c03b282ba	trace: events: add a few neigh tracepoints The goal here is to trace neigh state changes covering all possible neigh update paths. Plus have a specific trace point in neigh_update to cover flags sent to neigh_update. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:33:39 -08:00
David S. Miller	9e8ccd8957	Merge branch 'net-phy-add-and-use-genphy_c45_an_config_an' Heiner Kallweit says: ==================== net: phy: add and use genphy_c45_an_config_an This series adds genphy_c45_an_config_an() and uses it in the marvell10g diver. In addition patch 4 aligns the aneg configuration with what is done in genphy_config_aneg(). v2: - in patch 2 changed function name to genphy_c45_an_config_aneg - in patch 3 add a comment regarding 1000BaseT vendor registers v3: - rebase patch 3 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:27:00 -08:00
Heiner Kallweit	3ce2a027ae	net: phy: marvell10g: check for newly set aneg Even if the advertisement registers content didn't change, we may have just switched to aneg, and therefore have to trigger an aneg restart. This matches the behavior of genphy_config_aneg(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:26:52 -08:00
Andrew Lunn	3de97f3c63	net: phy: marvell10g: use genphy_c45_an_config_aneg Use new function genphy_c45_config_aneg() in mv3310_config_aneg(). v2: - add a comment regarding 1000BaseT vendor registers v3: - rebased Signed-off-by: Andrew Lunn <andrew@lunn.ch> [hkallweit1@gmail.com: patch splitted] Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:26:52 -08:00
Andrew Lunn	9a5dc8af44	net: phy: add genphy_c45_an_config_aneg C45 configuration of 10/100 and multi-giga bit auto negotiation advertisement is standardized. Configuration of 1000Base-T however appears to be vendor specific. Move the generic code out of the Marvell driver into the common phy-c45.c file. v2: - change function name to genphy_c45_an_config_aneg Signed-off-by: Andrew Lunn <andrew@lunn.ch> [hkallweit1@gmail.com: use new helper linkmode_adv_to_mii_10gbt_adv_t and split patch] Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:26:35 -08:00
Heiner Kallweit	744e458aeb	net: phy: add helper linkmode_adv_to_mii_10gbt_adv_t Add a helper linkmode_adv_to_mii_10gbt_adv_t(), similar to linkmode_adv_to_mii_adv_t. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-17 10:26:34 -08:00
David S. Miller	885e631959	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2019-02-16 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) numerous libbpf API improvements, from Andrii, Andrey, Yonghong. 2) test all bpf progs in alu32 mode, from Jiong. 3) skb->sk access and bpf_sk_fullsock(), bpf_tcp_sock() helpers, from Martin. 4) support for IP encap in lwt bpf progs, from Peter. 5) remove XDP_QUERY_XSK_UMEM dead code, from Jan. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-16 22:56:34 -08:00
Andrii Nakryiko	5aab392c55	tools/libbpf: support bigger BTF data sizes While it's understandable why kernel limits number of BTF types to 65535 and size of string section to 64KB, in libbpf as user-space library it's too restrictive. E.g., pahole converting DWARF to BTF type information for Linux kernel generates more than 3 million BTF types and more than 3MB of strings, before deduplication. So to allow btf__dedup() to do its work, we need to be able to load bigger BTF sections using btf__new(). Singed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-16 18:47:18 -08:00
Peter Oskolkov	9d6b3584a7	selftests: bpf: test_lwt_ip_encap: add negative tests. As requested by David Ahern: - add negative tests (no routes, explicitly unreachable destinations) to exercize error handling code paths; - do not exit on test failures, but instead print a summary of passed/failed tests at the end. Future patches will add TSO and VRF tests. Signed-off-by: Peter Oskolkov <posk@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-02-16 18:41:44 -08:00
Alexandre Torgue	f186a82b10	net: stmmac: use correct define to get rx timestamp on GMAC4 In dwmac4_wrback_get_rx_timestamp_status we looking for a RX timestamp. For that receive descriptors are handled and so we should use defines related to receive descriptors. It'll no change the functional behavior as RDES3_RDES1_VALID=TDES3_RS1V=BIT(26) but it makes code easier to read. Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-16 18:13:58 -08:00
Dan Carpenter	d0edde8d29	atm: clean up vcc_seq_next() It's confusing to call PTR_ERR(v). The PTR_ERR() function is basically a fancy cast to long so it makes you wonder, was IS_ERR() intended? But that doesn't make sense because vcc_walk() doesn't return error pointers. This patch doesn't affect runtime, it's just a cleanup. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-16 18:12:22 -08:00
Guillaume Nault	4057765f2d	sock: consistent handling of extreme SO_SNDBUF/SO_RCVBUF values SO_SNDBUF and SO_RCVBUF (and their BUFFORCE version) may overflow or underflow their input value. This patch aims at providing explicit handling of these extreme cases, to get a clear behaviour even with values bigger than INT_MAX / 2 or lower than INT_MIN / 2. For simplicity, only SO_SNDBUF and SO_SNDBUFFORCE are described here, but the same explanation and fix apply to SO_RCVBUF and SO_RCVBUFFORCE (with 'SNDBUF' replaced by 'RCVBUF' and 'wmem_max' by 'rmem_max'). Overflow of positive values =========================== When handling SO_SNDBUF or SO_SNDBUFFORCE, if 'val' exceeds INT_MAX / 2, the buffer size is set to its minimum value because 'val 2' overflows, and max_t() considers that it's smaller than SOCK_MIN_SNDBUF. For SO_SNDBUF, this can only happen with net.core.wmem_max > INT_MAX / 2. SO_SNDBUF and SO_SNDBUFFORCE are actually designed to let users probe for the maximum buffer size by setting an arbitrary large number that gets capped to the maximum allowed/possible size. Having the upper half of the positive integer space to potentially reduce the buffer size to its minimum value defeats this purpose. This patch caps the base value to INT_MAX / 2, so that bigger values don't overflow and keep setting the buffer size to its maximum. Underflow of negative values ============================ For negative numbers, SO_SNDBUF always considers them bigger than net.core.wmem_max, which is bounded by [SOCK_MIN_SNDBUF, INT_MAX]. Therefore such values are set to net.core.wmem_max and we're back to the behaviour of positive integers described above (return maximum buffer size if wmem_max <= INT_MAX / 2, return SOCK_MIN_SNDBUF otherwise). However, SO_SNDBUFFORCE behaves differently. The user value is directly multiplied by two and compared with SOCK_MIN_SNDBUF. If 'val * 2' doesn't underflow or if it underflows to a value smaller than SOCK_MIN_SNDBUF then buffer size is set to its minimum value. Otherwise the buffer size is set to the underflowed value. This patch treats negative values passed to SO_SNDBUFFORCE as null, to prevent underflows. Therefore negative values now always set the buffer size to its minimum value. Even though SO_SNDBUF behaves inconsistently by setting buffer size to the maximum value when passed a negative number, no attempt is made to modify this behaviour. There may exist some programs that rely on using negative numbers to set the maximum buffer size. Avoiding overflows because of extreme net.core.wmem_max values is the most we can do here. Summary of altered behaviours ============================= val : user-space value passed to setsockopt() val_uf : the underflowed value resulting from doubling val when val < INT_MIN / 2 wmem_max : short for net.core.wmem_max val_cap : min(val, wmem_max) min_len : minimal buffer length (that is, SOCK_MIN_SNDBUF) max_len : maximal possible buffer length, regardless of wmem_max (that is, INT_MAX - 1) ^^^^ : altered behaviour SO_SNDBUF: +-------------------------+-------------+------------+----------------+ \| CONDITION \| OLD RESULT \| NEW RESULT \| COMMENT \| +-------------------------+-------------+------------+----------------+ \| val < 0 && \| \| \| No overflow, \| \| wmem_max <= INT_MAX/2 \| wmem_max2 \| wmem_max2 \| keep original \| \| \| \| \| behaviour \| +-------------------------+-------------+------------+----------------+ \| val < 0 && \| \| \| Cap wmem_max \| \| INT_MAX/2 < wmem_max \| min_len \| max_len \| to prevent \| \| \| \| ^^^^^^^ \| overflow \| +-------------------------+-------------+------------+----------------+ \| 0 <= val <= min_len/2 \| min_len \| min_len \| Ordinary case \| +-------------------------+-------------+------------+----------------+ \| min_len/2 < val && \| val_cap2 \| val_cap2 \| Ordinary case \| \| val_cap <= INT_MAX/2 \| \| \| \| +-------------------------+-------------+------------+----------------+ \| min_len < val && \| \| \| Cap val_cap \| \| INT_MAX/2 < val_cap \| min_len \| max_len \| again to \| \| (implies that \| \| ^^^^^^^ \| prevent \| \| INT_MAX/2 < wmem_max) \| \| \| overflow \| +-------------------------+-------------+------------+----------------+ SO_SNDBUFFORCE: +------------------------------+---------+---------+------------------+ \| CONDITION \| BEFORE \| AFTER \| COMMENT \| \| \| PATCH \| PATCH \| \| +------------------------------+---------+---------+------------------+ \| val < INT_MIN/2 && \| min_len \| min_len \| Underflow with \| \| val_uf <= min_len \| \| \| no consequence \| +------------------------------+---------+---------+------------------+ \| val < INT_MIN/2 && \| val_uf \| min_len \| Set val to 0 to \| \| val_uf > min_len \| \| ^^^^^^^ \| avoid underflow \| +------------------------------+---------+---------+------------------+ \| INT_MIN/2 <= val < 0 \| min_len \| min_len \| No underflow \| +------------------------------+---------+---------+------------------+ \| 0 <= val <= min_len/2 \| min_len \| min_len \| Ordinary case \| +------------------------------+---------+---------+------------------+ \| min_len/2 < val <= INT_MAX/2 \| val2 \| val2 \| Ordinary case \| +------------------------------+---------+---------+------------------+ \| INT_MAX/2 < val \| min_len \| max_len \| Cap val to \| \| \| \| ^^^^^^^ \| prevent overflow \| +------------------------------+---------+---------+------------------+ Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-16 18:09:54 -08:00
David S. Miller	f2281c245d	Support Mellanox BlueField SmartNIC (mlx5-updates-2019-02-15) Bodong Wang says, BlueField device is a multi-core ARM processor in a highly integrated system on chip coupled with the ConnectX interconnect controller. BlueField device can be presented in one out of two modes: - SEPARATED_HOST: ARM processors as a separated and orthogonal host like any other external host in the multi-host virtualization model. - EMBEDDED_CPU: ARM processors as Embedded CPU (EC) and part of the external hosts virtualization model. While existing driver already supports the device on separated_host mode, this patch series focus on the functionalities of embedded_cpu mode. On embedded_cpu mode, BlueField device exposes regular network controller PCI function in the BlueField host(e.g, x86). However, a separate PCI function called Embedded CPU Physical Function(ECPF) is also added to the ARM host side, where standard Linux distributions is able to run on the ARM cores. Depends on the NV configuration from firmware, ECPF can be the e-switch manager and firmware pages supplier. If ECPF is configured as e-switch manager and page supplier, it will take over the responsibilities from the PF on BlueField host includes: - Owns, controls and manages all e-switch parts, and takes e-switch traffic by default. It also should perform ENABLE_HCA for the host PF just like a PF does for its VFs. - Provides and manages the ICM host memory required for the HCA to store various contexts for itself, the PF and VFs belong the e-switch it manages. The PF on BlueField host side is still responsible for: - Control its own permanent MAC. - PCI and SRIOV configurations and perform ENABLE_HCA for its VFs. The ECPF can also retrieve information about the external host it controls, like host identifier, PCI BDF and number of virtual functions. As these parameters may be changed dynamically, an event will be triggered to the driver on ECPF side. -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJcZ2amAAoJEEg/ir3gV/o+lzMIALvLNUoD6pXi41MWsOwvmAHg 07mzg1N80Z66MCcFau40I8T3h9NLiRMzNtFrBNtxx+ruwKFUpAjJHjaU0sms0yQH WVjr35vk4XsZyDPSCJ4g/hCQVlgCT/1tIUvPO0YM9hjhDuVa9mT4wEpucQRDu8bO KfeXNXLDFnlWxjokhpSVj369ozh+LTv4Kzy0MBBbji97bG6MktGAT8uCimUy7wG0 7dlYimnZ1+iUD1/DZQadiLCUHUu/rTcnvF2+DcdG/nbSU8ydVLgj6vgtIfCYt4e5 kcQO5hmatnl0iJUA8GNpdHQwGjYytKneoGmIfbMAK4KjvFNF+3/8N4Bytoa5kzk= =DgTl -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2019-02-15' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Support Mellanox BlueField SmartNIC (mlx5-updates-2019-02-15) Bodong Wang says, BlueField device is a multi-core ARM processor in a highly integrated system on chip coupled with the ConnectX interconnect controller. BlueField device can be presented in one out of two modes: - SEPARATED_HOST: ARM processors as a separated and orthogonal host like any other external host in the multi-host virtualization model. - EMBEDDED_CPU: ARM processors as Embedded CPU (EC) and part of the external hosts virtualization model. While existing driver already supports the device on separated_host mode, this patch series focus on the functionalities of embedded_cpu mode. On embedded_cpu mode, BlueField device exposes regular network controller PCI function in the BlueField host(e.g, x86). However, a separate PCI function called Embedded CPU Physical Function(ECPF) is also added to the ARM host side, where standard Linux distributions is able to run on the ARM cores. Depends on the NV configuration from firmware, ECPF can be the e-switch manager and firmware pages supplier. If ECPF is configured as e-switch manager and page supplier, it will take over the responsibilities from the PF on BlueField host includes: - Owns, controls and manages all e-switch parts, and takes e-switch traffic by default. It also should perform ENABLE_HCA for the host PF just like a PF does for its VFs. - Provides and manages the ICM host memory required for the HCA to store various contexts for itself, the PF and VFs belong the e-switch it manages. The PF on BlueField host side is still responsible for: - Control its own permanent MAC. - PCI and SRIOV configurations and perform ENABLE_HCA for its VFs. The ECPF can also retrieve information about the external host it controls, like host identifier, PCI BDF and number of virtual functions. As these parameters may be changed dynamically, an event will be triggered to the driver on ECPF side. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-16 12:11:17 -08:00
David S. Miller	bb015f2216	Merge branch 's390-next' Julian Wiedmann says: ==================== s390/qeth: updates 2019-02-15 please apply a few more qeth patches to net-next. Along with some smaller improvements, this revamps our code for the SW statistics that are exposed through ETHTOOL_GSTATS. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:30 -08:00
Julian Wiedmann	8024cc9e85	s390/qeth: split out OSN netdev ops Rather than special-casing OSN in a number of places, just give this device type its own netdev_ops structure. When setting up the OSN net_device, also skip the handling of the various HW offloads (eg TSO). The device shouldn't be advertising any of them, and the OSN code paths in qeth don't have support for them. In particular RX VLAN filtering is not supported, so don't hook up those callbacks in the netdev_ops. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:30 -08:00
Julian Wiedmann	1b4d5e1c61	s390/qeth: add support for ETHTOOL_GRINGPARAM Implement a trivial callback that exposes the queue sizes. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:30 -08:00
Julian Wiedmann	b0abc4f5df	s390/qeth: overhaul ethtool statistics Accumulate per-TX queue statistics, and increase their size to 64 bit. Don't bother with enabling/disabling the statistics, the overhead is negligible. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:29 -08:00
Julian Wiedmann	d896ac62d0	s390/qeth: move ethtool code into its own file Most of this is self-contained code. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:29 -08:00
Julian Wiedmann	4326b5b461	s390/qeth: reduce ethtool statistics Counting the number of function calls and the time spent in functions is best left to proper tracing facilities. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:29 -08:00
Julian Wiedmann	bb92d3f866	s390/qeth: use a static Output Queue array qeth dynamically allocates an array for storing pointers to its Output Queue structures. Switch this to a static array - we are currently limited to 4 Output Queues, so shrinking the qeth_qdio_info struct by just a few bytes doesn't justify the additional complexity. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:29 -08:00
Julian Wiedmann	0aa35a3689	s390/qeth: allow manual recovery when device is SOFTSETUP Once a qeth ccwgroup device is set online, it's also armed for internal recovery. So allow for testing that code path via sysfs, regardless of whether the interface is up or down. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:35:29 -08:00
Florian Fainelli	ff326d3cdf	selftests: forwarding: Add some missing configuration symbols For the forwarding selftests to work, we need network namespaces when using veth/vrf otherwise ping/ping6 commands like these: ip vrf exec vveth0 /bin/ping 192.0.2.2 -c 10 -i 0.1 -w 5 will fail because network namespaces may not be enabled. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:32:22 -08:00
Paolo Abeni	1490ed2abc	net/ipv6: prefer rcu_access_pointer() over rcu_dereference() rt6_cache_allowed_for_pmtu() checks for rt->from presence, but it does not access the RCU protected pointer. We can use rcu_access_pointer() and clean-up the code a bit. No functional changes intended. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:25:26 -08:00
Colin Ian King	59e6158aca	mlxsw: core: fix spelling mistake "temprature" -> "temperature" There is a spelling mistake in several dev_err messages, fix these. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-15 20:16:52 -08:00
Bodong Wang	c96692fb8f	net/mlx5: E-Switch, Allow transition to offloads mode for ECPF Currently, the e-switch driver requires going to legacy mode before changing to the offloads mode. This makes sense for regular case as the legacy mode is done by creating VFs. However, it's problematic when ECPF is the eswitch manager. In such case, ECPF will control the vports on peer host including the peer PF and VFs. But ECPF doesn't need and shall not create VFs as the VFs are created in the peer PF host. Grant ECPF the ability to change from none to the offloads mode. Note that currently the only way to go back to none mode is by unloading the ECPF driver. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:58 -08:00
Bodong Wang	a3888f33db	net/mlx5: E-Switch, Load/unload VF reps according to event from host PF When host PF changes the number of VFs, the ECPF esw driver will get a FW event. It should query the number of VFs enabled by host PF and update the VF reps accordingly. Note that host PF can't change the number of VFs dynamically, it has to reset the number of VFs to 0 before changing to a new positive number. The host event is registered when driver is moving to switchdev mode, and it's the last step to do in esw_offloads_init. It's unregistered and the work queue is flushed when driver quits from switchdev mode. In this way, the host event and devlink command are serialized. When driver is enabling switchdev mode, pay attention to the following two facts: 1. Host PF must not have VF initialized as the flow table in ECPF has ENCAP enabled as default. Such flow table can't be created with existing initialized VFs. 2. ECPF doesn't know how many VFs the host PF will enable, ECPF offloads flow steering shall create the flow table/groups based on the max number of VFs possibly supported by host PF. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:58 -08:00
Bodong Wang	81cd229c29	net/mlx5: E-Switch, Consider ECPF vport depends on eswitch ownership ECPF connects to the eswitch through vport 0xfffe. ECPF may or may not be the eswitch manager depending on firmware configuration. 1. If ECPF is eswitch manager: ECPF will take over the eswitch manager responsibility. A rep of the host PF shall be created at the ECPF side for the eswitch manager to control. 2. If ECPF is not eswitch manager: host PF will be the eswitch manager, ECPF acts similar as a VF to the host PF. Host PF will be aware of the ECPF vport presence and control it's rep. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:58 -08:00
Bodong Wang	5ae5162066	net/mlx5: E-Switch, Assign a different position for uplink rep and vport In offloads mode, the current implementation puts the uplink representor at index zero of the vport reps array. It is not "natural" to place it at index 0 since we want to put the representor for vport 0 at index 0 with the introduction of SmartNIC. A separate patch will handle the case whether a rep is needed for vport 0 (PF vport). So, we want to have a different placeholder for uplink vport and representor. It was placed at the end of vport and rep array. Since vport number can no longer act as an index into the vport or representors arrays, use functions to map vport numbers to indices when accessing the vports or representors arrays, and vice versa. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:58 -08:00
Bodong Wang	f8e8fa0262	net/mlx5: E-Switch, Centralize repersentor reg/unreg to eswitch driver Eswitch has two users: IB and ETH. They both register repersentors when mlx5 interface is added, and unregister the repersentors when mlx5 interface is removed. Ideally, each driver should only deal with the entities which are unique to itself. However, current IB and ETH drivers have to perform the following eswitch operations: 1. When registering, specify how many vports to register. This number is the same for both drivers which is the total available vport numbers. 2. When unregistering, specify the number of registered vports to do unregister. Also, unload the repersentors which are already loaded. It's unnecessary for eswitch driver to hands out the control of above operations to individual driver users, as they're not unique to each driver. Instead, such operations should be centralized to eswitch driver. This consolidates eswitch control flow, and simplified IB and ETH driver. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:58 -08:00
Bodong Wang	29d9fd7d5a	net/mlx5: E-Switch, Support load/unload reps of specific vport types Currently the driver loads and unloads all reps in an unbreakable group. However, with ECPF, the reps of special vports such as uplink and host PF should always be loaded in switchdev mode where the reps for VFs will be loaded on-demand and unloaded on no-demand. This is a pre-step for that change. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:57 -08:00
Bodong Wang	f121e0ea95	net/mlx5: E-Switch, Add state to eswitch vport representors Currently the eswitch vport reps have a valid indicator, which is set on register and unset on unregister. However, a rep can be loaded or not loaded when doing unregister, current driver checks if the vport of that rep is enabled as a flag to imply the rep is loaded. However, for ECPF, this is not valid as the host PF will enable the vports for its VFs instead. Add three states: {unregistered, registered, loaded}, with the following state changes across different operations: create: (none) -> unregistered reg: unregistered -> registered load: registered -> loaded unload: loaded -> registered unreg: registered -> unregistered Note that the state shall only be updated inside eswitch driver rather than individual drivers such as ETH or IB. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Suggested-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:57 -08:00
Bodong Wang	879c8f84e3	net/mlx5: E-Switch, Use getter and iterator to access vport/rep With only PF and VF, it is sufficient to have the vport/rep array index as the vport number. This is because PF and VF vports numbers are consecutive serial numbers. In downstream patches with introducing of ECPF and UPLINK vports, it's not consecutive any more. Use getter to get specific vport/rep, and use iterator to traversal a list of vport/rep. This hides the translation between array index and vport number, and provides flexibility of using different translation mechanism in the future. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Suggested-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:57 -08:00
Bodong Wang	c9b99abcf2	net/mlx5: E-Switch, Split VF and special vports for offloads mode When driver is entering offloads mode, there are two major tasks to do: initialize flow steering and create representors. Flow steering should make sure enough flow table/group spaces are reserved for all reps. Representors will be created in a group, all or none. With the introduction of ECPF, flow steering should still reserve the same spaces. But, the representors are not always loaded/unloaded in a single piece. Once ECPF is in offloads mode, it will get the number of VF changing event from host PF. In such scenario, only the VF reps should be loaded/unloaded, not the reps for special vports (such as the uplink vport). Thus, when entering offloads mode, driver should specify the total number of reps, and the number of VF reps separately. When leaving offloads mode, the cleanup should use the information self-contained in eswitch such as number of VFs. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2019-02-15 17:25:57 -08:00

1 2 3 4 5 ...

813576 Commits