linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-26 11:15:10 +07:00

Author	SHA1	Message	Date
Ido Schimmel	a2d2a20553	mlxsw: spectrum: Replace hard-coded default VID with a define Subsequent patches are going to replace the current default VID (1) with VLAN_N_VID - 1 (4095). Prepare for this conversion by replacing the hard-coded '1' with a define. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:48:54 -08:00
Ido Schimmel	f40be47a3e	mlxsw: spectrum_router: Do not force specific configuration order In symmetric routing, the only two members in the VLAN corresponding to the L3 VNI are the router port and the VXLAN tunnel. In case the VXLAN device is already enslaved to the bridge and only later the VLAN interface is configured, the tunnel will not be offloaded. The reason for this is that when the router interface (RIF) corresponding to the VLAN interface is configured, it calls the core fid_get() API which does not check if NVE should be enabled on the FID. Instead, call into the bridge code which will check if NVE should be enabled on the FID. This effectively means that the same code path is used to retrieve a FID when either a local port or a router port joins the FID. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:48:54 -08:00
David S. Miller	6eea2db210	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-12-20 This series contains updates to e100, igb, ixgbe, i40e and ice drivers. I replaced spinlocks for mutex locks to reduce the latency on CPU0 for igb when updating the statistics. This work was based off a patch provided by Jan Jablonsky, which was against an older version of the igb driver. Jesus adjusts the receive packet buffer size from 32K to 30K when running in QAV mode, to stay within 60K for total packet buffer size for igb. Vinicius adds igb kernel documentation regarding the CBS algorithm and its implementation in the i210 family of NICs. YueHaibing from Huawei fixed the e100 driver that was potentially passing a NULL pointer, so use the kernel macro IS_ERR_OR_NULL() instead. Konstantin Khorenko fixes i40e where we were not setting up the neigh_priv_len in our net_device, which caused the driver to read beyond the neighbor entry allocated memory. Miroslav Lichvar extends the PTP gettime() to read the system clock by adding support for PTP_SYS_OFFSET_EXTENDED ioctl in i40e. Young Xiao fixed the ice driver to only enable NAPI on q_vectors that actually have transmit and receive rings. Kai-Heng Feng fixes an igb issue that when placed in suspend mode, the NIC does not wake up when a cable is plugged in. This was due to the driver not setting PME during runtime suspend. Stephen Douthit enables the ixgbe driver allow DSA devices to use the MII interface to talk to switches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:34:30 -08:00
Jason Gunthorpe	ed50edfb72	Merge branch 'mlx5-next' into rdma.git From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5 updates taken for dependencies on following patches. * branche 'mlx5-next': (23 commits) IB/mlx5: Introduce uid as part of alloc/dealloc transport domain net/mlx5: Add shared Q counter bits net/mlx5: Continue driver initialization despite debugfs failure net/mlx5: Fold the modify lag code into function net/mlx5: Add lag affinity info to log net/mlx5: Split the activate lag function into two routines net/mlx5: E-Switch, Introduce flow counter affinity IB/mlx5: Unify e-switch representors load approach between uplink and VFs net/mlx5: Use lowercase 'X' for hex values net/mlx5: Remove duplicated include from eswitch.c net/mlx5: Remove the get protocol device interface entry net/mlx5: Support extended destination format in flow steering command net/mlx5: E-Switch, Change vhca id valid bool field to bit flag net/mlx5: Introduce extended destination fields net/mlx5: Revise gre and nvgre key formats net/mlx5: Add monitor commands layout and event data net/mlx5: Add support for plugged-disabled cable status in PME net/mlx5: Add support for PCIe power slot exceeded error in PME net/mlx5: Rework handling of port module events net/mlx5: Move flow counters data structures from flow steering header ...	2018-12-20 13:24:50 -07:00
Steve Douthit	643bae17fd	ixgbe: use mii_bus to handle MII related ioctls Use the mii_bus callbacks to address the entire clause 22/45 address space. Enables userspace to poke switch registers instead of a single PHY address. The ixgbe firmware may be polling PHYs in a way that is not protected by the mii_bus lock. This isn't new behavior, but as Andrew Lunn pointed out there are more addresses available for conflicts. Signed-off-by: Stephen Douthit <stephend@silicom-usa.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:22:39 -08:00
Steve Douthit	8fa10ef012	ixgbe: register a mdiobus Most dsa devices expect a 'struct mii_bus' pointer to talk to switches via the MII interface. While this works for dsa devices, it will not work safely with Linux PHYs in all configurations since the firmware of the ixgbe device may be polling some PHY addresses in the background. Signed-off-by: Stephen Douthit <stephend@silicom-usa.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:19:11 -08:00
Kai-Heng Feng	1fb3a7a75e	igb: Fix an issue that PME is not enabled during runtime suspend I210 ethernet card doesn't wakeup when a cable gets plugged. It's because its PME is not set. Since commit `42eca23021` ("PCI: Don't touch card regs after runtime suspend D3"), if the PCI state is saved, pci_pm_runtime_suspend() stops calling pci_finish_runtime_suspend(), which enables the PCI PME. To fix the issue, let's not to save PCI states when it's runtime suspend, to let the PCI subsystem enables PME. Fixes: `42eca23021` ("PCI: Don't touch card regs after runtime suspend D3") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:14:23 -08:00
Young Xiao	eec903769b	ice: Do not enable NAPI on q_vectors that have no rings If ice driver has q_vectors w/ active NAPI that has no rings, then this will result in a divide by zero error. To correct it I am updating the driver code so that we only support NAPI on q_vectors that have 1 or more rings allocated to them. See commit `13a8cd191a` ("i40e: Do not enable NAPI on q_vectors that have no rings") for detail. Signed-off-by: Young Xiao <YangX92@hotmail.com> Acked-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:10:24 -08:00
Miroslav Lichvar	9a2d57a7a0	i40e: extend PTP gettime function to read system clock This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:06:35 -08:00
Konstantin Khorenko	31389b53b3	i40e: define proper net_device::neigh_priv_len Out of bound read reported by KASan. i40iw_net_event() reads unconditionally 16 bytes from neigh->primary_key while the memory allocated for "neighbour" struct is evaluated in neigh_alloc() as tbl->entry_size + dev->neigh_priv_len where "dev" is a net_device. But the driver does not setup dev->neigh_priv_len and we read beyond the neigh entry allocated memory, so the patch in the next mail fixes this. Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:02:26 -08:00
YueHaibing	cd0d465bb6	e100: Fix passing zero to 'PTR_ERR' warning in e100_load_ucode_wait Fix a static code checker warning: drivers/net/ethernet/intel/e100.c:1349 e100_load_ucode_wait() warn: passing zero to 'PTR_ERR' Signed-off-by: YueHaibing <yuehaibing@huawei.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 11:54:27 -08:00
David S. Miller	2be09de7d6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Lots of conflicts, by happily all cases of overlapping changes, parallel adds, things of that nature. Thanks to Stephen Rothwell, Saeed Mahameed, and others for their guidance in these resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 11:53:36 -08:00
Jesus Sanchez-Palencia	6f9ae17530	igb: Change RXPBSIZE size when setting Qav mode Section 4.5.9 of the datasheet says that the total size of all packet buffers combined (TxPB 0 + 1 + 2 + 3 + RxPB + BMC2OS + OS2BMC) must not exceed 60KB. Today we are configuring a total of 62KB, so reduce the RxPB from 32KB to 30KB in order to respect that. The choice of changing RxPBSIZE here is mainly because it seems more correct to give more priority to the transmit packet buffers over the receiver ones when running in Qav mode. Also, the BMC2OS and OS2BMC sizes are already too short. Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 11:45:10 -08:00
Jeff Kirsher	59361316af	igb: reduce CPU0 latency when updating statistics This change is based off of the work and suggestion of Jan Jablonsky <jan.jablonsky@thalesgroup.com>. The Watchdog workqueue in igb driver is scheduled every 2s for each network interface. That includes updating a statistics protected by spinlock. Function igb_update_stats in this case will be protected against preemption. According to number of a statistics registers (cca 60), processing this function might cause additional cpu load on CPU0. In case of statistics spinlock may be replaced with mutex, which reduce latency on CPU0. CC: Bernhard Kaindl <bernhard.kaindl@thalesgroup.com> CC: Jan Jablonsky <jan.jablonsky@thalesgroup.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 11:02:06 -08:00
Jakub Kicinski	4987eaccd2	nfp: bpf: optimize codegen for JSET with a constant The top word of the constant can only have bits set if sign extension set it to all-1, therefore we don't really have to mask the top half of the register. We can just OR it into the result as is. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:29 +01:00
Jakub Kicinski	6e774845b3	nfp: bpf: remove the trivial JSET optimization The verifier will now understand the JSET instruction, so don't mark the dead branch in the JIT as noop. We won't generate any code, anyway. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:28 +01:00
Michael Chan	0c2ff8d796	bnxt_en: Adjust default RX coalescing ticks to 10 us. For a little better performance on faster machines and faster link speeds. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Venkat Duvvuru	abd43a1352	bnxt_en: Support for 64-bit flow handle. Older firmware only supports 16-bit flow handle, because of which the number of flows that can be offloaded can’t scale beyond a point. Newer firmware supports 64-bit flow handle enabling the host to scale upto millions of flows. With the new 64-bit flow handle support, driver has to query flow stats in a different way compared to the older approach. This patch adds support for 64-bit flow handle and new way to query flow stats. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Reviewed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Michael Chan	cf6daed098	bnxt_en: Increase context memory allocations on 57500 chips for RDMA. If RDMA is supported on the 57500 chip, increase context memory allocations for the resources used by RDMA. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Michael Chan	08fe9d1816	bnxt_en: Add Level 2 context memory paging support. Add the new functions bnxt_alloc_ctx_pg_tbls()/bnxt_free_ctx_pg_tbls() to allocate and free pages for context memory. The new functions will handle the different levels of paging support and allocate/free the pages accordingly using the existing functions. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Michael Chan	4f49b2b8d4	bnxt_en: Enhance bnxt_alloc_ring()/bnxt_free_ring(). To support level 2 context page memory structures, enhance the bnxt_ring_mem_info structure with a "depth" field to specify the page level and add a flag to specify using full pages for L1 and L2 page tables. This is needed to support RDMA functionality on 57500 chips since RDMA requires more context memory. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Venkat Duvvuru	760b6d3341	bnxt_en: Add support for 2nd firmware message channel. Earlier, some of the firmware commands (ex: CFA_FLOW_*) which are processed by KONG processor were sent to the CHIMP processor from the host. This approach was taken as there was no direct message channel to KONG. CHIMP in turn used to send them to KONG. Newer firmware supports a new message channel which the host can send messages directly to the KONG processor. This patch adds support for required changes needed in the driver to support direct KONG message channel. This speeds up flow related messages sent to the firmware for CLS_FLOWER offload. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Venkat Duvvuru	5c209fc821	bnxt_en: Introduce bnxt_get_hwrm_resp_addr & bnxt_get_hwrm_seq_id routines. These routines will be enhanced in the subsequent patch to return the 2nd firmware comm. channel's hwrm response address & sequence id respectively. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Venkat Duvvuru	89455017fb	bnxt_en: Avoid arithmetic on void * pointer. Typecast hwrm_cmd_resp_addr to (u8 ) from (void ) before doing arithmetic. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:15 -08:00
Venkat Duvvuru	2e9ee39877	bnxt_en: Use macros for firmware message doorbell offsets. In preparation for adding a 2nd communication channel to firmware. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:15 -08:00
Venkat Duvvuru	fc718bb2d1	bnxt_en: Set hwrm_intr_seq_id value to its inverted value. Set hwrm_intr_seq_id value to its inverted value instead of HWRM_SEQ_INVALID, when an hwrm completion of type CMPL_BASE_TYPE_HWRM_DONE is received. This will enable us to use the complete 16-bit sequence ID space. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:15 -08:00
Michael Chan	3322479e6d	bnxt_en: Update firmware interface spec. to 1.10.0.33. The major changes are in the flow offload firmware APIs. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:15 -08:00
Aviv Heller	a64917446e	net/mlx5: Fix LAG requirement when CONFIG_MLX5_ESWITCH is off If CONFIG_MLX5_ESWITCH is not defined, test for SR-IOV being disabled, instead of calling e-switch LAG prereq routine. Since LAG with SRIOV is allowed only when switchdev mode is on. Fixes: `eff849b2c6` ("net/mlx5: Allow/disallow LAG according to pre-req only") Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:03 -08:00
Aviv Heller	0a5b589111	net/mlx5: Fix query_nic_sys_image_guid() error during init vport system image guid should be queried using vport nic API for Ethernet ports, and vport hca API for Infiniband ports. Fixes: `fadd59fc50` ("net/mlx5: Introduce inter-device communication mechanism") Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:03 -08:00
Eli Britstein	e32ee6c78e	net/mlx5e: Support tunnel encap over tagged Ethernet Generate encap header depending on the routed device to support native/tagged Ethernet header. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:03 -08:00
Eli Britstein	aa331450b8	net/mlx5e: Support VLAN encap ETH header generation Support generation of native or tagged Ethernet header for encap header, depending on provided net device. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:03 -08:00
Eli Britstein	c7bcb277bd	net/mlx5e: Re-order route and encap header memory allocation Change the order to first route IPv4/6 and return if error. Only after successful route continue to allocate an encap header, with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:02 -08:00
Eli Britstein	05ada1adb6	net/mlx5e: Tunnel encap ETH header helper function In tunnel encap we prepare the encap header for IPv4/6 cases, in two separate functions. For ETH header generation the code is almost duplicated. Move the ETH header generation code from IPv4/6 functions to a helper function, with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:02 -08:00
Eli Britstein	b168cff0b9	net/mlx5e: Fail attempt to offload e-switch TC encap flows with vlan on underlay Currently we don't support nor fail attempts to offload encap flows routed to vlan device on the underlay network. We wrongly consider a vlan underlay device to be on the same e-switch b/c the switchdev ID is retrieved recursively. Add explicit check for that and fail such attempts. Also align to a more strict check for the ingress and the underlay devices to practically be on the same eswitch. Fixes: `ce99f6b97f` ('net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels') Fixes: `3e621b19b0` ('net/mlx5e: Support TC encapsulation offloads with upper devices') Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:02 -08:00
Eli Britstein	442e1228cb	net/mlx5e: Tunnel routing output devs helper function For tunnel we determine the output devs for IPv4/6 cases, in two separate functions, with a duplicated code. Move that code from IPv4/6 functions to a helper function, with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:01 -08:00
Eli Britstein	a0646c88ed	net/mlx5e: Fail attempt to offload e-switch TC flows with egress upper devices We use the switchdev parent HW id helper to identify if the mirred device shares the same ASIC/port with the ingress device. This can get us wrong in the presence of upper devices such as vlan or bridge set over the HW devices (VF or uplink representors), b/c the switchdev ID is retrieved recursively. To fail offload attempts in such cases, we condition the check on the egress device to have not only the same switchdev ID but also the relevant mlx5 netdev ops. Fixes: `03a9d11e6e` ('net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads') Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:01 -08:00
Or Gerlitz	1ee4457c5c	net/mlx5e: Allow vlans on e-switch uplink reps There are cases (e.g tunneling with vlan on underlay and potentially more) where this makes sense, so allow that. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:01 -08:00
Gavi Teitz	4c8fb2986d	net/mlx5e: Increase VF representors' SQ size to 128 The default size for the VF representors' SQ was too small to handle high packet rates. Doubling the size from 64 to 128 drastically improves the packet rate under stress (by about 50%), whereas increasing the size beyond 128 has not shown to make any further difference. The impact of the SQ size was measured with UDP traffic, in the following topology: TG <-> PF <-> TC forwarding <-> VF representor <-> VF in VM over a single core processing bi-directional traffic, with the following results: SQ size of 64: SQ size of 128: Packet rate for 64B UDP packets: 860 [Kpps] 1280 [Kpps] Packet rate for 114B VxLan encapsulated UDP packets: 320 [Kpps] 500 [Kpps] Signed-off-by: Gavi Teitz <gavi@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:00 -08:00
Miroslav Lichvar	4a0475d57a	mlx5: extend PTP gettime function to read system clock Read the system time right before and immediately after reading the low register of the internal timer. This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:00 -08:00
Miroslav Lichvar	5d8678365c	mlx5: update timecounter at least twice per counter overflow The timecounter needs to be updated at least once in half of the cyclecounter interval to prevent timecounter_cyc2time() interpreting a new timestamp as an old value and causing a backward jump. This would be an issue if the timecounter multiplier was so small that the update interval would not be limited by the 64-bit overflow in multiplication. Shorten the calculated interval to make sure the timecounter is updated in time even when the system clock is slowed down by up to 10%, the multiplier is increased by up to 10%, and the scheduled overflow check is late by 15%. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Ariel Levkovich <lariel@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:00 -08:00
Peng Li	1154bb26c8	net: hns3: remove redundant variable initialization This patch removes the redundant variable initialization, as driver will devm_kzalloc to set value to hdev soon. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 23:47:59 -08:00
Peng Li	31a16f99e0	net: hns3: fix the descriptor index when get rss type Driver gets rss information from the last descriptor of the packet. When driver handle the rss type, ring->next_to_clean indicates the first descriptor of next packet. This patch fix the descriptor index with "ring->next_to_clean - 1". Fixes: `232fc64b6e` ("net: hns3: Add HW RSS hash information to RX skb") Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 23:47:59 -08:00
Jian Shen	8edc2285b7	net: hns3: don't restore rules when flow director is disabled When user disables flow director, all the rules will be disabled. But when reset happens, it will restore all the rules again. It's not reasonable. This patch fixes it by add flow director status check before restore fules. Fixes: `6871af29b3` ("net: hns3: Add reset handle for flow director") Fixes: `c17852a893` ("net: hns3: Add support for enable/disable flow director") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 23:47:59 -08:00
Jian Shen	0285dbae5d	net: hns3: fix vf id check issue when add flow director rule When add flow director fule for vf, the vf id is used as array subscript before valid checking, which may cause memory overflow. Fixes: `dd74f815dd` ("net: hns3: Add support for rule add/delete for flow director") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 23:47:58 -08:00
Huazhong Tan	39cfbc9c4f	net: hns3: reset tqp while doing DOWN operation While doing DOWN operation, the driver will reclaim the memory which has already used for TX. If the hardware is processing this memory, it will cause a RCB error to the hardware. According the hardware's description, the driver should reset the tqp before reclaim the memory during DOWN. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 23:47:58 -08:00
Jian Shen	75edb61086	net: hns3: add max vector number check for pf Each pf supports max 64 vectors and 128 tqps. For 2p/4p core scenario, there may be more than 64 cpus online. So the result of min_t(u16, num_Online_cpus(), tqp_num) may be more than 64. This patch adds check for the vector number. Fixes: `dd38c72604` ("net: hns3: fix for coalesce configuration lost during reset") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 23:47:58 -08:00
Peng Li	1b7d7b0581	net: hns3: fix a bug caused by udelay udelay() in driver may always occupancy processor. If there is only one cpu in system, the VF driver may initialize fail when insmod PF and VF driver in the same system. This patch use msleep() to free cpu when VF wait PF message. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 23:47:58 -08:00
Jian Shen	a298797532	net: hns3: change default tc state to close In original codes, default tc value is set to the max tc. It's more reasonable to close tc by changing default tc value to 1. Users can enable it with lldp tool when they want to use tc. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 23:47:58 -08:00
Jian Shen	8cdb992f0d	net: hns3: refine the handle for hns3_nic_net_open/stop() When triggering nic down, there is a time window between bringing down the protocol stack and stopping the work task. If the net is up in the time window, it may bring up the protocol stack again. This patch fixes it by stop the work task at the beginning of hns3_nic_net_stop(). To keep symmetrical, start the work task at the end of hns3_nic_net_open(). Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 23:47:58 -08:00
Antoine Tenart	1b451fb205	net: mvpp2: fix the phylink mode validation The mvpp2_phylink_validate() sets all modes that are supported by a given PPv2 port. An mistake made the 10000baseT_Full mode being advertised in some cases when a port wasn't configured to perform at 10G. This patch fixes this. Fixes: `d97c9f4ab0` ("net: mvpp2: 1000baseX support") Reported-by: Russell King <linux@armlinux.org.uk> Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 16:38:35 -08:00
Biao Huang	22a3a5403b	net-next: stmmac: dwmac-mediatek: remove fine-tune property 1. remove fine-tune property and related setting to simplify the timing adjustment flow. 2. set timing value according to the value from device tree, and will not care whether PHY insert internal delay. Signed-off-by: Biao Huang <biao.huang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 16:24:58 -08:00
Bryan Whitehead	e0e587878f	lan743x: Remove MAC Reset from initialization The MAC Reset was noticed to erase important EEPROM settings. It is also unnecessary since a chip wide reset was done earlier in initialization, and that reset preserves EEPROM settings. There for this patch removes the unnecessary MAC specific reset. Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 15:48:11 -08:00
Alaa Hleihel	4765420439	net/mlx5e: Remove the false indication of software timestamping support mlx5 driver falsely advertises support of software timestamping. Fix it by removing the false indication. Fixes: `ef9814deaf` ("net/mlx5e: Add HW timestamping (TS) support") Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-19 13:31:16 -08:00
Yuval Avnery	f033788914	net/mlx5: Typo fix in del_sw_hw_rule Expression terminated with "," instead of ";", resulted in set_fte getting bad value for modify_enable_mask field. Fixes: `bd5251dbf1` ("net/mlx5_core: Introduce flow steering destination of type counter") Signed-off-by: Yuval Avnery <yuvalav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-19 13:31:16 -08:00
Tariq Toukan	bfc698254b	net/mlx5e: RX, Fix wrong early return in receive queue poll When the completion queue of the RQ is empty, do not immediately return. If left-over decompressed CQEs (from the previous cycle) were processed, need to go to the finalization part of the poll function. Bug exists only when CQE compression is turned ON. This solves the following issue: mlx5_core 0000:82:00.1: mlx5_eq_int:544:(pid 0): CQ error on CQN 0xc08, syndrome 0x1 mlx5_core 0000:82:00.1 p4p2: mlx5e_cq_error_event: cqn=0x000c08 event=0x04 Fixes: `4b7dfc9925` ("net/mlx5e: Early-return on empty completion queues") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-19 13:31:16 -08:00
Ido Schimmel	b61cd7c6f9	mlxsw: spectrum_router: Hold a reference on RIF's netdev Previous patches tried to make RIF deletion more robust and avoid use-after-free situations. As another precaution, hold a reference on a RIF's netdev and release it when the RIF is deleted. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 12:28:07 -08:00
Ido Schimmel	965fa8e600	mlxsw: spectrum_router: Make RIF deletion more robust In the past we had multiple instances where RIFs were not properly deleted. One of the reasons for leaking a RIF was that at the time when IP addresses were flushed from the respective netdev (prompting the destruction of the RIF), the netdev was no longer a mlxsw upper. This caused the inet{,6}addr notification blocks to ignore the NETDEV_DOWN event and leak the RIF. Instead of checking whether the netdev is our upper when an IP address is removed, we can instead check if the netdev has a RIF configured. To look up a RIF we need to access mlxsw private data, so the patch stores the notification blocks inside a mlxsw struct. This then allows us to use container_of() and extract the required private data. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 12:28:07 -08:00
Ido Schimmel	21ffedb6db	mlxsw: spectrum_router: Propagate 'struct mlxsw_sp' further Next patch is going to make RIF deletion more robust by removing reliance on fragile mlxsw_sp_lower_get(). This is because a netdev is not necessarily our upper anymore when its IP addresses are flushed. The inet{,6}addr notification blocks are going to resolve 'struct mlxsw_sp' using container_of(), but the functions they call still use mlxsw_sp_lower_get(). As a preparation for the next patch, propagate 'struct mlxsw_sp' down to the functions called from the notification blocks and remove reliance on mlxsw_sp_lower_get(). Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 12:28:07 -08:00
Ido Schimmel	be2d6f421f	mlxsw: spectrum: Properly cleanup LAG uppers when removing port from LAG When a LAG device or a VLAN device on top of it is enslaved to a bridge, the driver propagates the CHANGEUPPER event to the LAG's slaves. This causes each physical port to increase the reference count of the internal representation of the bridge port by calling mlxsw_sp_port_bridge_join(). However, when a port is removed from a LAG, the corresponding leave() function is not called and the reference count is not decremented. This leads to ugly hacks such as mlxsw_sp_bridge_port_should_destroy() that try to understand if the bridge port should be destroyed even when its reference count is not 0. Instead, make sure that when a port is unlinked from a LAG it would see the same events as if the LAG (or its uppers) were unlinked from a bridge. The above is achieved by walking the LAG's uppers when a port is unlinked and calling mlxsw_sp_port_bridge_leave() for each upper that is enslaved to a bridge. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 12:28:07 -08:00
Ido Schimmel	635c8c8bba	mlxsw: spectrum: Remove reference count from VLAN entries Commit `b3529af6bb` ("spectrum: Reference count VLAN entries") started reference counting port-VLAN entries in a similar fashion to the 8021q driver. However, this is not actually needed and only complicates things. Instead, the driver should forbid the creation of a VLAN on a port if this VLAN already exists. This would also solve the issue fixed by the mentioned commit. Therefore, remove the get()/put() API and use create()/destroy() instead. One place that needs special attention is VLAN addition in a VLAN-aware bridge via switchdev operations. In case the VLAN flags (e.g., 'pvid') are toggled, then the VLAN entry already exists. To prevent the driver from wrongly returning EEXIST, the driver is changed to check in the prepare phase whether the entry already exists and only returns an error in case it is not associated with the correct bridge port. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 12:28:07 -08:00
Ido Schimmel	e149113a74	mlxsw: spectrum: Handle VLAN device unlinking In commit `993107fea5` ("mlxsw: spectrum_switchdev: Fix VLAN device deletion via ioctl") I fixed a bug caused by the fact that the driver views differently the deletion of a VLAN device when it is deleted via an ioctl and netlink. Instead of relying on a specific order of events (device being unregistered vs. VLAN filter being updated), simply make sure that the driver performs the necessary cleanup when the VLAN device is unlinked, which always happens before the other two events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 12:28:07 -08:00
Ido Schimmel	f1d7c33d6a	mlxsw: spectrum_fid: Remove unused function This function is no longer used. Remove it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 12:28:07 -08:00
Ido Schimmel	32fd4b49a3	mlxsw: spectrum_router: Do not destroy RIFs based on FID's reference count Currently, when a RIF is constructed on top of a FID, the RIF increments the FID's reference count and the RIF is destroyed when the FID's reference count drops to 1. This effectively means that when no local ports are member in the FID, the FID is destroyed regardless if the router port is a member in the FID or not. The above can lead to the unexpected behavior in which routes using a VLAN interface as their nexthop device are no longer offloaded after the last local port leaves the corresponding VLAN (FID). Example: # ip -4 route show dev br0.10 192.0.2.0/24 proto kernel scope link src 192.0.2.1 offload # bridge vlan del vid 10 dev swp3 # ip -4 route show dev br0.10 192.0.2.0/24 proto kernel scope link src 192.0.2.1 After the patch, the route is offloaded before and after the VLAN is removed from local port 'swp3', as the RIF corresponding to 'br0.10' continues to exists. In order to remove RIFs' reliance on the underlying FID's reference count, we need to add a reference count to sub-port RIFs, which are RIFs that correspond to physical ports and their uppers (e.g., LAG devices). In this case, each {Port, VID} ('struct mlxsw_sp_port_vlan') needs to hold a reference on the RIF. For example: bond0.10 \| bond0 \| +-------+ \| \| swp1 swp2 Both {Port 1, VID 10} and {Port 2, VID 10} will hold a reference on the RIF corresponding to 'bond0.10'. When the last reference is dropped, the RIF will be destroyed. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 12:28:07 -08:00
Ido Schimmel	927d0ef10a	mlxsw: spectrum: Sanitize VLAN interface's uppers Currently, only VRF and macvlan uppers are supported on top of VLAN device configured over a bridge, so make sure the driver forbids other uppers. Note that enslavement to a VRF is handled earlier in the notification block, so there is no need to check for a VRF upper here. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 12:28:07 -08:00
Michael Chan	84404d5fd5	bnxt_en: Fix ethtool self-test loopback. The current code has 2 problems. It assumes that the RX ring for the loopback packet is combined with the TX ring. This is not true if the ethtool channels are set to non-combined mode. The second problem is that it won't work on 57500 chips without adjusting the logic to get the proper completion ring (cpr) pointer. Fix both issues by locating the proper cpr pointer through the RX ring. Fixes: `e44758b78a` ("bnxt_en: Use bnxt_cp_ring_info struct pointer as parameter for RX path.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 11:31:09 -08:00
Florian Westphal	a84e3f5333	xfrm: prefer secpath_set over secpath_dup secpath_set is a wrapper for secpath_dup that will not perform an allocation if the secpath attached to the skb has a reference count of one, i.e., it doesn't need to be COW'ed. Also, secpath_dup doesn't attach the secpath to the skb, it leaves this to the caller. Use secpath_set in places that immediately assign the return value to skb. This allows to remove skb->sp without touching these spots again. secpath_dup can eventually be removed in followup patch. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 11:21:38 -08:00
Florian Westphal	6362a6a040	drivers: net: ethernet: mellanox: use skb_sec_path helper Will avoid touching this when sp pointer is removed from sk_buff struct. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 11:21:37 -08:00
Florian Westphal	2fdb435bc0	drivers: net: intel: use secpath helpers in more places Use skb_sec_path and secpath_exists helpers where possible. This reduces noise in followup patch that removes skb->sp pointer. v2: no changes, preseve acks from v1. Acked-by: Shannon Nelson <shannon.lee.nelson@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 11:21:37 -08:00
Ioana Radulescu	610febc68a	dpaa2-eth: Add QBMAN related stats Add statistics for pending frames in Rx/Tx conf FQs and number of buffers in pool. Available through ethtool -S. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: Ioana ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-19 10:37:22 -08:00
Heiner Kallweit	33f18c96af	net: ethernet: don't set phylib state CHANGELINK in drivers After phy_start() phylib takes care of all needed actions, including aneg settings and checking link state. There's no need to set state PHY_CHANGELINK in drivers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 22:07:20 -08:00
Colin Ian King	f7db2beb4c	vxge: ensure data0 is initialized in when fetching firmware version information Currently variable data0 is not being initialized so a garbage value is being passed to vxge_hw_vpath_fw_api and this value is being written to the rts_access_steer_data0 register. There are other occurrances where data0 is being initialized to zero (e.g. in function vxge_hw_upgrade_read_version) so I think it makes sense to ensure data0 is initialized likewise to 0. Detected by CoverityScan, CID#140696 ("Uninitialized scalar variable") Fixes: `8424e00dfd` ("vxge: serialize access to steering control register") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 22:00:40 -08:00
Bryan Whitehead	0db7d253e9	lan743x: Expand phy search for LAN7431 The LAN7431 uses an external phy, and it can be found anywhere in the phy address space. This patch uses phy address 1 for LAN7430 only. And searches all addresses otherwise. Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 21:35:56 -08:00
David S. Miller	6c86bc2342	mlx5-uplink-rep-2018-12-15 Or Gerlitz says: This series is essentially a cleanup to align with the rest of the NIC switchdev drivers and make us more robust and clear/n: currently the PF netdev serves as the mlx5 e-switch uplink netdev representor when going into switchdev mode and back as plain NIC netdev when going out. This causes some irregularities and misc troubles. Move to use dedicated uplink rep, as we have for the VF vports. The uplink rep netdev does has sysfs link and supports the sriov vf mac ndo, these two are in use by libvirt and other orchestrators, It also has richer ethtool support to allow controlling the port link & mtu along with supporting dcb and plugging into the mlx5 lag logic. -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJcF/MBAAoJEEg/ir3gV/o+apkIAKVeB/ha8qiY+9ntA9B8HVxO cYb+fv4+DuYjTg/nN5QdYSAisN4SwFDA5ukz/ckr6XwV65EgTjAtlon/fHg/rDyz YT3dJRP9yjNXKxRvxa0K4PmTasmXGXzaJzsXYemHsfCDdnkNknZsqA+yoNe+2c40 HoNsdy/Pvr18PjzYtpX0BiuSjH623daBmdFH0o+rwuwsApkAUlBJfosTg/7xyU7v TAjy8S1HCloJ7OAjOlzXVmYr65sAlQoFr860vLDQ7Og7JemxjSXQ+AdNb6mSAvZ5 LBEvCA3G5tb4OkFX+1i/alxOcuG8/FoO+O3ZQWve1DNxoS76po851bgHXIdZCPg= =pqTq -----END PGP SIGNATURE----- Merge tag 'mlx5-uplink-rep-2018-12-15' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed: ==================== mlx5-uplink-rep-2018-12-15 Or Gerlitz says: This series is essentially a cleanup to align with the rest of the NIC switchdev drivers and make us more robust and clear/n: currently the PF netdev serves as the mlx5 e-switch uplink netdev representor when going into switchdev mode and back as plain NIC netdev when going out. This causes some irregularities and misc troubles. Move to use dedicated uplink rep, as we have for the VF vports. The uplink rep netdev does has sysfs link and supports the sriov vf mac ndo, these two are in use by libvirt and other orchestrators, It also has richer ethtool support to allow controlling the port link & mtu along with supporting dcb and plugging into the mlx5 lag logic. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 16:44:45 -08:00
Anssi Hannula	6e0af29806	net: macb: add missing barriers when reading descriptors When reading buffer descriptors on RX or on TX completion, an RX_USED/TX_USED bit is checked first to ensure that the descriptors have been populated, i.e. the ownership has been transferred. However, there are no memory barriers to ensure that the data protected by the RX_USED/TX_USED bit is up-to-date with respect to that bit. Specifically: - TX timestamp descriptors may be loaded before ctrl is loaded for the TX_USED check, which is racy as the descriptors may be updated between the loads, causing old timestamp descriptor data to be used. - RX ctrl may be loaded before addr is loaded for the RX_USED check, which is racy as a new frame may be written between the loads, causing old ctrl descriptor data to be used. This issue exists for both macb_rx() and gem_rx() variants. Fix the races by adding DMA read memory barriers on those paths and reordering the reads in macb_rx(). I have not observed any actual problems in practice caused by these being missing, though. Tested on a ZynqMP based system. Fixes: `89e5785fc8` ("[PATCH] Atmel MACB ethernet driver") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 16:17:48 -08:00
Anssi Hannula	8159ecab0d	net: macb: fix dropped RX frames due to a race Bit RX_USED set to 0 in the address field allows the controller to write data to the receive buffer descriptor. The driver does not ensure the ctrl field is ready (cleared) when the controller sees the RX_USED=0 written by the driver. The ctrl field might only be cleared after the controller has already updated it according to a newly received frame, causing the frame to be discarded in gem_rx() due to unexpected ctrl field contents. A message is logged when the above scenario occurs: macb ff0b0000.ethernet eth0: not whole frame pointed by descriptor Fix the issue by ensuring that when the controller sees RX_USED=0 the ctrl field is already cleared. This issue was observed on a ZynqMP based system. Fixes: `4df95131ea` ("net/macb: change RX path for GEM") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 16:17:48 -08:00
Anssi Hannula	e100a897bf	net: macb: fix random memory corruption on RX with 64-bit DMA 64-bit DMA addresses are split in upper and lower halves that are written in separate fields on GEM. For RX, bit 0 of the address is used as the ownership bit (RX_USED). When the RX_USED bit is unset the controller is allowed to write data to the buffer. The driver does not guarantee that the controller already sees the upper half when the RX_USED bit is cleared, possibly resulting in the controller writing an incoming frame to an address with an incorrect upper half and therefore possibly corrupting unrelated system memory. Fix that by adding the necessary DMA memory barrier between the writes. This corruption was observed on a ZynqMP based system. Fixes: `fff8019a08` ("net: macb: Add 64 bit addressing support for GEM") Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Acked-by: Harini Katakam <harini.katakam@xilinx.com> Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Michal Simek <michal.simek@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 16:17:48 -08:00
Claudiu Beznea	4298388574	net: macb: restart tx after tx used bit read On some platforms (currently detected only on SAMA5D4) TX might stuck even the pachets are still present in DMA memories and TX start was issued for them. This happens due to race condition between MACB driver updating next TX buffer descriptor to be used and IP reading the same descriptor. In such a case, the "TX USED BIT READ" interrupt is asserted. GEM/MACB user guide specifies that if a "TX USED BIT READ" interrupt is asserted TX must be restarted. Restart TX if used bit is read and packets are present in software TX queue. Packets are removed from software TX queue if TX was successful for them (see macb_tx_interrupt()). Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 15:57:07 -08:00
YueHaibing	8937388acb	qlcnic: remove set but not used variables 'op, cmd_op' Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1070:5: warning: variable 'op' set but not used [-Wunused-but-set-variable] drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1342:5: warning: variable 'cmd_op' set but not used [-Wunused-but-set-variable] 'op' never used since introduction in commit `7cb03b2347` ("qlcnic: Support VF-PF communication channel commands.") 'cmd_op' not used since commit `6226204bcf` ("qlcnic: Fix operation type and command type.") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 15:49:46 -08:00
Dan Carpenter	b26322d2ac	net: stmmac: Fix an error code in probe() The function should return an error if create_singlethread_workqueue() fails. Fixes: `34877a15f7` ("net: stmmac: Rework and fix TX Timeout code") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 15:45:28 -08:00
Dan Carpenter	f07d427689	qed: Fix an error code qed_ll2_start_xmit() We accidentally deleted the code to set "rc = -ENOMEM;" and this patch adds it back. Fixes: `d2201a2159` ("qed: No need for LL2 frags indication") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 15:43:44 -08:00
Heiner Kallweit	2429f13870	net: fec: remove workaround to restart phylib state machine on MDIO timeout There's a workaround to restart the phylib state machine in case of a MDIO access timeout. Seems it was introduced to deal with the consequences of a too small MDIO timeout. See also commit message of `c3b084c24c` ("net: fec: Adjust ENET MDIO timeouts") which increased the timeout value later. Due to the later timeout value fix it seems to be safe to remove the workaround. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 15:01:55 -08:00
Antoine Tenart	0067917720	net: mvpp2: 10G modes aren't supported on all ports The mvpp2_phylink_validate() function sets all modes that are supported by a given PPv2 port. A recent change made all ports to advertise they support 10G modes in certain cases. This is not true, as only the port #0 can do so. This patch fixes it. Fixes: `01b3fd5ac9` ("net: mvpp2: fix detection of 10G SFP modules") Cc: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 14:48:15 -08:00
Yunsheng Lin	af854724e5	net: hns3: fix a SSU buffer checking bug When caculating the SSU buffer, it first allocate tx and rx private buffer, then the remaining buffer is for rx shared buffer. The remaining buffer size should be at least bigger than or equal to the shared_std, which is the minimum shared buffer size required by the driver, but currently if the remaining buffer size is equal to the shared_std, it returns failure, which causes SSU buffer allocation failure problem. This patch fixes this problem by rounding up shared_std before checking the the remaining buffer size bigger than or equal to the shared_std. Fixes: `46a3df9f97` ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 12:01:01 -08:00
Yunsheng Lin	b9a400ac29	net: hns3: aligning buffer size in SSU to 256 bytes The hardware expects the buffer size set to SSU is aligned to 256 bytes, this patch aligns the buffer size to 256 byte using roundup or rounddown function. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 12:01:01 -08:00
Yunsheng Lin	368686be23	net: hns3: getting tx and dv buffer size through firmware This patch adds support of getting tx and dv buffer size through firmware, because different version of hardware requires different size of tx and dv buffer. This patch also add dv_buf_size to tc' private buffer size even if pfc is not enable for the tc. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 12:01:01 -08:00
Peng Li	0ad5ea5dbd	net: hns3: synchronize speed and duplex from phy when phy link up Driver calls phy_connect_direct and registers hclge_mac_adjust_link to synchronize mac speed and duplex from phy. It is better to synchronize mac speed and duplex from phy when phy link up. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 12:01:01 -08:00
Fuyun Liang	8362089d78	net: hns3: remove 1000M/half support of phy Our phy does not support 1000M/half, this patch removes 1000M/half from PHY_SUPPORTED_FEATURES. Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 12:01:01 -08:00
Peng Li	7445565cd0	net: hns3: update coalesce param per second coalesce param updates every 100 napi times, it may update a little late if ping test after a high rate flow, may over napi poll is called 100 times as ping test sends packets every second. This patch updates coalesce param every second, instead with every 100 napi times. It can not update the param 100% in time, but the lag time is very short. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 12:01:01 -08:00
Huazhong Tan	ae6017a711	net: hns3: fix incomplete uninitialization of IRQ in the hns3_nic_uninit_vector_data() In the hns3_nic_uninit_vector_data(), the procedure of uninitializing the tqp_vector's IRQ has not set affinity_notify to NULL and changes its init flag. This patch fixes it. And for simplificaton, local variable tqp_vector is used instead of priv->tqp_vector[i]. Fixes: `424eb834a9` ("net: hns3: Unified HNS3 {VF\|PF} Ethernet Driver for hip08 SoC") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 12:01:01 -08:00
Huazhong Tan	b51c366df7	net: hns3: remove unnecessary configuration recapture while resetting When doing reset, it is unnecessary to get the hardware's default configuration again, otherwise, the user's configuration will be overwritten. Fixes: `4ed340ab8f` ("net: hns3: Add reset process in hclge_main") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 12:01:01 -08:00
Huazhong Tan	b644a8d4cb	net: hns3: update some variables while hclge_reset()/hclgevf_reset() done When hclge_reset() completes successfully, it should update the last_reset_time, set reset_fail_cnt to 0, and set reset_type of hnae3_ae_dev to HNAE3_NONE_RESET. Also when hclgevf_reset() completes successfully, it should update the last_reset_time, and set reset_type of hnae3_ae_dev to HNAE3_NONE_RESET. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 12:01:01 -08:00
Huazhong Tan	531eba0fe2	net: hns3: fix napi_disable not return problem While doing DOWN, the calling of napi_disable() may not return, since the napi_complete() in the hns3_nic_common_poll() will never be called when HNS3_NIC_STATE_DOWN is set. So we need to call napi_complete() before checking HNS3_NIC_STETE_DOWN. Fixes: `ff0699e04b` ("net: hns3: stop napi polling when HNS3_NIC_STATE_DOWN is set") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 12:01:01 -08:00
Huazhong Tan	e3338205f0	net: hns3: uninitialize pci in the hclgevf_uninit In the hclgevf_pci_reset(), it only uninitialize and initialize the msi, so if the initialization fails, hclgevf_uninit_hdev() does not need to uninitialize the msi, but needs to uninitialize the pci, otherwise it will cause pci resource not free. Fixes: `862d969a3a` ("net: hns3: do VF's pci re-initialization while PF doing FLR") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 12:01:01 -08:00
Huazhong Tan	cda69d2445	net: hns3: fix error handling int the hns3_get_vector_ring_chain When hns3_get_vector_ring_chain() failed in the hns3_nic_init_vector_data(), it should do the error handling instead of return directly. Also, cur_chain should be freed instead of chain and head->next should be set to NULL in error handling of hns3_get_vector_ring_chain. This patch fixes them. Fixes: `73b907a083` ("net: hns3: bugfix for buffer not free problem during resetting") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 12:01:00 -08:00
Ido Schimmel	5edb7e8bd5	mlxsw: spectrum_nve: Fix memory leak upon driver reload The pointer was NULLed before freeing the memory, resulting in a memory leak. Trace from kmemleak: unreferenced object 0xffff88820ae36528 (size 512): comm "devlink", pid 5374, jiffies 4295354033 (age 10829.296s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000a43f5195>] kmem_cache_alloc_trace+0x1be/0x330 [<00000000312f8140>] mlxsw_sp_nve_init+0xcb/0x1ae0 [<0000000009201d22>] mlxsw_sp_init+0x1382/0x2690 [<000000007227d877>] mlxsw_sp1_init+0x1b5/0x260 [<000000004a16feec>] __mlxsw_core_bus_device_register+0x776/0x1360 [<0000000070ab954c>] mlxsw_devlink_core_bus_device_reload+0x129/0x220 [<00000000432313d5>] devlink_nl_cmd_reload+0x119/0x1e0 [<000000003821a06b>] genl_family_rcv_msg+0x813/0x1150 [<00000000d54d04c0>] genl_rcv_msg+0xd1/0x180 [<0000000040543d12>] netlink_rcv_skb+0x152/0x3c0 [<00000000efc4eae8>] genl_rcv+0x2d/0x40 [<00000000ea645603>] netlink_unicast+0x52f/0x740 [<00000000641fca1a>] netlink_sendmsg+0x9c7/0xf50 [<00000000fed4a4b8>] sock_sendmsg+0xbe/0x120 [<00000000d85795a9>] __sys_sendto+0x397/0x620 [<00000000c5f84622>] __x64_sys_sendto+0xe6/0x1a0 Fixes: `6e6030bd54` ("mlxsw: spectrum_nve: Implement common NVE core") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 09:17:39 -08:00
Ido Schimmel	5d5043917a	mlxsw: spectrum: Add trap for decapsulated ARP packets After a packet was decapsulated it is classified to the relevant FID based on its VNI and undergoes L2 forwarding. Unlike regular (non-encapsulated) ARP packets, Spectrum does not trap decapsulated ARP packets during L2 forwarding and instead can only trap such packets in the underlay router during decapsulation. Add this missing packet trap, which is required for VXLAN routing when the MAC of the target host is not known. Fixes: `b02597d513` ("mlxsw: spectrum: Add NVE packet traps") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 09:17:38 -08:00
Shalom Toledo	cf0b70e71b	mlxsw: core: Increase timeout during firmware flash process During the firmware flash process, some of the EMADs get timed out, which causes the driver to send them again with a limit of 5 retries. There are some situations in which 5 retries is not enough and the EMAD access fails. If the failed EMAD was related to the flashing process, the driver fails the flashing. The reason for these timeouts during firmware flashing is cache misses in the CPU running the firmware. In case the CPU needs to fetch instructions from the flash when a firmware is flashed, it needs to wait for the flashing to complete. Since flashing takes time, it is possible for pending EMADs to timeout. Fix by increasing EMADs' timeout while flashing firmware. Fixes: `ce6ef68f43` ("mlxsw: spectrum: Implement the ethtool flash_device callback") Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-18 09:17:38 -08:00
John Hurley	b12c97d45c	nfp: flower: fix cb_ident duplicate in indirect block register Previously the identifier used for indirect block callback registry and for block rule cb registry (when done via indirect blocks) was the pointer to the netdev we were interested in receiving updates on. This worked fine if a single app existed that registered one callback per netdev of interest. However, if multiple cards are in place and, in turn, multiple apps, then each app may register the same callback with the same identifier to both the netdev's indirect block cb list and to a block's cb list. This can lead to EEXIST errors and/or incorrect cb deletions. Prevent this conflict by using the app pointer as the identifier for netdev indirect block cb registry, allowing each app to register a unique callback per netdev. For block cb registry, the same app may register multiple cbs to the same block if using TC shared blocks. Instead of the app, use the pointer to the allocated cb_priv data as the identifier here. This means that there can be a unique block callback for each app/netdev combo. Fixes: `3166dd07a9` ("nfp: flower: offload tunnel decap rules via indirect TC blocks") Reported-by: Edward Cree <ecree@solarflare.com> Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:34:12 -08:00
Shalom Toledo	d1675a1602	mlxsw: spectrum: Update the supported firmware to version 13.1910.622 This new firmware contains: * New packet traps for discarded packets * Secure firmware flash bug fix * Fence mechanism bug fix * TCAM RMA bug fix Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:32:29 -08:00
Vasundhara Volam	56d3746247	bnxt_en: query force speeds before disabling autoneg mode. With autoneg enabled, PHY loopback test fails. To disable autoneg, driver needs to send a valid forced speed to FW. FW is not sending async event for invalid speeds. To fix this, query forced speeds and send the correct speed when disabling autoneg mode. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:53 -08:00
Michael Chan	fd3ab1c70e	bnxt_en: Do not free port statistics buffer when device is down. Port statistics which include RDMA counters are useful even when the netdevice is down. Do not free the port statistics DMA buffers when the netdevice is down. This is keep the snapshot of the port statistics and counters will just continue counting when the netdevice goes back up. Split the bnxt_free_stats() function into 2 functions. The port statistics buffers will only be freed when the netdevice is removed. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:53 -08:00
Michael Chan	b8875ca356	bnxt_en: Save ring statistics before reset. With the current driver, the statistics reported by .ndo_get_stats64() are reset when the device goes down. Store a snapshot of the rtnl_link_stats64 before shutdown. This snapshot is added to the current counters in .ndo_get_stats64() so that the counters will not get reset when the device is down. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:53 -08:00
Vasundhara Volam	7c675421af	bnxt_en: Return linux standard errors in bnxt_ethtool.c Currently firmware specific errors are returned directly in flash_device and reset ethtool hooks. Modify it to return linux standard errors to userspace when flashing operations fail. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:53 -08:00
Michael Chan	24654f095e	bnxt_en: Don't set ETS on unused TCs. Currently, the code allows ETS bandwidth weight 0 to be set on unused TCs. We should not set any DCB parameters on unused TCs at all. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:53 -08:00
Michael Chan	e37fed7903	bnxt_en: Add ethtool -S priority counters. Display the CoS counters as additional priority counters by looking up the priority to CoS queue mapping. If the TX extended port statistics block size returned by firmware is big enough to cover the CoS counters, then we will display the new priority counters. We call firmware to get the up-to-date pri2cos mapping to convert the CoS counters to priority counters. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:53 -08:00
Michael Chan	b16b689186	bnxt_en: Add SR-IOV support for 57500 chips. There are some minor differences when assigning VF resources on the new chips. The MSIX (NQ) resource has to be assigned and ring group is not needed on the new chips. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:53 -08:00
Michael Chan	36d65be9a8	bnxt_en: Disable MSIX before re-reserving NQs/CMPL rings. When bringing up a device, the code checks to see if the number of MSIX has changed. pci_disable_msix() should be called first before changing the number of reserved NQs/CMPL rings. This ensures that the MSIX vectors associated with the NQs/CMPL rings are still properly mapped when pci_disable_msix() masks the vectors. This patch will prevent errors when RDMA support is added for the new 57500 chips. When the RDMA driver shuts down, the number of NQs is decreased and we must use the new sequence to prevent MSIX errors. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:53 -08:00
Vasundhara Volam	780baad44f	bnxt_en: Reserve 1 stat_ctx for RDMA driver. bnxt_en requires same number of stat_ctxs as CP rings but RDMA requires only 1 stat_ctx. Also add a new parameter resv_stat_ctxs to better keep track of stat_ctxs reserved including resources used by RDMA. Add a stat_ctxs parameter to all the relevant resource reservation functions so we can reserve the correct number of stat_ctxs. Prior to this patch, we were not reserving the extra stat_ctx for RDMA and RDMA would not work on the new 57500 chips. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:53 -08:00
Vasundhara Volam	f4e896142d	bnxt_en: Do not modify max_stat_ctxs after RDMA driver requests/frees stat_ctxs Calling bnxt_set_max_func_stat_ctxs() to modify max stat_ctxs requested or freed by the RDMA driver is wrong. After introducing reservation of resources recently, the driver has to keep track of all stat_ctxs including the ones used by the RDMA driver. This will provide a better foundation for accurate accounting of the stat_ctxs. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:53 -08:00
Vasundhara Volam	c027c6b4e9	bnxt_en: get rid of num_stat_ctxs variable For bnxt_en driver, stat_ctxs created will always be same as cp_nr_rings. Remove extra variable that duplicates the value. Also introduce bnxt_get_avail_stat_ctxs_for_en() helper to get available stat_ctxs and bnxt_get_ulp_stat_ctxs() helper to return number of stat_ctxs used by RDMA. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:53 -08:00
Michael Chan	e916b0815a	bnxt_en: Add bnxt_get_avail_cp_rings_for_en() helper function. The available CP rings are calculated differently on the new 57500 chips, so add this helper to do this calculation correctly. The VFs will be assigned these available CP rings. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:53 -08:00
Michael Chan	f7588cd893	bnxt_en: Store the maximum NQs available on the PF. The PF has a pool of NQs and MSIX vectors assigned to it based on NVRAM configurations. The number of usable MSIX vectors on the PF is the minimum of the NQs and MSIX vectors. Any excess NQs without associated MSIX may be used for the VFs, so we need to store this max_nqs value. max_nqs minus the NQs used by the PF will be the available NQs for the VFs. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 23:08:52 -08:00
Leon Romanovsky	199fa087dc	net/mlx5: Continue driver initialization despite debugfs failure The failure to create debugfs entry is unpleasant event, but not enough to abort drier initialization. Align the mlx5_core code to debugfs design and continue execution whenever debugfs_create_dir() successes or not. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-17 14:35:25 -08:00
Joakim Tjernlund	a28777f250	ucc_geth: Add change_carrier() for Fixed PHYs This allows to control carrier from /sys/class/net/ethX/carrier for Fixed PHYs. Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 11:24:32 -08:00
Joakim Tjernlund	6211d46713	gianfar: Add change_carrier() for Fixed PHYs This allows to control carrier from /sys/class/net/ethX/carrier for Fixed PHYs. Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 11:24:32 -08:00
Joakim Tjernlund	6e8b0ff1ba	dpaa_eth: Add change_carrier() for Fixed PHYs This allows to control carrier from /sys/class/net/ethX/carrier for Fixed PHYs. Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 11:24:32 -08:00
Eric Dumazet	4beaacc6fe	net/mlx4_en: remove fallback after kzalloc_node() kzalloc_node(..., GFP_KERNEL, node) will attempt to allocate memory as close as possible to the node. There is no need to fallback to kzalloc() if this has failed. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-17 11:18:02 -08:00
Or Gerlitz	ff9b85de5d	net/mlx5e: Add some ethtool port control entries to the uplink rep netdev Some of the ethtool entries to control the port should be supported by the uplink rep netdev in switchdev mode, add them. While here, add also the get/set coalesce entries for all reps. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-17 11:03:29 -08:00
Or Gerlitz	371289b61a	net/mlx5e: Expose ethtool pause and link functions to mlx5e callers Towards supporting set/get of global pause for the port and get of the port link ksetting from the uplink representor, expose the relevant entries to other mlx5 callers. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-17 11:03:28 -08:00
Or Gerlitz	073caf5088	net/mlx5e: Add sriov and udp tunnel ndo support for the uplink rep Some of the sriov ndo calls are needed also on the switchdev mode - e.g setup VF mac and reading vport stats. Add them to the uplink rep netdev ops. Same for the UDP tunnel ones, need them there to identify offloaded udp tunnel ports. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-17 11:03:28 -08:00
Or Gerlitz	b36cdb42ad	net/mlx5e: Handle port mtu/link, dcb and lag for uplink reps Take care of setup/teardown for the port link, dcb, lag as well as dealing with port mtu and carrier for e-switch uplink representors. This is achieved by adding a dedicated profile instance for uplink representors which includes the enable/disable and more profile routines which are invoked by the general mlx5e code for netdev attach/detach. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-17 11:03:28 -08:00
Or Gerlitz	aec002f6f8	net/mlx5e: Uninstantiate esw manager vport netdev on switchdev mode Now, when we have a dedicated uplink representor, the netdev instance set over the esw manager vport (PF) is of no-use. As such, remove it once we're on switchdev mode and get it back to life when off switchdev. This is done by reloading the Ethernet interface as well (we already do that for the IB interface) from the eswitch code while going in/out of switchdev mode. The Eth add/remove entries are modified to act differently when called in switchdev mode. In this case we only deal with registration of the eth vport representors. The rep netdevices are created from the eswitch call to load the registered eth representors. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-17 11:03:27 -08:00
Or Gerlitz	13e509a4c1	net/mlx5e: Remove leftover code from the PF netdev being uplink rep Remove some last leftovers from using the PF netdev as the e-switch uplink representor. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-17 11:03:27 -08:00
Or Gerlitz	d9ee0491c2	net/mlx5e: Use dedicated uplink vport netdev representor Currently, when running in sriov switchdev mode, we are using the PF netdevice as the uplink representor, this is problematic from few aspects: - will break when the PF isn't eswitch manager (e.g smart NIC env) - misalignment with other NIC switchdev drivers - makes us have and maintain special code, hurts the driver quality/robustness - which in turn opens the door for future bugs As of each and all of the above, we move to have a dedicated netdev representor for the uplink vport in a similar manner done for for the VF vports. This includes the following: 1. have an uplink rep netdev as we have for VF reps 2. all reps use same load/unload functions 3. HW stats for uplink based on physical port counters and not vport counters 4. link state for the uplink managed through PAOS and not vport state 5. the uplink rep has sysfs link to the PF PCI function && uses the PF MAC address Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-17 11:03:27 -08:00
Or Gerlitz	025380b20d	net/mlx5e: Use single argument for the esw representor build params helper This is prep step towards adding dedicated uplink representor. The patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-17 11:03:26 -08:00
Or Gerlitz	915fe1a0d9	net/mlx5: E-Switch, Remove redundant reloading of the IB interface The reload of the IB interface done on the offloads stop call is redundant b/c we do that on mlx5_eswitch_disable_sriov(), remove it. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-17 11:03:26 -08:00
Nir Dotan	03ce5bd187	mlxsw: reg: Activate Bloom filter Now that mlxsw driver handles all aspects of updating the Bloom filter mechanism, set bf_bypass value to false and allow HW to use Bloom filter. Signed-off-by: Nir Dotan <nird@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 15:20:34 -08:00
Nir Dotan	dd97d85f1e	mlxsw: spectrum_acl: Set master RP index on transition to eRP Bloom filter is updated on transitions from a single rule pattern, also called master RP, to eRP table and vice versa. Since rules are being written to or deleted from the Bloom filter on such transitions, it is not required to keep the same eRP bank ID for the master RP. Change master RP index assignment so it will be assigned with zero. This is consistent with the assignment of the first available spot that is used for allocating eRP's indices. Signed-off-by: Nir Dotan <nird@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 15:20:34 -08:00
Nir Dotan	135fd95728	mlxsw: spectrum_acl: Update Bloom filter on eRP transitions Bloom filter update is required only for rules which reside on an eRP. When the region has only a single rule pattern then eRP table is not used, however insertion of another pattern would trigger a move to an active eRP table so it is imperative to update the Bloom filter with all previously configured rules. Add a method that updates Bloom filter entries for all rules currently configured in the region, on the event of a transition from master mask to eRP, or vice versa. For that purpose, maintain a list of all A-TCAM rules within mlxsw_sp_acl_atcam_region. Signed-off-by: Nir Dotan <nird@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 15:20:34 -08:00
Nir Dotan	8c81b7438b	mlxsw: spectrum_acl: Set A-TCAM rules in Bloom filter Add calls to eRP module for updating Bloom filter when a rule is added or removed from the A-TCAM. eRP module will update the Bloom filter only for cases in which the region has an active eRP table. Signed-off-by: Nir Dotan <nird@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 15:20:34 -08:00
Nir Dotan	f5a2852ed0	mlxsw: spectrum_acl: Add Bloom filter update Add Bloom filter update for rule insertion and rule removal scenarios. This is done within eRP module in order to assure that Bloom filter updates are done only for rules which are part of an eRP, as HW does not consult Bloom filter for entries when there is a single (master) mask in the region. Signed-off-by: Nir Dotan <nird@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 15:20:34 -08:00
Nir Dotan	7585cacdb9	mlxsw: spectrum_acl: Add Bloom filter handling Spectrum-2 HW uses Bloom filter in order to skip lookups on specific eRPs. It uses crc-16-Msbit-first calculation over a specific layout of a rule's key fields combined with eRP ID as well as region ID. Per potential lookup, iff the Bloom filter entry of the calculated index is empty, then the lookup can be skipped. Hence, the mlxsw driver should update the Bloom filter entry per each rule insertion or deletion when rules are part of an eRP. Add functions for adding and deleting entries in the Bloom filter. In order to do so also add crc-16 computation based on the specific Spectrum-2 polynomial and a function for encoding the crc-16 input in the manner dictated by HW implementation. Signed-off-by: Nir Dotan <nird@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 15:20:34 -08:00
Nir Dotan	0487cfba86	mlxsw: spectrum_acl: Introduce Bloom filter Lay the foundations for Bloom filter handling. Introduce a new file for Bloom filter actions. Add struct mlxsw_sp_acl_bf to struct mlxsw_sp_acl_erp_core and initialize the Bloom filter data structure. Also take care of proper destruction when terminating. Signed-off-by: Nir Dotan <nird@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 15:20:33 -08:00
Nir Dotan	944068582f	mlxsw: resources: Add Spectrum-2 Bloom filter resource Add the maximum Bloom filter logarithmic size per eRP table bank. Signed-off-by: Nir Dotan <nird@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 15:20:33 -08:00
Nir Dotan	418089a850	mlxsw: reg: Add Policy Engine Algorithmic Bloom Filter Entries Register Bloom filter is a bit vector which allows the HW a fast lookup on a small size bit vector, that may reduce the number of lookups on the A-TCAM memory. PEABFE register allows setting values to the bits of the bit vector mentioned above. Add the register to be later used in A-TCAM optimizations. Signed-off-by: Nir Dotan <nird@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 15:20:33 -08:00
Jakub Kicinski	036b9e7cae	nfp: abm: allow to opt-out of RED offload FW team asks to be able to not support RED even if NIC is capable of buffering for testing and experimentation. Add an opt-out flag. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:41:42 -08:00
Marcin Wojtas	e735fd55b9	net: mvneta: fix operation for 64K PAGE_SIZE Recent changes in the mvneta driver reworked allocation and handling of the ingress buffers to use entire pages. Apart from that in SW BM scenario the HW must be informed via PRXDQS about the biggest possible incoming buffer that can be propagated by RX descriptors. The BufferSize field was filled according to the MTU-dependent pkt_size value. Later change to PAGE_SIZE broke RX operation when usin 64K pages, as the field is simply too small. This patch conditionally limits the value passed to the BufferSize of the PRXDQS register, depending on the PAGE_SIZE used. On the occasion remove now unused frag_size field of the mvneta_port structure. Fixes: `562e2f467e` ("net: mvneta: Improve the buffer allocation method for SWBM") Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:39:59 -08:00
Yonglong Liu	6adafc356e	net: hns: Fix ping failed when use net bridge and send multicast Create a net bridge, add eth and vnet to the bridge. The vnet is used by a virtual machine. When ping the virtual machine from the outside host and the virtual machine send multicast at the same time, the ping package will lost. The multicast package send to the eth, eth will send it to the bridge too, and the bridge learn the mac of eth. When outside host ping the virtual mechine, it will match the promisc entry of the eth which is not expected, and the bridge send it to eth not to vnet, cause ping lost. So this patch change promisc tcam entry position to the END of 512 tcam entries, which indicate lower priority. And separate one promisc entry to two: mc & uc, to avoid package match the wrong tcam entry. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:07:32 -08:00
Yonglong Liu	726ae5c9e5	net: hns: Add mac pcs config when enable\|disable mac In some case, when mac enable\|disable and adjust link, may cause hard to link(or abnormal) between mac and phy. This patch adds the code for rx PCS to avoid this bug. Disable the rx PCS when driver disable the gmac, and enable the rx PCS when driver enable the mac. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:07:32 -08:00
Yonglong Liu	7e74a19ca5	net: hns: Fix ntuple-filters status error. The ntuple-filters features is forced on by chip. But it shows "ntuple-filters: off [fixed]" when use ethtool. This patch make it correct with "ntuple-filters: on [fixed]". Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:07:32 -08:00
Yonglong Liu	a57275d355	net: hns: Avoid net reset caused by pause frames storm There will be a large number of MAC pause frames on the net, which caused tx timeout of net device. And then the net device was reset to try to recover it. So that is not useful, and will cause some other problems. So need doubled ndev->watchdog_timeo if device watchdog occurred until watchdog_timeo up to 40s and then try resetting to recover it. When collecting dfx information such as hardware registers when tx timeout. Some registers for count were cleared when read. So need move this task before update net state which also read the count registers. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:07:32 -08:00
Yonglong Liu	c82bd077e1	net: hns: Free irq when exit from abnormal branch 1.In "hns_nic_init_irq", if request irq fail at index i, the function return directly without releasing irq resources that already requested. 2.In "hns_nic_net_up" after "hns_nic_init_irq", if exceptional branch occurs, irqs that already requested are not release. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:07:32 -08:00
Yonglong Liu	31f6b61d81	net: hns: Clean rx fbd when ae stopped. If there are packets in hardware when changing the speed or duplex, it may cause hardware hang up. This patch adds the code to wait rx fbd clean up when ae stopped. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:07:32 -08:00
Yonglong Liu	5778b13b64	net: hns: Fixed bug that netdev was opened twice After resetting dsaf to try to repair chip error such as ecc error, the net device will be open if net interface is up. But at this time if there is the users set the net device up with the command ifconfig, the net device will be opened twice consecutively. Function napi_enable was called when open device. And Kernel panic will be occurred if it was called twice consecutively. Such as follow: static inline void napi_enable(struct napi_struct *n) { BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state)); smp_mb__before_clear_bit(); clear_bit(NAPI_STATE_SCHED, &n->state); } [37255.571996] Kernel panic - not syncing: BUG! [37255.595234] Call trace: [37255.597694] [<ffff80000008ab48>] dump_backtrace+0x0/0x1a0 [37255.603114] [<ffff80000008ad08>] show_stack+0x20/0x28 [37255.608187] [<ffff8000009c4944>] dump_stack+0x98/0xb8 [37255.613258] [<ffff8000009c149c>] panic+0x10c/0x26c [37255.618070] [<ffff80000070f134>] hns_nic_net_up+0x30c/0x4e0 [37255.623664] [<ffff80000070f39c>] hns_nic_net_open+0x94/0x12c [37255.629346] [<ffff80000084be78>] __dev_open+0xf4/0x168 [37255.634504] [<ffff80000084c1ac>] __dev_change_flags+0x98/0x15c [37255.640359] [<ffff80000084c29c>] dev_change_flags+0x2c/0x68 [37255.769580] [<ffff8000008dc400>] devinet_ioctl+0x650/0x704 [37255.775086] [<ffff8000008ddc38>] inet_ioctl+0x98/0xb4 [37255.780159] [<ffff800000827b7c>] sock_do_ioctl+0x44/0x84 [37255.785490] [<ffff800000828e04>] sock_ioctl+0x248/0x30c [37255.790737] [<ffff80000026dc6c>] do_vfs_ioctl+0x480/0x618 [37255.796156] [<ffff80000026de94>] SyS_ioctl+0x90/0xa4 [37255.801139] SMP: stopping secondary CPUs [37255.805079] kbox: catch panic event. [37255.809586] collected_len = 128928, LOG_BUF_LEN_LOCAL = 131072 [37255.816103] flush cache 0xffff80003f000000 size 0x800000 [37255.822192] flush cache 0xffff80003f000000 size 0x800000 [37255.828289] flush cache 0xffff80003f000000 size 0x800000 [37255.834378] kbox: no notify die func register. no need to notify [37255.840413] ---[ end Kernel panic - not syncing: BUG! This patchset fix this bug according to the flag NIC_STATE_DOWN. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:07:31 -08:00
Yonglong Liu	4ad26f117b	net: hns: Some registers use wrong address according to the datasheet. According to the hip06 datasheet: 1.Six registers use wrong address: RCB_COM_SF_CFG_INTMASK_RING RCB_COM_SF_CFG_RING_STS RCB_COM_SF_CFG_RING RCB_COM_SF_CFG_INTMASK_BD RCB_COM_SF_CFG_BD_RINT_STS DSAF_INODE_VC1_IN_PKT_NUM_0_REG 2.The offset of DSAF_INODE_VC1_IN_PKT_NUM_0_REG should be 0x103C + 0x80 * all_chn_num 3.The offset to show the value of DSAF_INODE_IN_DATA_STP_DISC_0_REG is wrong, so the value of DSAF_INODE_SW_VLAN_TAG_DISC_0_REG will be overwrite These registers are only used in "ethtool -d", so that did not cause ndev to misfunction. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:07:31 -08:00
Yonglong Liu	308c6cafde	net: hns: All ports can not work when insmod hns ko after rmmod. There are two test cases: 1. Remove the 4 modules:hns_enet_drv/hns_dsaf/hnae/hns_mdio, and install them again, must use "ifconfig down/ifconfig up" command pair to bring port to work. This patch calls phy_stop function when init phy to fix this bug. 2. Remove the 2 modules:hns_enet_drv/hns_dsaf, and install them again, all ports can not use anymore, because of the phy devices register failed(phy devices already exists). Phy devices are registered when hns_dsaf installed, this patch removes them when hns_dsaf removed. The two cases are sometimes related, fixing the second case also requires fixing the first case, so fix them together. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:07:31 -08:00
Yonglong Liu	4e1d4be681	net: hns: Incorrect offset address used for some registers. According to the hip06 Datasheet: 1. The offset of INGRESS_SW_VLAN_TAG_DISC should be 0x1A00+4all_chn_num 2. The offset of INGRESS_IN_DATA_STP_DISC should be 0x1A50+4all_chn_num Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-16 12:07:31 -08:00
David S. Miller	63de273f34	mlx5e-updates-2018-12-14 (VF Lag) From Aviv Heller, Subsequent patches introduce VF LAG, which provdies load-balancing and high-availability capabilities for VFs associated with different physical ports of the same Connect-X card. This series consists of the following: - mlx5 devcom, driver infrastructure that facilitates operations that involve both core devices (physical functions) of the same card, to synchronize and communicate between two driver instances of the same card. - Infrastructure for TC rule duplication. - Changes to LAG logic to enable its use when SR-IOV is enabled - PFs in switchdev mode is the only mode currently supported. -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJcFCCzAAoJEEg/ir3gV/o+VzIH/3xOcSuDIB7rM/mNoMtXVvvu ZBZGUlUh7PoX/+fLjKYuQhosCvx1kpecPeNTa/R4yFy3kUl70CsKT6Cqtw2qGiPU 2hfr75MobM5nMwOU1QDXYyfSaXZsfUj25DE25OSvtMygR6YjeU0Lm4peWBnd9MD0 Oj5lilggS6hiNyCTI0s1D4tR0Uh2roj2tAGW7JRkd1Et34u9KZbBitCWvYkTtqcp XauuCdjlDNZenOg0Lue9fAc4Jo9AVEaB2w/4dZDsarZyvabQ4DxVrmrVOgLmWBjc Tr10JCuBvgdX/66JikJQ08Fkx/pHGby6jxcI4ux3QEoHe4/umm3Te0+5/pIxe58= =wDuV -----END PGP SIGNATURE----- Merge tag 'mlx5e-updates-2018-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5e-updates-2018-12-14 (VF Lag) From Aviv Heller, Subsequent patches introduce VF LAG, which provdies load-balancing and high-availability capabilities for VFs associated with different physical ports of the same Connect-X card. This series consists of the following: - mlx5 devcom, driver infrastructure that facilitates operations that involve both core devices (physical functions) of the same card, to synchronize and communicate between two driver instances of the same card. - Infrastructure for TC rule duplication. - Changes to LAG logic to enable its use when SR-IOV is enabled - PFs in switchdev mode is the only mode currently supported. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-15 13:29:56 -08:00
Ilias Apalodimas	35e07d2347	net: socionext: remove mmio reads on Tx Currently the driver issues 2 mmio reads to figure out the number of transmitted packets and clean them. We can get rid of the expensive reads since BIT 31 of the Tx descriptor can be used for that. We can also remove the budget counting of Tx completions since all of the descriptors are not deliberately processed. Performance numbers using pktgen are: size pre-patch(pps) post-patch(pps) 64 362483 427916 128 358315 411686 256 352725 389683 512 215675 216464 1024 113812 114442 Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-15 13:15:19 -08:00
Ilias Apalodimas	17a12eaaf0	net: socionext: correctly recover txq after being full Running pktgen with packets sizes > 512b ends up in the interface Txq getting stuck. "netsec 522d0000.ethernet eth0: netsec_netdev_start_xmit: TxQFull!" appears on dmesg but the interface never recovers. It requires an ifconfig down/up to make the interface usable again. The reason that triggers this, is a race condition between .ndo_start_xmit and the napi completion. The available budget is calculated first and indicates the queue is full. Due to a costly netif_err() the queue is not stopped in time while the napi completion runs, clears the irq and frees up descriptors, thus the queue never wakes up again. Fix this by moving the print after stopping the queue, make the print ratelimited, add barriers and check for cleaned descriptors.. Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-15 13:15:19 -08:00
Heiner Kallweit	e782410ed2	r8169: improve spurious interrupt detection Improve detection of spurious interrupts by checking against the interrupt mask as currently set in the chip. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-15 11:22:48 -08:00
Yangtao Li	b09026c691	cxgb4: remove DEFINE_SIMPLE_DEBUGFS_FILE() We already have the DEFINE_SHOW_ATTRIBUTE. There is no need to define such a macro, so remove DEFINE_SIMPLE_DEBUGFS_FILE. Also use the DEFINE_SHOW_ATTRIBUTE macro to simplify some code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-15 11:21:22 -08:00
liuzhongzhu	82e00b86a5	net: hns3: Add "tm map" status information query function This patch prints dcb register status information by module. debugfs command: root@(none)# echo dump tm map 100 > cmd queue_id \| qset_id \| pri_id \| tc_id 0100 \| 0065 \| 08 \| 00 root@(none)# Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-15 10:54:18 -08:00
liuzhongzhu	0c29d1912b	net: hns3: Add "queue map" information query function This patch prints queue map information. debugfs command: echo dump queue map > cmd Sample Command: root@(none)# echo queue map > cmd local queue id \| global queue id \| vector id 0 32 769 1 33 770 2 34 771 3 35 772 4 36 773 5 37 774 6 38 775 7 39 776 8 40 777 9 41 778 10 42 779 11 43 780 12 44 781 13 45 782 14 46 783 15 47 784 root@(none)# Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-15 10:54:18 -08:00
liuzhongzhu	c0ebebb9cc	net: hns3: Add "dcb register" status information query function This patch prints dcb register status information by module. debugfs command: root@(none)# echo dump reg dcb > cmd roce_qset_mask: 0x0 nic_qs_mask: 0x0 qs_shaping_pass: 0x0 qs_bp_sts: 0x0 pri_mask: 0x0 pri_cshaping_pass: 0x0 pri_pshaping_pass: 0x0 root@(none)# Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-15 10:54:18 -08:00
liuzhongzhu	27cf979a15	net: hns3: Add "status register" information query function This patch prints status register information by module. debugfs command: echo dump reg [mode name] > cmd Sample Command: root@(none)# echo dump reg bios common > cmd BP_CPU_STATE: 0x0 DFX_MSIX_INFO_NIC_0: 0xc000 DFX_MSIX_INFO_NIC_1: 0xf DFX_MSIX_INFO_NIC_2: 0x2 DFX_MSIX_INFO_NIC_3: 0x2 DFX_MSIX_INFO_ROC_0: 0xc000 DFX_MSIX_INFO_ROC_1: 0x0 DFX_MSIX_INFO_ROC_2: 0x0 DFX_MSIX_INFO_ROC_3: 0x0 root@(none)# Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-15 10:54:18 -08:00
liuzhongzhu	7737f1fbb5	net: hns3: Add "manager table" information query function This patch prints manager table information. debugfs command: echo dump mng tbl > cmd Sample Command: root@(none)# echo dump mng tbl > cmd entry\|mac_addr \|mask\|ether\|mask\|vlan\|mask\|i_map\|i_dir\|e_type 00 \|01:00:5e:00:00:01\|0 \|00000\|0 \|0000\|0 \|00 \|00 \|0 01 \|c2:f1:c5:82:68:17\|0 \|00000\|0 \|0000\|0 \|00 \|00 \|0 root@(none)# Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-15 10:54:18 -08:00
liuzhongzhu	122bedc56a	net: hns3: Add "bd info" query function This patch prints Sending and receiving package descriptor information. debugfs command: echo dump bd info 1 > cmd Sample Command: root@(none)# echo bd info 1 > cmd hns3 0000:7d:00.0: TX Queue Num: 0, BD Index: 0 hns3 0000:7d:00.0: (TX) addr: 0x0 hns3 0000:7d:00.0: (TX)vlan_tag: 0 hns3 0000:7d:00.0: (TX)send_size: 0 hns3 0000:7d:00.0: (TX)vlan_tso: 0 hns3 0000:7d:00.0: (TX)l2_len: 0 hns3 0000:7d:00.0: (TX)l3_len: 0 hns3 0000:7d:00.0: (TX)l4_len: 0 hns3 0000:7d:00.0: (TX)vlan_tag: 0 hns3 0000:7d:00.0: (TX)tv: 0 hns3 0000:7d:00.0: (TX)vlan_msec: 0 hns3 0000:7d:00.0: (TX)ol2_len: 0 hns3 0000:7d:00.0: (TX)ol3_len: 0 hns3 0000:7d:00.0: (TX)ol4_len: 0 hns3 0000:7d:00.0: (TX)paylen: 0 hns3 0000:7d:00.0: (TX)vld_ra_ri: 0 hns3 0000:7d:00.0: (TX)mss: 0 hns3 0000:7d:00.0: RX Queue Num: 0, BD Index: 120 hns3 0000:7d:00.0: (RX)addr: 0xffee7000 hns3 0000:7d:00.0: (RX)pkt_len: 0 hns3 0000:7d:00.0: (RX)size: 0 hns3 0000:7d:00.0: (RX)rss_hash: 0 hns3 0000:7d:00.0: (RX)fd_id: 0 hns3 0000:7d:00.0: (RX)vlan_tag: 0 hns3 0000:7d:00.0: (RX)o_dm_vlan_id_fb: 0 hns3 0000:7d:00.0: (RX)ot_vlan_tag: 0 hns3 0000:7d:00.0: (RX)bd_base_info: 0 Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-15 10:54:17 -08:00
Arnd Bergmann	51367e423c	w90p910_ether: remove incorrect __init annotation The get_mac_address() function is normally inline, but when it is not, we get a warning that this configuration is broken: WARNING: vmlinux.o(.text+0x4aff00): Section mismatch in reference from the function w90p910_ether_setup() to the function .init.text:get_mac_address() The function w90p910_ether_setup() references the function __init get_mac_address(). This is often because w90p910_ether_setup lacks a __init Remove the __init to make it always do the right thing. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-14 14:42:51 -08:00
Atul Gupta	848dd1c1cb	crypto/chelsio/chtls: macro correction in tx path corrected macro used in tx path. removed redundant hdrlen and check for !page in chtls_sendmsg Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-14 13:39:39 -08:00
Wen Yang	390de19404	net/ibmvnic: Remove tests of member address The driver was checking for non-NULL address. - adapter->napi[i] This is pointless as these will be always non-NULL, since the 'dapter->napi' is allocated in init_napi(). It is safe to get rid of useless checks for addresses to fix the coccinelle warning: >>drivers/net/ethernet/ibm/ibmvnic.c: test of a variable/field address Since such statements always return true, they are redundant. Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> CC: Michael Ellerman <mpe@ellerman.id.au> CC: Thomas Falcon <tlfalcon@linux.ibm.com> CC: John Allen <jallen@linux.ibm.com> CC: "David S. Miller" <davem@davemloft.net> CC: linuxppc-dev@lists.ozlabs.org CC: netdev@vger.kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-14 13:36:38 -08:00
Nathan Chancellor	2ab4c3426c	drivers: net: xgene: Remove unnecessary forward declarations Clang warns: drivers/net/ethernet/apm/xgene/xgene_enet_main.c:33:36: warning: tentative array definition assumed to have one element static const struct acpi_device_id xgene_enet_acpi_match[]; ^ 1 warning generated. Both xgene_enet_acpi_match and xgene_enet_of_match are defined before their uses at the bottom of the file so this is unnecessary. When CONFIG_ACPI is disabled, ACPI_PTR becomes NULL so xgene_enet_acpi_match doesn't need to be defined. Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-14 13:35:27 -08:00
Aviv Heller	9582466640	net/mlx5: Handle LAG FW commands failure gracefully When create_lag or destroy_lag FW commands fail, display an appropriate error message, and try to recover, if possible. Signed-off-by: Aviv Heller <avivh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:56 -08:00
Aviv Heller	7c34ec19e1	net/mlx5: Make RoCE and SR-IOV LAG modes explicit With the introduction of SR-IOV LAG, checking whether LAG is active is no longer good enough, since RoCE and SR-IOV LAG each entails different behavior by both the core and infiniband drivers. This patch introduces facilities to discern LAG type, in addition to mlx5_lag_is_active(). These are implemented in such a way as to allow more complex mode combinations in the future. Signed-off-by: Aviv Heller <avivh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:55 -08:00
Aviv Heller	292612d68c	net/mlx5: Rename mlx5_lag_is_bonded() to __mlx5_lag_is_active() The new name better represents the function's aim, and sets a precedent for a '__' prefix for internal, non-locked versions of LAG APIs. Signed-off-by: Aviv Heller <avivh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:54 -08:00
Rabie Loulou	8aaca1976e	net/mlx5: Allow co-enablement of uplink LAG and SRIOV Enable setting uplink LAG if sriov is enabled on both ports in switchdev mode. Once the sriov mode is changed from switchdev for any of the ports, the LAG instance is disabled. Signed-off-by: Rabie Loulou <rabiel@mellanox.com> Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:54 -08:00
Rabie Loulou	eff849b2c6	net/mlx5: Allow/disallow LAG according to pre-req only Remove the lag forbid/allow functions, change the lag prereq check to run in the do-bond logic, so every change in the prereq state will cause LAG to be disabled/enabled accordingly after the next do-bond run. Add lag update function, so every component which changes the prereq state and want the LAG to re-calc the conditions can call the update function. Signed-off-by: Rabie Loulou <rabiel@mellanox.com> Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:54 -08:00
Rabie Loulou	3b5ff59fd8	net/mlx5: Adjustments for the activate LAG logic to run under sriov When HW lag is set/unset, roce must not be enabled on the port, as such we wrap such changes with roce enable/disable either directly or through re-creation of IB device. Currently, lag and sriov are mutually exclusive, so by definition this code doesn't run under sriov. Towards changing this exclusion, we need to make sure that roce will not be enabled on the eswitch manager port under sriov since this is requirement of the switchdev mode. We are going strict here and avoiding this all together under sriov. Signed-off-by: Rabie Loulou <rabiel@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:53 -08:00
Aviv Heller	1418ddd96a	net/mlx5e: Duplicate offloaded TC eswitch rules under uplink LAG Under uplink LAG, packets that match a flow might arrive on both uplink ports and transmitted through both as part of supporting aggregation and high-availability. When the netdevs representing the uplinks are set into LAG (bonding, teaming), duplicate the TC flow offloading into each of the per-uplink e-switches. Duplication is not required if the source is the bond device, since in this case it is assumed that the bond and the uplink netdevs share the same TC block, and thus duplication will occur naturally by the stack. Note that under encapsulation scheme, both flows will use the same neighbour and hence both will contribute to the last-used feedback towards the stack. Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Rabie Loulou <rabiel@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:53 -08:00
Rabie Loulou	7ba58ba7ba	net/mlx5e: Offload TC e-switch rules with egress LAG device When parsing TC FDB actions, if the egress device is a bond/team net-device which enslaved the uplink representor of the e-switch, use the uplink representor as the destination in the HW rule. Signed-off-by: Rabie Loulou <rabiel@mellanox.com> Signed-off-by: Aviv Heller <avivh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:53 -08:00
Rabie Loulou	491c37e49b	net/mlx5e: In case of LAG, one switch parent id is used for all representors When the uplink representors are put into lag, set all the representors (VFs and uplinks) of the same NIC to return the same switchdev id. Currently, the route lookup code on the encapsulation offload path assumes that same switchdev id for the source and dest devices means that the dest is also mlx5 HW netdev. This doesn't hold anymore when we align the switchdev Id of the uplinks to be same, which in turn causes the bond/team to return that id to the caller. As such, enhance the relevant check to take into account the uplink lag case. Signed-off-by: Rabie Loulou <rabiel@mellanox.com> Signed-off-by: Aviv Heller <avivh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:52 -08:00
Shahar Klein	f9392795e2	net/mlx5e: Enhance flow counter scheme for offloaded TC eswitch rules Assign a counter dev attribute according to device capability and use it for management of counters related to offloaded eswitch TC flows. With upcoming support for uplink LAG, we have two HW rules per one logical SW (TC) rule. Although the HW supports attaching one counter to multiple rules, we are allocating counter per HW rule. We need this separation for two reasons: 1. "flow eswitch" counter affinity HW require the counter to be allocated on the device where the eswitch rule is set. 2. for some use-cases (multi-path routing) each HW flow relates to different neighbour, hence our neigh update logic must have a per-rule HW accountant in order to provide the proper feedback to the kernel. Signed-off-by: Shahar Klein <shahark@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:52 -08:00
Roi Dayan	04de7dda73	net/mlx5e: Infrastructure for duplicated offloading of TC flows Under uplink LAG or multipath schemes, traffic that matches one flow might arrive on both uplink ports and transmitted through both as part of supporting aggregation and high-availability. To cope with the fact that the SW model might use logical SW port (e.g uplink team or bond) but we have two HW ports with e-switch on each, there are cases where in order to offload a SW TC rule we need to duplicate it to two HW flows. Since each HW rule has its own counter we also aggregate the counter of both rules when a flow stats query is executed from user-space. Introduce the changes for the different elements (add/delete/stats), currently nothing is duplicated. Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Shahar Klein <shahark@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:52 -08:00
Roi Dayan	ac004b8321	net/mlx5e: E-Switch, Add peer miss rules In the sriov offloads mode, packets that are not matched by any other rule are sent towards the e-switch vport manager for further processing. Under upcoming patches (e.g for uplink LAG), packets sent from VF vports belonging to esw0 (e-switch related to PF0) might end up in esw1 (e-switch related to PF1) due to muxing logic applied by the FW. In such a case we still want the missed packet to be sent to the "base" esw manager vport in order to present the control plane a consistent view of the source (VF reresentor) port. Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Shahar Klein <shahark@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:51 -08:00
Aviv Heller	fadd59fc50	net/mlx5: Introduce inter-device communication mechanism This introduces devcom, a generic mechanism for performing operations on both physical functions of the same Connect-X card. The first user of this API is merged eswitch, which will be introduced in subsequent patches. Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 13:28:51 -08:00
Arnd Bergmann	2aa55dccf8	hns3: prevent building without CONFIG_INET We now get a link failure when CONFIG_INET is disabled, since tcp_gro_complete is unavailable: drivers/net/ethernet/hisilicon/hns3/hns3_enet.o: In function `hns3_set_gro_param': hns3_enet.c:(.text+0x230c): undefined reference to `tcp_gro_complete' Add an explicit CONFIG_INET dependency here to avoid the broken configuration. Fixes: `a6d53b97a2` ("net: hns3: Adds GRO params to SKB for the stack") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-14 13:17:50 -08:00
Saeed Mahameed	64e4cf0dab	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5-next shared branch with rdma subtree to avoid mlx5 rdma v.s. netdev conflicts. Highlights: 1) Lag refactroing and flow counter affinity bits. 2) mlx5 core cleanups By Roi Dayan (2) and others * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Fold the modify lag code into function net/mlx5: Add lag affinity info to log net/mlx5: Split the activate lag function into two routines net/mlx5: E-Switch, Introduce flow counter affinity IB/mlx5: Unify e-switch representors load approach between uplink and VFs net/mlx5: Use lowercase 'X' for hex values net/mlx5: Remove duplicated include from eswitch.c Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 11:15:25 -08:00
Shahar Klein	4c283e6155	net/mlx5: Fold the modify lag code into function Handle the code of modifying the lag affinity within a separate function. Signed-off-by: Shahar Klein <shahark@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 09:58:57 -08:00
Roi Dayan	3cfe432e1b	net/mlx5: Add lag affinity info to log Report the initial LAG port affinity upon LAG creation. Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Shahar Klein <shahark@mellanox.com> Reviewed-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 09:58:57 -08:00
Roi Dayan	8252cf728c	net/mlx5: Split the activate lag function into two routines Split the activate lag function in order to be symmetric with the deactivate lag function. Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Shahar Klein <shahark@mellanox.com> Reviewed-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-14 09:58:57 -08:00
Sudarsana Reddy Kalluru	c3db8d5310	qed: Fix command number mismatch between driver and the mfw The value for OEM_CFG_UPDATE command differs between driver and the Management firmware (mfw). Fix this gap with adding a reserved field. Fixes: `cac6f69154` ("qed: Add support for Unified Fabric Port.") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-13 21:48:14 -08:00
David S. Miller	38ed22351c	mlx5-fixes-2018-12-13 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJcEiXXAAoJEEg/ir3gV/o+xxQH/jGS6GQrty9puL40sXKE4ryK GKVSwbKLnu4WatPcLe/CGxJwRoCVUUKfsbprsuy+oYxhUd5M5h6qu215kZgsfdDC CfxznbEj3o6mN+ZyZjL+F2gEG6e96bpkKvayb2a9YYQaY/9+V6CMKh3U5jHI+Q2V D4ToCjWbMXkQ0jsevfpV9W2T0DNfXIa5QGvsvH+D+PoLF/ke/YALZd+yJT0Cm9N2 /vp0qvnHnZO5YavEThN2Lh8f9AY/pAwuGgTrPyddfkZC/6QLH+c8948gF14V+03l Nz78uSpurQ97X511UBcabSQB0xTCHSAw3/SKCVSxFrpMvkhj4mTbjohwc567Gy4= =YxUz -----END PGP SIGNATURE----- Merge tag 'mlx5-fixes-2018-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-fixes-2018-12-13 Subject: [pull request][net 0/9] Mellanox, mlx5 fixes 2018-12-13 Saeed Mahameed says: ==================== This series introduces some fixes to the mlx5 core and mlx5e netdevice driver. ======= Conflict with net-next: When merged with net-next this series will cause a moderate conflict: 1) in drivers/net/ethernet/mellanox/mlx5/core/en_tc.c (2 hunks) Take hunks from net only and just replace attr->mirror_count to attr->split_count 1.1) there is one more instance of slow_attr->mirror_count to be replaced with slow_attr->split_count, it doesn't appear in the conflict, it will cause a compilation error if left out. 2) in mlx5_ifc.h, take hunks only from net. Example for the merge resolution can be found at: https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=merge/mlx5-fixes&id=48830adf29804d85d77ed8a251d625db0eb5b8a8 branch merge/mlx5-fixes of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux (I simply merged this pull request tag into net-next and resolved the conflict) I don't know if it's ok with you, but to save your time, you can just: git pull git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux merge/mlx5-fixes Into net-next, before your next net merge, and you will have a clean merge of net into net-next (at least for mlx5 files). ====== Please pull and let me know if there's any problem. For -stable v4.18 338d615be484 ('net/mlx5e: Cancel DIM work on close SQ') 91f40f9904ad ('net/mlx5e: RX, Verify MPWQE stride size is in range') For -stable v4.19 c5c7e1c41bbe ('net/mlx5e: Remove unused UDP GSO remaining counter') ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-13 19:21:53 -08:00
Petr Machata	74bc993974	mlxsw: spectrum_router: Veto unsupported RIF MAC addresses On NETDEV_PRE_CHANGEADDR, if the change is related to a RIF interface, verify that it satisfies the criterion that all RIF interfaces have the same MAC address prefix, as indicated by mlxsw_sp.mac_mask. Additionally, besides explicit address changes, check that the address of an interface for which a RIF is about to be added matches the required pattern as well. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-13 18:41:39 -08:00
Petr Machata	9329b8162b	mlxsw: spectrum: Add mlxsw_sp.mac_mask The Spectrum hardware demands that all router interfaces in the system have the same first 38 resp. 36 bits of MAC address: the former limit holds on Spectrum, the latter on Spectrum-2. Add a field that refers to the required prefix mask and initialize in mlxsw_sp1_init() and mlxsw_sp2_init(). Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-13 18:41:39 -08:00
Petr Machata	9735f2d2fe	mlxsw: spectrum_router: Generalize mlxsw_sp_netdevice_router_port_event() Prepare mlxsw_sp_netdevice_router_port_event() for handling of NETDEV_PRE_CHANGEADDR. Split out the part that deals with the actual changes and call it for the two events currently handled. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-13 18:41:39 -08:00
Tal Gilboa	fa2bf86bab	net/mlx5e: Cancel DIM work on close SQ TXQ SQ closure is followed by closing the corresponding CQ. A pending DIM work would try to modify the now non-existing CQ. This would trigger an error: [85535.835926] mlx5_core 0000:af:00.0: mlx5_cmd_check:769:(pid 124399): MODIFY_CQ(0x403) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x1d7771) Fix by making sure to cancel any pending DIM work before destroying the SQ. Fixes: `cbce4f4447` ("net/mlx5e: Enable adaptive-TX moderation") Signed-off-by: Tal Gilboa <talgi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-13 01:24:44 -08:00
Mikhael Goikhman	d13b224f43	net/mlx5e: Remove unused UDP GSO remaining counter Remove tx_udp_seg_rem counter from ethtool output, as it is no longer being updated in the driver's data flow. Fixes: `3f44899ef2` ("net/mlx5e: Use PARTIAL_GSO for UDP segmentation") Signed-off-by: Mikhael Goikhman <migo@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-13 01:24:44 -08:00
Or Gerlitz	61c806dafe	net/mlx5e: Avoid encap flows deletion attempt the 1st time a neigh is resolved Currently, we are deleting offloaded encap flows in case the relevant neigh becomes unconnected while the encap is valid (a sign that it used to be connected), or if the curr neigh mac is different from the cached mac (a sign that the remote side changed their mac). The 2nd check also applies when the neigh becomes connected on the 1st time (we start with zero mac). Before the offending commit, the deleting handler was practically no op, as no flows were offloaded. But since that commit, we offload neigh-less encap flows to slow path. Under mirroring scheme, we go into the delete handler, attempt to unoffload a mirror rule which was never set (as we were offloading to slow path) and crash. Fix that by calling the delete handler only when the encap is valid, which covers both cases mentioned above. Fixes: `5dbe906ff1` ('net/mlx5e: Use a slow path rule instead if vxlan neighbour isn't available') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-13 01:24:44 -08:00
Or Gerlitz	154e62abe9	net/mlx5e: Properly initialize flow attributes for slow path eswitch rule deletion When a neighbour is resolved, we delete the goto slow path rule from HW. The eswitch flow attributes where not properly initialized on that case, hence we mess up the eswitch refcounts for chain zero (the default one). Fix that along with making sure to use semicolons and not commas on that code; Fixes: `5dbe906ff1` ('net/mlx5e: Use a slow path rule instead if vxlan neighbour isn't available') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-13 01:24:44 -08:00
Or Gerlitz	d14f6f2a84	net/mlx5e: Avoid overriding the user provided priority for offloaded tc rules Just a leftover which was wrongly left there, remove it while spawning a message to suggest firmware upgrade. Fixes: `bf07aa730a` ('net/mlx5e: Support offloading tc priorities and chains for eswitch flows') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-13 01:24:44 -08:00
Or Gerlitz	e88afe759a	net/mlx5e: Err if asked to mirror a goto chain tc eswitch rule Currently we are not supporting this and not err-ing on that either. For now, just err if asked to do that. Fixes: `bf07aa730a` ('net/mlx5e: Support offloading tc priorities and chains for eswitch flows') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-13 01:24:44 -08:00
Moshe Shemesh	e1c15b62b7	net/mlx5e: RX, Verify MPWQE stride size is in range Add check of MPWQE stride size is within range supported by HW. In case calculated MPWQE stride size exceed range, linear SKB can't be used and we should use non linear MPWQE instead. Fixes: `619a8f2a42` ("net/mlx5e: Use linear SKB in Striding RQ") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-13 01:24:44 -08:00
Gavi Teitz	8956f0014e	net/mlx5e: Fix default amount of channels for VF representors The default amount of channels a representor opens was erroneously changed from one to the maximum amount of channels, restore to its intended value. Fixes: `779d986d60` ("net/mlx5e: Do not ignore netdevice TX/RX queues number") Signed-off-by: Gavi Teitz <gavi@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-13 01:24:44 -08:00
David S. Miller	95302c394c	mlx5e-updates-2018-12-11 From Eli Britstein, Patches 1-10 adds remote mirroring support. Patches 1-4 refactor encap related code as pre-steps for using per destination encapsulation properties. Patches 5-7 use extended destination feature for single/multi destination scenarios that have a single encap destination. Patches 8-10 enable multiple encap destinations for a TC flow. From, Daniel Jurgens, Patch 11, Use CQE padding for Ethernet CQs, PPC showed up to a 24% improvement in small packet throughput From Eyal Davidovich, patches 12-14, FW monitor counter support FW monitor counters feature came to solve the delayed reporting of FW stats in the atomic get_stats64 ndo, since we can't access the FW at that stage, this feature will enable immediate FW stats updates in the driver via fw events on specific stats updates. Patch 12, cleanup to avoid querying a FW counter when it is not supported Patch 13, Monitor counters FW commands support Patch 14, Use monitor counters in ethernet netdevice to update FW stats reported in the atomic get_stats64 ndo. -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJcEEYmAAoJEEg/ir3gV/o+XlIIAKIQhvHnCoGMPsZU+HbA1h3O 10i2hnmQqx2F5Z7tSON9jr9HVzLj6FqQ3pSnTyw8xiMBqOldF6abmfpMPQ/XeWXk PDBCzfOUMAlIsGFAJ06tcz3VLkWrEdoQHmi7Jb4IW+LC9CnZBobzrcLaxQ95txYP LkCfO8Y70D5tzBrzL1o1RfKsCSFJma5zhXDPF/B3rifLyHOmo9aLzZuDvKwvvSkC Y3QM5ulnLqVhxaXA0xzB+iHLiI0nX2H17CBPseLzM+Ip0vZS/nA77rnQ49PUgjiZ d1BoXo9MUa3eoQwXNl+qSAfnGjCwqzplXxzjT2bKKXVc2ZOWmkECgRXmKI6JDDU= =EuqQ -----END PGP SIGNATURE----- Merge tag 'mlx5e-updates-2018-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5e-updates-2018-12-11 From Eli Britstein, Patches 1-10 adds remote mirroring support. Patches 1-4 refactor encap related code as pre-steps for using per destination encapsulation properties. Patches 5-7 use extended destination feature for single/multi destination scenarios that have a single encap destination. Patches 8-10 enable multiple encap destinations for a TC flow. From, Daniel Jurgens, Patch 11, Use CQE padding for Ethernet CQs, PPC showed up to a 24% improvement in small packet throughput From Eyal Davidovich, patches 12-14, FW monitor counter support FW monitor counters feature came to solve the delayed reporting of FW stats in the atomic get_stats64 ndo, since we can't access the FW at that stage, this feature will enable immediate FW stats updates in the driver via fw events on specific stats updates. Patch 12, cleanup to avoid querying a FW counter when it is not supported Patch 13, Monitor counters FW commands support Patch 14, Use monitor counters in ethernet netdevice to update FW stats reported in the atomic get_stats64 ndo. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 22:22:53 -08:00
David S. Miller	3b076cfe86	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Fixes 2018-12-12 This series contains fixes to i40e and ixgbe. Stefan Assmann fixes an issue created by a previous fix, where ether_addr_copy() was moved to avoid a race but did not take into account that it alters the MAC address being handed to i40e_del_mac_filter(). Michał Mirosław provides 2 fixes for i40e, first resolves issues in the hardware VLAN offload where VLAN.TCI equal to 0 was being dropped and a race between disabling VLAN receive feature in hardware and processing the receive queue, where packets could have their VLAN information dropped. Ross Lagerwall fixes a racy condition during a ixgbe VF reset, where writing the register to issue a reset and sending the reset message via the mailbox API could result of the mailbox memory getting cleared during the reset before the message gets successfully sent which results in a VF driver malfunction. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 21:43:43 -08:00
Biao Huang	43d4b29718	net-next: stmmac: dwmac-mediatek: add module license info Add MODULE_LICENSE info to fix this: WARNING: modpost: missing MODULE_LICENSE() in drivers/net/ethernet/stmicro/stmmac/dwmac-mediatek.o Signed-off-by: Biao Huang <biao.huang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 21:42:55 -08:00
YueHaibing	c784a28b02	net/mlx5e: Remove set but not used variable 'upriv' Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/mellanox/mlx5/core/en_rep.c: In function 'mlx5e_vport_rep_load': drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:1490:21: warning: variable 'upriv' set but not used [-Wunused-but-set-variable] drivers/net/ethernet/mellanox/mlx5/core/en_rep.c: In function 'mlx5e_vport_rep_unload': drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:1557:21: warning: variable 'upriv' set but not used [-Wunused-but-set-variable] It not used any more since commit `ef381359e3` ("net/mlx5e: Replace egdev with indirect block notifications"). Also remove unused variable 'uplink_rpriv' after this change. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 21:38:04 -08:00
YueHaibing	186599f89e	net/mlx5: Remove duplicated include from eswitch.c Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-12 17:15:35 -08:00
Petr Machata	7357eb3d4b	mlxsw: spectrum_switchdev: Propagate extack on port VLAN events After switchdev_handle_port_obj_add() was extended in a preceding patch, mlxsw_sp_port_obj_add() now takes an extack argument. Propagate it further by extending a callee chain from mlxsw_sp_port_vlans_add(), via mlxsw_sp_bridge_port_vlan_add() via mlxsw_sp_port_vlan_bridge_join() via mlxsw_sp_port_vlan_fid_join() to mlxsw_sp_bridge_ops.fid_get, adding an extack argument for each of them. This code path is used when a VLAN is added to a port netdevice if there already is an unoffloadable VXLAN device with that VLAN mapped. mlxsw_sp_bridge_8021d_port_join() is updated to obey the new interfaces changed by the abovementioned code, propagating extack ultimately from NETDEV_CHANGEUPPER events. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 16:34:22 -08:00
Petr Machata	0a5a2aee6f	mlxsw: spectrum_switchdev: Propagate extack on VXLAN VLAN events Now that VLAN port object addition notifications carry an extack, propagate it from mlxsw_sp_switchdev_vxlan_vlans_add() through mlxsw_sp_switchdev_vxlan_vlan_add() to mlxsw_sp_bridge_8021q_vxlan_join(). This code path is used when a VLAN is added to a VXLAN netdevice that cannot be offloaded. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 16:34:22 -08:00
Petr Machata	6921351359	net: switchdev: Add extack to switchdev_handle_port_obj_add() callback Drivers use switchdev_handle_port_obj_add() to handle recursive descent through lower devices. Change this function prototype to take add_cb that itself takes an extack argument. Decode extack from switchdev_notifier_port_obj_info and pass it to add_cb. Update mlxsw and ocelot drivers which use this helper. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 16:34:22 -08:00
Petr Machata	2fd527b72b	net: ndo_bridge_setlink: Add extack Drivers may not be able to implement a VLAN addition or reconfiguration. In those cases it's desirable to explain to the user that it was rejected (and why). To that end, add extack argument to ndo_bridge_setlink. Adapt all users to that change. Following patches will use the new argument in the bridge driver. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 16:34:21 -08:00
Jonathan Toppins	351cbde969	bnxt: remove printing of hwrm message bnxt_en 0000:19:00.0 (unregistered net_device) (uninitialized): hwrm req_type 0x190 seq id 0x6 error 0xffff The message above is commonly seen when a newer driver is used on hardware with older firmware. The issue is this message means nothing to anyone except Broadcom. Remove the message to not confuse users as this message is really not very informative. Signed-off-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 16:31:32 -08:00
Sudarsana Reddy Kalluru	9061193c4e	bnx2x: Send update-svid ramrod with retry/poll flags enabled Driver sends update-SVID ramrod in the MFW notification path. If there is a pending ramrod, driver doesn't retry the command and storm firmware will never be updated with the SVID value. The patch adds changes to send update-svid ramrod in process context with retry/poll flags set. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 16:25:14 -08:00
Sudarsana Reddy Kalluru	07f12622a6	bnx2x: Enable PTP only on the PF that initializes the port There will be only one PHC clock per port. PTP should be enabled only on one PF per port. The change enables PTP functionality on the PF that initializes the port. The change is useful in multi-function modes e.g., NPAR where a port can have more than one PF. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 16:25:14 -08:00
Sudarsana Reddy Kalluru	04f05230c5	bnx2x: Remove configured vlans as part of unload sequence. Vlans are not getting removed when drivers are unloaded. The recent storm firmware versions had added safeguards against re-configuring an already configured vlan. As a result, PF inner reload flows (e.g., mtu change) might trigger an assertion. This change is going to remove vlans (same as we do for MACs) when doing a chip cleanup during unload. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 16:25:14 -08:00
Sudarsana Reddy Kalluru	bbf666c1af	bnx2x: Clear fip MAC when fcoe offload support is disabled On some customer setups it was observed that shmem contains a non-zero fip MAC for 57711 which would lead to enabling of SW FCoE. Add a software workaround to clear the bad fip mac address if no FCoE connections are supported. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 16:25:14 -08:00
Ross Lagerwall	96d1a73161	ixgbe: Fix race when the VF driver does a reset When the VF driver does a reset, it (at least the Linux one) writes to the VFCTRL register to issue a reset and then immediately sends a reset message using the mailbox API. This is racy because when the PF driver detects that the VFCTRL register reset pin has been asserted, it clears the mailbox memory. Depending on ordering, the reset message sent by the VF could be cleared by the PF driver. It then responds to the cleared message with a NACK which causes the VF driver to malfunction. Fix this by deferring clearing the mailbox memory until the reset message is received. Fixes: `939b701ad6` ("ixgbe: fix driver behaviour after issuing VFLR") Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-12 15:51:50 -08:00
Michał Mirosław	800b8f637d	i40e: DRY rx_ptype handling code Move rx_ptype extracting to i40e_process_skb_fields() to avoid duplicating the code. Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-12 15:46:02 -08:00
Michał Mirosław	2a508c64ad	i40e: fix VLAN.TCI == 0 RX HW offload This fixes two bugs in hardware VLAN offload: 1. VLAN.TCI == 0 was being dropped 2. there was a race between disabling of VLAN RX feature in hardware and processing RX queue, where packets processed in this window could have their VLAN information dropped Fix moves the VLAN handling into i40e_process_skb_fields() to save on duplicated code. i40e_receive_skb() becomes trivial and so is removed. Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-12 15:40:42 -08:00
Biao Huang	9992f37e34	stmmac: dwmac-mediatek: add support for mt2712 Add Ethernet support for MediaTek SoCs from the mt2712 family Signed-off-by: Biao Huang <biao.huang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-12 15:21:00 -08:00
Stefan Assmann	158daed16e	i40e: fix mac filter delete when setting mac address A previous commit moved the ether_addr_copy() in i40e_set_mac() before the mac filter del/add to avoid a race. However it wasn't taken into account that this alters the mac address being handed to i40e_del_mac_filter(). Also changed i40e_add_mac_filter() to operate on netdev->dev_addr, hopefully that makes the code easier to read. Fixes: `458867b2ca` ("i40e: don't remove netdev->dev_addr when syncing uc list") Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-12 14:54:48 -08:00
Greg Kroah-Hartman	ed0a773bff	phy: for 4.21 ) Change phy set_mode ops to take both mode and setmode as arguments ) Add phy_configure() and phy_validate() API's mostly used for MIPI D-PHY ) Add helpers to get default values of parameters define in MIPI D-PHY spec ) Add driver for TI's CPSW Port PHY Interface Mode selection ) Add driver for Cadence Sierra PHY used with USB and PCIe ) Add driver for Freescale i.MX8MQ USB3 PHY ) Fixes QMP PHY bindings to allow the clocks provided by the PHY to be pointed at in device tree ) Fix for using fully specified regions (in device tree) for configuring the second lane in dual lane PHYs in QMP PHY ) Add support for Allwinner H6 USB2 PHY in phy-sun4i-usb driver ) Update phy-rcar-gen3-usb driver to follow the hardware manual ) Add support for fine grained power management in mapphone-mdm6600 driver Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> -----BEGIN PGP SIGNATURE----- iQJCBAABCgAsFiEEUXMr/TfP2p4suIY5Dlx4XIBNgtkFAlwQj94OHGtpc2hvbkB0 aS5jb20ACgkQDlx4XIBNgtmYrA/9HK7yfOWfNLJ7DB8bi4yPx5hd657sLohSdDEX 0Pxx9PRuxjLYlaHQh8XbHEdvnLIbbGgnHbfvkkHjxJx272G+QAWMi7/SwF3a2ZpY ev7U1gvWvCTqS/3RsEmxmwiX9esszcOgya2ia5TcyvTIHLla0QXTOUCmOQBejWhJ Wfe3siSzxn6O4BITsoRUu+0Ek3bScW/cFdby5B62zBrs912p1qlGx20ihvjywB6I uR5q1w5DxCbC9KZLTkQgfPXki69r1EaCYzdcEKnUON59OXceIy/TKDx/ewZBOjtR /d+lWybsLMF5MXIy2UKlaKxLrk4AVdsQ6zmfU23uTqCT3rWn7OPEvbzSj3nBXpgi SNzBTt9wHT4ZAGOq4fmhgF1Gn1gx4tEcarvAfeyyzv1riVgGbwlxSaUpWWvkya/E zDBqROi/9VU+WX+De+E2KTCAOjJpcnVpLx8euA7uSD2m0nF4Kp4YiyAJiH/ZDQba ghhR4DiD1xhjO77U+tM7mD+68ZyKsQubqzO+h4rztjZ/QRCWqGtOCg9sSQBxSz86 llEklLB3156dkLnHRMd0NIuSqYGOkHLWdTBrXNr2HaK7ZWvN6jxhC2puziIlmIZl 8m94QyY5pmuTJdQsWA/MEoWFglQqLBDxAVuoMH4d/XFx7PBNxiaTGOOEgX7MhfVM 9di9hak= =JLIJ -----END PGP SIGNATURE----- Merge tag 'phy-for-4.21_v1' of git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy into usb-next Kishon writes: phy: for 4.21 ) Change phy set_mode ops to take both mode and setmode as arguments ) Add phy_configure() and phy_validate() API's mostly used for MIPI D-PHY ) Add helpers to get default values of parameters define in MIPI D-PHY spec ) Add driver for TI's CPSW Port PHY Interface Mode selection ) Add driver for Cadence Sierra PHY used with USB and PCIe ) Add driver for Freescale i.MX8MQ USB3 PHY ) Fixes QMP PHY bindings to allow the clocks provided by the PHY to be pointed at in device tree ) Fix for using fully specified regions (in device tree) for configuring the second lane in dual lane PHYs in QMP PHY ) Add support for Allwinner H6 USB2 PHY in phy-sun4i-usb driver ) Update phy-rcar-gen3-usb driver to follow the hardware manual ) Add support for fine grained power management in mapphone-mdm6600 driver Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> * tag 'phy-for-4.21_v1' of git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy: (30 commits) phy: qcom-qmp: Expose provided clocks to DT dt-bindings: phy-qcom-qmp: Move #clock-cells to child phy: qcom-qmp: Utilize fully-specified DT registers dt-bindings: phy-qcom-qmp: Fix register underspecification phy: ti: fix semicolon.cocci warnings phy: dphy: Add configuration helpers phy: Add MIPI D-PHY configuration options phy: Add configuration interface phy: Add MIPI D-PHY mode phy: add driver for Freescale i.MX8MQ USB3 PHY dt-bindings: phy: add binding for Freescale i.MX8MQ USB3 PHY phy: Use of_node_name_eq for node name comparisons net: ethernet: ti: cpsw: add support for port interface mode selection phy dt-bindings: net: ti: cpsw: switch to use phy-gmii-sel phy phy: ti: introduce phy-gmii-sel driver dt-bindings: phy: add cpsw port interface mode selection phy bindings phy: mvebu-cp110-comphy: fix spelling in structure name phy: mapphone-mdm6600: Improve phy related runtime PM calls phy: renesas: rcar-gen3-usb2: follow the hardware manual procedure phy: cadence: Add driver for Sierra PHY ...	2018-12-12 09:26:04 +01:00
Nir Dotan	cf7221a4f5	mlxsw: spectrum_router: Add Multicast routing support for Spectrum-2 Add implementation of Spectrum-2 multicast routes for both IPv4 and IPv6 by using ACL module explicitly. In Spectrum-2, multicast routes are set as ACL rules, so initialization takes care of creating dedicated ACL groups and binding them to the appropriate multicast routing protocol IPv4/IPv6, and afterwards routes configuration translates to setting explicit ACL rules. Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-11 23:01:33 -08:00
Nir Dotan	d7263ab35b	mlxsw: spectrum_acl: Limit priority value In Spectrum-2, higher priority value wins and priority valid values are in the range of {1,cap_kvd_size-1}. mlxsw_sp_acl_tcam_priority_get converts from lower-bound priorities alike tc flower to Spectrum-2 HW range. Up until now tc flower did not provide priority 0 or reached the maximal value, however multicast routing does provide priority 0. Therefore, Change mlxsw_sp_acl_tcam_priority_get to verify priority is in the correct range. Make sure priority is never set to zero and never exceeds the maximal allowed value. Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-11 23:01:33 -08:00
Nir Dotan	c20580c21f	mlxsw: spectrum_acl: Support rule creation without action creation Up until now, when ACL rule was created its action was created with it. It suits well for tc flower where ACL rule always needs an action, however it does not suit multicast router, where the action is created prior to setting a route, which in Spectrum-2 is actually an ACL rule. Add support for rule creation without action creation. Do it by adding afa_block argument to mlxsw_sp_acl_rule_create, which if NULL then an action would be created, also add an indication within struct mlxsw_sp_acl_rule_info that tells if the action should be destroyed when the rule is destroyed. Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-11 23:01:33 -08:00
Nir Dotan	2507a64c17	mlxsw: spectrum_acl: Add replace rule action operation Multicast routes actions may be updated after creation. An example for that is an addition of an egress interface to an existing route. So far, as tc flower API dictated, ACL rules were either created or deleted. Since multicast routes in Spectrum-2 are written to ACL as any rule, it is required to allow the update of a rule's action as it may change. Add methods and operations to support updating rule's action. This is supported only for Spectrum-2. Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-11 23:01:33 -08:00
Nir Dotan	1a29d29394	mlxsw: spectrum_acl: Add multicast router profile operations Add specific ACL operations needed for programming multicast routing ACL groups and routes. Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-11 23:01:33 -08:00
Nir Dotan	add4550fca	mlxsw: spectrum_acl: Add Spectrum-2 keys Add virtual router ID fields to Spectrum-2 key blocks set, as the field is required for multicast routing. Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-11 23:01:33 -08:00
Nir Dotan	254cec1464	mlxsw: spectrum: Change stage of ACL initialization In Spectrum-2, MC routing is implemented using ACL block explicitly, so router initialization should take place after ACL initialization. Set the initialization of the ACL block before IP router initizalization takes place, so multicast router will be able to allocate ACL data structures and create its required chains. Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-11 23:01:33 -08:00
Nir Dotan	a75e41d37a	mlxsw: reg: Add Policy Engine Multicast Router Binding Table Register In Spectrum-2, multicast routing is implemented explicitly using policy engine (ACL) block. PEMRBT register is used to bind a dedicated ACL group to a specific IP protocol. Add the register to be later used in multicast router implementation. Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-11 23:01:33 -08:00
Heiner Kallweit	ee28b30cbb	r8169: fix crash if CONFIG_DEBUG_SHIRQ is enabled If CONFIG_DEBUG_SHIRQ is enabled __free_irq() intentionally fires a spurious interrupt. This interrupt causes a crash because tp->dev->phydev is NULL at that time. Fixes: `38caff5a44` ("r8169: handle all interrupt events in the hard irq handler") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-11 22:54:19 -08:00
Xue Chaojing	e1a76515b0	hinic: optmize rx refill buffer mechanism There is no need to schedule a different tasklet for refill, This patch remove it. Suggested-by: Neil Horman <nhorman@redhat.com> Signed-off-by: Xue Chaojing <xuechaojing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-11 22:52:56 -08:00
Grygorii Strashko	3ff18849eb	net: ethernet: ti: cpsw: add support for port interface mode selection phy Add support for port interface mode selection phy (phy-gmii-sel): - try to request interface mode selection phy from Port DT node and fail silently if not defined and old CONFIG_TI_CPSW_PHY_SEL driver enabled. - use new phy if requested successfully. Cc: Kishon Vijay Abraham I <kishon@ti.com> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-12-12 10:01:42 +05:30
Grygorii Strashko	cccc43b853	phy: mvebu-cp110-comphy: convert to use eth phy mode and submode Convert mvebu-cp110-comphy PHY driver to use recently introduced PHY_MODE_ETHERNET and phy_set_mode_ext(). Cc: Russell King - ARM Linux <linux@armlinux.org.uk> Cc: Maxime Chevallier <maxime.chevallier@bootlin.com> Cc: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-12-12 10:01:35 +05:30
Grygorii Strashko	c8fe6d7f3f	phy: ocelot-serdes: convert to use eth phy mode and submode Convert ocelot-serdes PHY driver to use recently introduced PHY_MODE_ETHERNET and phy_set_mode_ext(). Cc: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Quentin Schulz <quentin.schulz@bootlin.com> Tested-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-12-12 10:01:35 +05:30
Eyal Davidovich	5c7e8bbb02	net/mlx5e: Use monitor counters for update stats - Adding new notifier block (struct mlx5_nb) monitor_counters_nb for handeling MONITOR_COUNTER new event type. - Adding work queue element: monitor_counters_work for re-arm and update stats. - We re-queue the update stat work, only when working over firmware that doesn't support the monitored counters. Signed-off-by: Eyal Davidovich <eyald@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:20 -08:00
Eyal Davidovich	2f8bc4917a	net/mlx5e: Monitor counters commands support new file monitor_stats.c for the new API. add arm_monitor_counter new command support. add set_monitor_counter new command support. The device can monitor specific counters and provide an event to notify when these counters are changed. The monitoring is done in best effort manner where the minimum notification period is 200 ms, however when the device is loaded, the notification might be delayed. To configure the required counters to be monitored, the SET_MONITOR_COUNTER command shall be used with a list of counters to be monitored. The device firmware can monitor up to HCA_CAP.max_num_of_monitor_counters. The configuration is done based on counter type (such as ppcnt, q counter, etc) and additional param according to the type of counter selected. Upon monitor counter change, the device will generate Monitor_Counter_Change event. The device will not generate new events unless the driver re-arms the monitoring functionality, using the ARM_MONITOR_COUNTER command. Signed-off-by: Eyal Davidovich <eyald@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:20 -08:00
Eyal Davidovich	75370eb0d3	net/mlx5e: Avoid query PPCNT register if not supported by the device PPCNT is not supported if PCAM access reg is supported and ppcnt bit is clear. Signed-off-by: Eyal Davidovich <eyald@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:20 -08:00
Daniel Jurgens	939de57d30	net/mlx5e: Use CQE padding for Ethernet CQs Writing 64B CQEs to 128B cache lines results in a RMW operation. Padding the CQEs to 128B if possible improves performance on 128B cache line systems like PPC. Testing on PPC showed up to a 24% improvement in small packet throughput vs the default behavior, depending on the workload and system topology. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:20 -08:00
Eli Britstein	8c4dc42bf6	net/mlx5e: Support multiple encapsulations for a TC flow Currently a flow is associated with a single encap structure. The FW extended destination features enables the driver to associate a flow with multiple encap instances. Change the encap id field from a flow scope to a per destination value in the flow attributes struct. Use the encaps array to associate a flow table entry with multiple encap entries. Update the neigh logic to offload only if all encapsulations used in a flow are connected, and un-offload upon the first one disconnected. Note that the driver can now support up to two encap destinations. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:20 -08:00
Eli Britstein	79baaec719	net/mlx5e: Allow association of a flow to multiple encaps Currently a flow can be associated with a single encap entry. The extended destination feature enables the driver to configure multiple encap entries per flow. Change the encap flow association field to array as a pre-step towards supporting multiple encap destinations. Use only the first array element, with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:20 -08:00
Eli Britstein	98b66cb1c9	net/mlx5e: Change parse attr struct to accommodate multiple tunnel infos Currently the driver can support only a single TC tunnel_set action. Change the tunnel info fields to arrays, as a pre-step to support multiple encapsulations for a single flow, with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:20 -08:00
Eli Britstein	1cc26d74bb	net/mlx5e: Support header rewrite actions with remote port mirroring A rule with the following actions is split to a two level FDB: 1. Forward to local mirror vport 2. Header rewrite 3. Forward to local vport In the first level flow table, forward the packet to the local port and forward the packet to the second level flow table for header rewrite and local port forwarding. This configuration fails when mirroring to a remote encapsulated destination because currently an FTE cannot support encap and table destinations. Use the extended destination capabilities to configure the first level flow table with a multi-destination FTE to the uplink and second level table and the second level flow table for the header rewrite and local port forwarding. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:19 -08:00
Eli Britstein	38c9d2697b	net/mlx5e: Replace the split logic with extended destination Currently the FTE encap flag applies to all destinations. To support mirroring encapsulated traffic to a local port the driver split the two destinations to two flow table entries: Table#0: - FWD to the local vport - Goto table#1 Table#1: - Encap and FWD to wire The firmware extended destination capabilities enable the driver to set an encapsulation flag per destination. Remove the split logic and use the extended destination mechanism instead. Note that split technique is still required for pedit and VLAN push scenarios. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:19 -08:00
Eli Britstein	a18e879d4e	net/mlx5e: Annul encap action ordering requirement Currently a FW syndrome is emitted if the driver configures a multi-destination FTE where the first destination is a tunneled uplink port and the second destination is a local vPort. Support this scenario by creating a multi-destination FTE using the firmware's extended destination capabilities. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:19 -08:00
Eli Britstein	f493f15534	net/mlx5e: Move flow attr reformat action bit to per dest flags Flow attr reformat action bit is moved from the global action bits to a per destination flags field, as a pre-step for adding additional flags to support encapsulation properties per destination, with no functionality change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:19 -08:00
Eli Britstein	df65a573ea	net/mlx5e: Refactor eswitch flow attr for destination specific properties Currently the eswitch flow attr structure stores each destination specific property in its own specific array. Group them in an array of destination structures as a pre-step towards adding additional destination specific field properties. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:19 -08:00
Eli Britstein	e85e02bad2	net/mlx5: E-Switch, Rename esw attr mirror count field The mirror count esw attributes field is used to determine if splitting the rule to two FTEs is required while programming e-switch mirroring. Rename it to split count, making it clearer with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:19 -08:00
Eli Britstein	1228e912c9	net/mlx5: Consider encapsulation properties when comparing destinations Currently the driver identifies identical vport destinations by comparing the vport ID. The FW extended destination feature enables the driver to forward the packet to the same vport with multiple encapsulation properties. Change the vport destination comparison logic to compare the encapsulation properties in addition to the vport ID. Signed-off-by: Eli Britstein <elibr@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-11 14:52:19 -08:00
Jason Gunthorpe	28ab1bb0e8	Linux 4.20-rc6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlwNpb0eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGwGwH/00UHnXfxww3ixxz zwTVDzptA6SPm6s84yJOWatM5fXhPiAltZaHSYF9lzRzNU71NCq7Frhq3fQUIXKM OxqDn9nfSTWcjWTk2q5n2keyRV/KIn67YX7UgqFc1bO/mqtVjEgNWaMyblhI+e9E giu1ZXayHr43jK1cDOmGExZubXUq7Vsc9TOlrd+d2SwIqeEP7TCMrPhnHDwCNvX2 UU5dtANpVzGtHaBcr37wJj+L8kODCc0f+PQ3g2ar5jTHst5SLlHp2u0AMRnUmgdi VkGx+mu/uk8mtwUqMIMqhplklVoqK6LTeLqsY5Xt32SKruw9UqyJGdphLjW2QP/g MkmA1lI= =7kaD -----END PGP SIGNATURE----- Merge tag 'v4.20-rc6' into rdma.git for-next For dependencies in following patches.	2018-12-11 14:24:57 -07:00
David S. Miller	addb067983	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-12-11 The following pull-request contains BPF updates for your net-next tree. It has three minor merge conflicts, resolutions: 1) tools/testing/selftests/bpf/test_verifier.c Take first chunk with alignment_prevented_execution. 2) net/core/filter.c [...] case bpf_ctx_range_ptr(struct __sk_buff, flow_keys): case bpf_ctx_range(struct __sk_buff, wire_len): return false; [...] 3) include/uapi/linux/bpf.h Take the second chunk for the two cases each. The main changes are: 1) Add support for BPF line info via BTF and extend libbpf as well as bpftool's program dump to annotate output with BPF C code to facilitate debugging and introspection, from Martin. 2) Add support for BPF_ALU \| BPF_ARSH \| BPF_{K,X} in interpreter and all JIT backends, from Jiong. 3) Improve BPF test coverage on archs with no efficient unaligned access by adding an "any alignment" flag to the BPF program load to forcefully disable verifier alignment checks, from David. 4) Add a new bpf_prog_test_run_xattr() API to libbpf which allows for proper use of BPF_PROG_TEST_RUN with data_out, from Lorenz. 5) Extend tc BPF programs to use a new __sk_buff field called wire_len for more accurate accounting of packets going to wire, from Petar. 6) Improve bpftool to allow dumping the trace pipe from it and add several improvements in bash completion and map/prog dump, from Quentin. 7) Optimize arm64 BPF JIT to always emit movn/movk/movk sequence for kernel addresses and add a dedicated BPF JIT backend allocator, from Ard. 8) Add a BPF helper function for IR remotes to report mouse movements, from Sean. 9) Various cleanups in BPF prog dump e.g. to make UAPI bpf_prog_info member naming consistent with existing conventions, from Yonghong and Song. 10) Misc cleanups and improvements in allowing to pass interface name via cmdline for xdp1 BPF example, from Matteo. 11) Fix a potential segfault in BPF sample loader's kprobes handling, from Daniel T. 12) Fix SPDX license in libbpf's README.rst, from Andrey. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-10 18:00:43 -08:00
Pieter Jansen van Vuuren	290974d434	nfp: flower: ensure TCP flags can be placed in IPv6 frame Previously we did not ensure tcp flags have a place to be stored when using IPv6. We correct this by including IPv6 key layer when we match tcp flags and the IPv6 key layer has not been included already. Fixes: `07e1671cfc` ("nfp: flower: refactor shared ip header in match offload") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-10 17:45:41 -08:00
Thomas Falcon	1d1bbc37f8	ibmvnic: Fix non-atomic memory allocation in IRQ context ibmvnic_reset allocated new reset work item objects in a non-atomic context. This can be called from a tasklet, generating the output below. Allocate work items with the GFP_ATOMIC flag instead. BUG: sleeping function called from invalid context at mm/slab.h:421 in_atomic(): 1, irqs_disabled(): 1, pid: 93, name: kworker/0:2 INFO: lockdep is turned off. irq event stamp: 66049 hardirqs last enabled at (66048): [<c000000000122468>] tasklet_action_common.isra.12+0x78/0x1c0 hardirqs last disabled at (66049): [<c000000000befce8>] _raw_spin_lock_irqsave+0x48/0xf0 softirqs last enabled at (66044): [<c000000000a8ac78>] dev_deactivate_queue.constprop.28+0xc8/0x160 softirqs last disabled at (66045): [<c0000000000306e0>] call_do_softirq+0x14/0x24 CPU: 0 PID: 93 Comm: kworker/0:2 Kdump: loaded Not tainted 4.20.0-rc6-00001-g1b50a8f03706 #7 Workqueue: events linkwatch_event Call Trace: [c0000003fffe7ae0] [c000000000bc83e4] dump_stack+0xe8/0x164 (unreliable) [c0000003fffe7b30] [c00000000015ba0c] ___might_sleep+0x2dc/0x320 [c0000003fffe7bb0] [c000000000391514] kmem_cache_alloc_trace+0x3e4/0x440 [c0000003fffe7c30] [d000000005b2309c] ibmvnic_reset+0x16c/0x360 [ibmvnic] [c0000003fffe7cc0] [d000000005b29834] ibmvnic_tasklet+0x1054/0x2010 [ibmvnic] [c0000003fffe7e00] [c0000000001224c8] tasklet_action_common.isra.12+0xd8/0x1c0 [c0000003fffe7e60] [c000000000bf1238] __do_softirq+0x1a8/0x64c [c0000003fffe7f90] [c0000000000306e0] call_do_softirq+0x14/0x24 [c0000003f3967980] [c00000000001ba50] do_softirq_own_stack+0x60/0xb0 [c0000003f39679c0] [c0000000001218a8] do_softirq+0xa8/0x100 [c0000003f39679f0] [c000000000121a74] __local_bh_enable_ip+0x174/0x180 [c0000003f3967a60] [c000000000bf003c] _raw_spin_unlock_bh+0x5c/0x80 [c0000003f3967a90] [c000000000a8ac78] dev_deactivate_queue.constprop.28+0xc8/0x160 [c0000003f3967ad0] [c000000000a8c8b0] dev_deactivate_many+0xd0/0x520 [c0000003f3967b70] [c000000000a8cd40] dev_deactivate+0x40/0x60 [c0000003f3967ba0] [c000000000a5e0c4] linkwatch_do_dev+0x74/0xd0 [c0000003f3967bd0] [c000000000a5e694] __linkwatch_run_queue+0x1a4/0x1f0 [c0000003f3967c30] [c000000000a5e728] linkwatch_event+0x48/0x60 [c0000003f3967c50] [c0000000001444e8] process_one_work+0x238/0x710 [c0000003f3967d20] [c000000000144a48] worker_thread+0x88/0x4e0 [c0000003f3967db0] [c00000000014e3a8] kthread+0x178/0x1c0 [c0000003f3967e20] [c00000000000bfd0] ret_from_kernel_thread+0x5c/0x6c Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-10 17:34:25 -08:00
Thomas Falcon	6c5c748908	ibmvnic: Convert reset work item mutex to spin lock ibmvnic_reset can create and schedule a reset work item from an IRQ context, so do not use a mutex, which can sleep. Convert the reset work item mutex to a spin lock. Locking debugger generated the trace output below. BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908 in_atomic(): 1, irqs_disabled(): 1, pid: 120, name: kworker/8:1 4 locks held by kworker/8:1/120: #0: 0000000017c05720 ((wq_completion)"events"){+.+.}, at: process_one_work+0x188/0x710 #1: 00000000ace90706 ((linkwatch_work).work){+.+.}, at: process_one_work+0x188/0x710 #2: 000000007632871f (rtnl_mutex){+.+.}, at: rtnl_lock+0x30/0x50 #3: 00000000fc36813a (&(&crq->lock)->rlock){..-.}, at: ibmvnic_tasklet+0x88/0x2010 [ibmvnic] irq event stamp: 26293 hardirqs last enabled at (26292): [<c000000000122468>] tasklet_action_common.isra.12+0x78/0x1c0 hardirqs last disabled at (26293): [<c000000000befce8>] _raw_spin_lock_irqsave+0x48/0xf0 softirqs last enabled at (26288): [<c000000000a8ac78>] dev_deactivate_queue.constprop.28+0xc8/0x160 softirqs last disabled at (26289): [<c0000000000306e0>] call_do_softirq+0x14/0x24 CPU: 8 PID: 120 Comm: kworker/8:1 Kdump: loaded Not tainted 4.20.0-rc6 #6 Workqueue: events linkwatch_event Call Trace: [c0000003fffa7a50] [c000000000bc83e4] dump_stack+0xe8/0x164 (unreliable) [c0000003fffa7aa0] [c00000000015ba0c] ___might_sleep+0x2dc/0x320 [c0000003fffa7b20] [c000000000be960c] __mutex_lock+0x8c/0xb40 [c0000003fffa7c30] [d000000006202ac8] ibmvnic_reset+0x78/0x330 [ibmvnic] [c0000003fffa7cc0] [d0000000062097f4] ibmvnic_tasklet+0x1054/0x2010 [ibmvnic] [c0000003fffa7e00] [c0000000001224c8] tasklet_action_common.isra.12+0xd8/0x1c0 [c0000003fffa7e60] [c000000000bf1238] __do_softirq+0x1a8/0x64c [c0000003fffa7f90] [c0000000000306e0] call_do_softirq+0x14/0x24 [c0000003f3f87980] [c00000000001ba50] do_softirq_own_stack+0x60/0xb0 [c0000003f3f879c0] [c0000000001218a8] do_softirq+0xa8/0x100 [c0000003f3f879f0] [c000000000121a74] __local_bh_enable_ip+0x174/0x180 [c0000003f3f87a60] [c000000000bf003c] _raw_spin_unlock_bh+0x5c/0x80 [c0000003f3f87a90] [c000000000a8ac78] dev_deactivate_queue.constprop.28+0xc8/0x160 [c0000003f3f87ad0] [c000000000a8c8b0] dev_deactivate_many+0xd0/0x520 [c0000003f3f87b70] [c000000000a8cd40] dev_deactivate+0x40/0x60 [c0000003f3f87ba0] [c000000000a5e0c4] linkwatch_do_dev+0x74/0xd0 [c0000003f3f87bd0] [c000000000a5e694] __linkwatch_run_queue+0x1a4/0x1f0 [c0000003f3f87c30] [c000000000a5e728] linkwatch_event+0x48/0x60 [c0000003f3f87c50] [c0000000001444e8] process_one_work+0x238/0x710 [c0000003f3f87d20] [c000000000144a48] worker_thread+0x88/0x4e0 [c0000003f3f87db0] [c00000000014e3a8] kthread+0x178/0x1c0 [c0000003f3f87e20] [c00000000000bfd0] ret_from_kernel_thread+0x5c/0x6c Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-10 17:34:25 -08:00
Oz Shlomo	df2ef3bff1	net/mlx5e: Add GRE protocol offloading Add HW offloading support for TC flower filters configured on gretap/ip6gretap net devices. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-10 15:53:04 -08:00
Oz Shlomo	0621e6fc5e	net: Add netif_is_gretap()/netif_is_ip6gretap() Changed the is_gretap_dev and is_ip6gretap_dev logic from structure comparison to string comparison of the rtnl_link_ops kind field. This approach aligns with the current identification methods and function names of vxlan and geneve network devices. Convert mlxsw to use these helpers and use them in downstream mlx5 patch. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-10 15:53:04 -08:00
Oz Shlomo	101f4de9dd	net/mlx5e: Move TC tunnel offloading code to separate source file Move tunnel offloading related code to a separate source file for better code maintainability. Code refactoring with no functional change. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-10 15:53:04 -08:00
Oz Shlomo	54c177ca9c	net/mlx5e: Branch according to classified tunnel type Currently the tunnel offloading encap/decap methods assumes that VXLAN is the sole tunneling protocol. Lay the infrastructure for supporting multiple tunneling protocols by branching according to the tunnel net device kind. Encap filters tunnel type is determined according to the egress/mirred net device. Decap filters classify the tunnel type according to the filter's ingress net device kind. Distinguish between the tunnel type as defined by the SW model and the FW reformat type that specifies the HW operation being made. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-10 15:53:04 -08:00
Oz Shlomo	4d70564d1c	net/mlx5e: Refactor VXLAN tunnel decap offloading code Separates the vxlan header match handling from the matching on the general fields of ipv4/6 tunnels, thus allowing the common IP tunnel match code to branch in down stream patch, to multiple IP tunnels. This patch doesn't add any functionality. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-10 15:53:04 -08:00

... 3 4 5 6 7 ...

26391 Commits