linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Kiran Kumar	34425e8c75	octeontx2-af: Support to get NIX HW constants from AF This patch adds reading HW limits like number of Rx/Tx stats, number of queue IRQs supported per NIX LF from AF registers and sync them to PF/VF. Signed-off-by: Kiran Kumar <kirankumark@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:56:07 -08:00
Sunil Goutham	9b7dd87ac0	octeontx2-af: Support to modify min/max allowed packet lengths This patch adds support for RVU PF/VFs to modify min/max packet lengths allowed by HW. For VFs on PF0, settings will be automatically applied on LBK link. RX link's min/maxlen is configured to min/max of PF and it's all VFs. On the TX side if requested all SMQs attached to the requesting NIXLF will be updated with new min/max lengths. Also updates transmit credits for Tx links based on new maxlen. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:56:07 -08:00
Sunil Goutham	eac66686c6	octeontx2-af: Convert mbox handlers APIs to lowercase This patch converts all mailbox message handler API names to lowercase. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:56:07 -08:00
Heiner Kallweit	55d2ad7b90	r8169: improve chip version identification Only the upper 12 bits are used for chip identification, this helps to reduce the size of array mac_info. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:32:15 -08:00
Heiner Kallweit	3c72bf71a9	r8169: simplify ocp functions rtl8168_oob_notify is used in rtl8168dp_driver_start and rtl8168dp_driver_stop only, so we can rename it to r8168dp_oob_notify. The same applies to condition rtl_ocp_read_cond which can be renamed to rtl_dp_ocp_read_cond. This allows to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:32:15 -08:00
Heiner Kallweit	8b6dd85666	r8169: remove workaround for ancient gcc bug The kernel can't be built any longer with this ancient GCC version. Eventually it becomes clear what this statement actually does. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:32:15 -08:00
Heiner Kallweit	ad45ff0c12	r8169: remove manual padding in struct ring_info The compiler takes care of alignment and padding, I see no need to bother him with manual hints. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:32:15 -08:00
Heiner Kallweit	b10ceb5571	r8169: remove "not PCI Express" message The ones who want to know can easily identify whether chip is PCI or PCIe based on the chip name. I doubt there's any benefit in this message, so remove it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:32:15 -08:00
Heiner Kallweit	8c0511ec52	r8169: remove print_mac_version The syslog message printed on driver load allows to easily identify the mac version number (based on chip name and XID). So we don't need this extra debug message which is wrong anyway because e.g. RTL_GIGA_MAC_VER_01 has value 0. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:32:14 -08:00
Heiner Kallweit	6f0d308855	r8169: use PCI_VDEVICE macro Using macro PCI_VDEVICE helps to simplify the PCI ID table. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:32:14 -08:00
Heiner Kallweit	559c3c046d	r8169: replace event_slow with irq_mask Recently the "slow event" handler was removed, therefore the member name isn't appropriate any longer. In addition store the full mask, including the RTL_EVENT_NAPI interrupt source bits. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:32:14 -08:00
Heiner Kallweit	97ad92f283	r8169: remove unused interrupt sources Setting PCSTimeout interrupt source was copied from the vendor driver which uses the chip programmable timer interrupt. The mainline driver doesn't use this timer interrupt. SYSErr indicates a PCI error and isn't defined on the PCIe models. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:32:14 -08:00
Heiner Kallweit	0f07bd850d	r8169: use dev_get_drvdata where possible Using dev_get_drvdata directly is simpler here. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:32:14 -08:00
Heiner Kallweit	fe716f8a33	r8169: merge rtl_irq_enable and rtl_irq_enable_all After the recent changes to the interrupt handler rtl_irq_enable and rtl_irq_enable_all can be merged. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 17:32:14 -08:00
Shay Agroskin	9184e51b5b	net/mlx5e: Fix failing ethtool query on FEC query error If FEC caps query fails when executing 'ethtool <interface>' the whole callback fails unnecessarily, fixed that by replacing the error return code with debug logging only. Fixes: `6cfa946050` ("net/mlx5e: Ethtool driver callback for query/set FEC policy") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 15:33:31 -08:00
Shay Agroskin	64e2833484	net/mlx5e: Removed unnecessary warnings in FEC caps query Querying interface FEC caps with 'ethtool [int]' after link reset throws a warning regarding link speed. This warning is not needed as there is already an indication in user space that the link is not up. Fixes: `0696d60853` ("net/mlx5e: Receive buffer configuration") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 15:33:31 -08:00
Shay Agroskin	febd72f27c	net/mlx5e: Fix wrong field name in FEC related functions This bug would result in reading wrong FEC capabilities for 10G/40G. Fixes: `2095b26414` ("net/mlx5e: Add port FEC get/set functions") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 15:33:31 -08:00
Shay Agroskin	9cdeaab3b7	net/mlx5e: Fix a bug in turning off FEC policy in unsupported speeds Some speeds don't support turning FEC policy off. In case a requested FEC policy is not supported for a speed (including current speed), its new FEC policy would be: no FEC, if disabling FEC is supported for that speed; unchanged otherwise. Fixes: `2095b26414` ("net/mlx5e: Add port FEC get/set functions") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 15:33:31 -08:00
Arthur Kiyanovski	4c23738a3f	net: ena: update driver version from 2.0.1 to 2.0.2 Update driver version due to critical bug fixes. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 15:13:00 -08:00
Arthur Kiyanovski	58a54b9c62	net: ena: fix crash during ena_remove() In ena_remove() we have the following call stack: ena_remove() unregister_netdev() ena_destroy_device() netif_carrier_off() Calling netif_carrier_off() causes linkwatch to try to handle the link change event on the already unregistered netdev, which leads to a read from an unreadable memory address. This patch switches the order of the two functions, so that netif_carrier_off() is called on a registered netdev. To accomplish this fix we also had to: 1. Remove the set bit ENA_FLAG_TRIGGER_RESET 2. Add a sanity check in ena_close() to prevent a double device reset (when calling unregister_netdev() ena_close is called, but the device was already deleted in ena_destroy_device()). 3. Set the admin_queue running state to false to avoid using it after the device was reset (for example when calling ena_destroy_all_io_queues() right after ena_com_dev_reset() in ena_down) Fixes: `944b28aa29` ("net: ena: fix missing lock during device destruction") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 15:13:00 -08:00
Arthur Kiyanovski	e76ad21d07	net: ena: fix crash during failed resume from hibernation During resume from hibernation if ena_restore_device fails, ena_com_dev_reset() is called, and uses the readless read mechanism, which was already destroyed by the call to ena_com_mmio_reg_read_request_destroy(). This causes a NULL pointer reference. In this commit we switch the call order of the above two functions to avoid this crash. Fixes: `d7703ddbd7` ("net: ena: fix rare bug when failed restart/resume is followed by driver removal") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-19 15:13:00 -08:00
Valentine Fatiev	228c4cd04d	net/mlx5e: Fix selftest for small MTUs Loopback test had fixed packet size, which can be bigger than configured MTU. Shorten the loopback packet size to be bigger than minimal MTU allowed by the device. Text field removed from struct 'mlx5ehdr' as redundant to allow send small packets as minimal allowed MTU. Fixes: `d605d66` ("net/mlx5e: Add support for ethtool self diagnostics test") Signed-off-by: Valentine Fatiev <valentinef@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 14:35:04 -08:00
Moshe Shemesh	0073c8f727	net/mlx5e: RX, verify received packet size in Linear Striding RQ In case of striding RQ, we use MPWRQ (Multi Packet WQE RQ), which means that WQE (RX descriptor) can be used for many packets and so the WQE is much bigger than MTU. In virtualization setups where the port mtu can be larger than the vf mtu, if received packet is bigger than MTU, it won't be dropped by HW on too small receive WQE. If we use linear SKB in striding RQ, since each stride has room for mtu size payload and skb info, an oversized packet can lead to crash for crossing allocated page boundary upon the call to build_skb. So driver needs to check packet size and drop it. Introduce new SW rx counter, rx_oversize_pkts_sw_drop, which counts the number of packets dropped by the driver for being too large. As a new field is added to the RQ struct, re-open the channels whenever this field is being used in datapath (i.e., in the case of linear Striding RQ). Fixes: `619a8f2a42` ("net/mlx5e: Use linear SKB in Striding RQ") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 14:35:04 -08:00
Roi Dayan	1392f44bba	net/mlx5e: Apply the correct check for supporting TC esw rules split The mirror and not the output count is the one denoting a split. Fix to condition the offload attempt on the mirror count being > 0 along the firmware to have the related capability. Fixes: `592d365159` ("net/mlx5e: Parse mirroring action for offloaded TC eswitch flows") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Yossi Kuperman <yossiku@mellanox.com> Reviewed-by: Chris Mi <chrism@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 14:35:04 -08:00
Yuval Avnery	a1f240f180	net/mlx5e: Adjust to max number of channles when re-attaching When the core driver enters the detach/attach flow after a PCI reset, the number of logical CPUs may have changed. As a result we need to update the cpu affiliated resource tables. 1. indirect rqt list 2. eq table Reproduction (PowerPC): echo 1000 > /sys/kernel/debug/powerpc/eeh_max_freezes ppc64_cpu --smt=on # Restart driver modprobe -r ... ; modprobe ... # Link up ifconfig ... # Only physical CPUs ppc64_cpu --smt=off # Inject PCI errors so PCI will reset - calling the pci error handler echo 0x8000000000000000 > /sys/kernel/debug/powerpc/<PCI BUS>/err_injct_inboundA Call trace when trying to add non-existing rqs to an indirect rqt: mlx5e_redirect_rqt+0x84/0x260 [mlx5_core] (unreliable) mlx5e_redirect_rqts+0x188/0x190 [mlx5_core] mlx5e_activate_priv_channels+0x488/0x570 [mlx5_core] mlx5e_open_locked+0xbc/0x140 [mlx5_core] mlx5e_open+0x50/0x130 [mlx5_core] mlx5e_nic_enable+0x174/0x1b0 [mlx5_core] mlx5e_attach_netdev+0x154/0x290 [mlx5_core] mlx5e_attach+0x88/0xd0 [mlx5_core] mlx5_attach_device+0x168/0x1e0 [mlx5_core] mlx5_load_one+0x1140/0x1210 [mlx5_core] mlx5_pci_resume+0x6c/0xf0 [mlx5_core] Create cq will fail when trying to use non-existing EQ. Fixes: `89d44f0a6c` ("net/mlx5_core: Add pci error handlers to mlx5_core driver") Signed-off-by: Yuval Avnery <yuvalav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 14:35:04 -08:00
Or Gerlitz	83621b7df6	net/mlx5e: Always use the match level enum when parsing TC rule match We get the match level (none, l2, l3, l4) while going over the match dissectors of an offloaded tc rule. When doing this, the match level enum values, and not the min inline enum values, should be used; fix that. This worked accidentally because both enums have the same numerical values. Fixes: `d708f90298` ('net/mlx5e: Get the required HW match level while parsing TC flow matches') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 14:35:04 -08:00
Or Gerlitz	077ecd785d	net/mlx5e: Claim TC hw offloads support only under a proper build config Currently, we are only supporting tc hw offloads when the eswitch support is compiled in, but we are not gating the advertisement of the NETIF_F_HW_TC feature on this config being set. Fix it, and while doing that, also avoid dealing with the feature on ethtool when the config is not set. Fixes: `e8f887ac6a` ('net/mlx5e: Introduce tc offload support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 14:35:04 -08:00
Or Gerlitz	d3a80bb5a3	net/mlx5e: Don't match on vlan non-existence if ethertype is wildcarded For the "all" ethertype we should not care whether the packet has vlans. Besides being wrong, the way we did it caused FW error for rules such as: tc filter add dev eth0 protocol all parent ffff: \ prio 1 flower skip_sw action drop b/c the matching meta-data (outer headers bit in struct mlx5_flow_spec) wasn't set. Fix that by matching on vlan non-existence only if we were also told to match on the ethertype. Fixes: `cee2648762` ('net/mlx5e: Set vlan masks for all offloaded TC rules') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Slava Ovsiienko <viacheslavo@mellanox.com> Reviewed-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 14:35:04 -08:00
Denis Drozdov	acf3766b36	net/mlx5e: IPoIB, Reset QP after channels are closed The mlx5e channels should be closed before mlx5i_uninit_underlay_qp puts the QP into RST (reset) state during mlx5i_close. Currently QP state incorrectly set to RST before channels got deactivated and closed, since mlx5_post_send request expects QP in RTS (Ready To Send) state. The fix is to keep QP in RTS state until mlx5e channels get closed and to reset QP afterwards. Also this fix is simply correct in order to keep the open/close flow symmetric, i.e mlx5i_init_underlay_qp() is called first thing at open, the correct thing to do is to call mlx5i_uninit_underlay_qp() last thing at close, which is exactly what this patch is doing. Fixes: `dae37456c8` ("net/mlx5: Support for attaching multiple underlay QPs to root flow table") Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 14:35:04 -08:00
Raed Salem	f2b18732ee	net/mlx5: IPSec, Fix the SA context hash key The commit "net/mlx5: Refactor accel IPSec code" introduced a bug where asynchronous short time change in hash key value by create/release SA context might happen during an asynchronous hash resize operation this could cause a subsequent remove SA context operation to fail as the key value used during resize is not the same key value used when remove SA context operation is invoked. This commit fixes the bug by defining the SA context hash key such that it includes only fields that never change during the lifetime of the SA context object. Fixes: `d6c4f0298c` ("net/mlx5: Refactor accel IPSec code") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-11-19 14:35:04 -08:00
David S. Miller	f2be6d710d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-11-19 10:55:00 -08:00
Shalom Toledo	bae4e10983	mlxsw: spectrum: Expose discard counters via ethtool Expose packets discard counters via ethtool to help with debugging. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-18 19:02:07 -08:00
thesven73@gmail.com	cddaf02bcb	tg3: optionally use eth_platform_get_mac_address() to get mac address This function will try to determine the mac address via the devicetree, or via an architecture-specific method (e.g. a PROM on SPARC). The SPARC-specific code in this driver (#ifdef SPARC) did exactly this, and is therefore removed. Note that you can now specify the tg3 mac address via the devicetree, on any platform, not just SPARC: Devicetree example: (see Documentation/devicetree/bindings/pci/pci.txt) &pcie { host@0 { #address-cells = <3>; #size-cells = <2>; reg = <0 0 0 0 0>; bcm5778: bcm5778@0 { reg = <0 0 0 0 0>; mac-address = [CA 11 AB 1E 10 01]; }; }; }; Signed-off-by: Sven Van Asbroeck <svendev@arcx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-18 12:57:59 -08:00
Doug Berger	c5a54bbcec	net: bcmgenet: abort suspend on error If an error occurs during suspension of the driver the driver should restore the hardware configuration and return an error to force the system to resume. Fixes: `0db55093b5` ("net: bcmgenet: return correct value 'ret' from bcmgenet_power_down") Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-17 22:04:39 -08:00
Doug Berger	a94cbf03eb	net: bcmgenet: code movement This commit switches the order of bcmgenet_suspend and bcmgenet_resume in the file to prevent the need for a forward declaration in the next commit and to make the review of that commit easier. Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-17 22:04:38 -08:00
Yunsheng Lin	cdca4c485d	net: hns3: up/down netdev in hclge module when setting mtu Currently the netdev is brought down in the enet module, before the mtu range check in the hclge module, which may cause the netdev to be brought down unnecessarily. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-17 21:57:29 -08:00
Yunsheng Lin	818f167587	net: hns3: Add mtu setting support for vf The patch adds mtu setting support for vf, currently vf and pf share the same hardware mtu setting. Mtu set by vf must be less than or equal to pf's mtu, and mtu set by pf must be greater than or equal to vf's mtu. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-17 21:57:29 -08:00
Yunsheng Lin	a6d818e31d	net: hns3: Add vport alive state checking support Currently there is no way for pf to know if a vf device is alive or not, so PF does not know which vf to notify when reset happens, or which vf's mtu is invalid when vf and pf share the same hardware mtu setting. This patch adds vport alive state checking support, in order to support the above scenario. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-17 21:57:29 -08:00
Yunsheng Lin	e6d7d79d3e	net: hns3: Refactor mac mtu setting related functions This patch refactors mac mtu setting related functions, normalizes the use of mps and mtu. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-17 21:57:29 -08:00
Yunsheng Lin	a0b4371751	net: hns3: Support two vlan header when setting mtu This patch adds support for two vlan headers when setting mtu. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-17 21:57:29 -08:00
Rob Herring	d7b4a2f232	net: fsl: Use device_type helpers to access the node type Remove directly accessing device_node.type pointer and use the accessors instead. This will eventually allow removing the type pointer. Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-17 21:52:58 -08:00
Colin Ian King	7c460cf9cd	net: aquantia: fix spelling mistake "specfield" -> "specified" There is a spelling mistake in a netdev_err message. Fix this. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-17 21:36:09 -08:00
YueHaibing	098aafaa68	net: aquantia: cleanup err handing in hw_atl_utils_fw_rpc_wait 'err' is always 0 in the two places. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-17 21:15:04 -08:00
Arjun Vynipadath	2391b0030e	cxgb4: Remove SGE_HOST_PAGE_SIZE dependency on page size The SGE Host Page Size has nothing to do with the actual Host Page Size. It's the SGE's BAR2 Doorbell/GTS Page Size for interpreting the SGE Ingress/Egress Queue per Page values. Firmware reads all of these things and makes all the subsequent changes necessary. The Host Driver uses the SGE Host Page Size in order to properly calculate BAR2 Offsets. Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-17 20:36:25 -08:00
Ioana Ciocoi Radulescu	569dac6a5a	dpaa2-eth: bql support Add support for byte queue limit. On NAPI poll, we save the total number of Tx confirmed frames/bytes and register them with bql at the end of the poll function. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-16 20:12:31 -08:00
Ioana Ciocoi Radulescu	dbcdf72898	dpaa2-eth: Update callback signature Change the frame consume callback signature: * the entire FQ structure is passed to the callback instead of just the queue index * the NAPI structure can be easily obtained from the channel it is associated to, so we don't need to pass it explicitly Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-16 20:12:31 -08:00
Ioana Ciocoi Radulescu	b0e4f37b01	dpaa2-eth: Don't use multiple queues per channel The DPNI object on which we build a network interface has a certain number of {Rx, Tx, Tx confirmation} frame queues as resources. The default hardware setup offers one queue of each type, as well as one DPCON channel, for each core available in the system. There are however cases where the number of queues is greater than the number of cores or channels. Until now, we configured and used all the frame queues associated with a DPNI, even if it meant assigning multiple queues of one type to the same channel. Update the driver to only use a number of queues equal to the number of channels, ensuring each channel will contain exactly one Rx and one Tx confirmation queue. From the user viewpoint, this change is completely transparent. Performance wise there is no impact in most scenarios. In case the number of queues is larger than and not a multiple of the number of channels, Rx hash distribution offers now better load balancing between cores, which can have a positive impact on overall system performance. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-16 20:12:31 -08:00
Christophe JAILLET	06bc4d0079	net: lantiq: Fix returned value in case of error in 'xrx200_probe()' Return 'err' in the error handling path instead of 0. Return explicitly 0 in the normal path, instead of 'err', which is known to be 0 at this point. Fixes: `fe1a56420c` ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-16 19:46:49 -08:00
Colin Ian King	790cd1a8f0	net: hns3: fix spelling mistake "failded" -> "failed" Trivial fix, the spelling of "failded" is incorrect in dev_err and dev_warn messages. Fix this. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-16 19:34:50 -08:00
Maxime Chevallier	83e65df6df	net: mvneta: Don't advertise 2.5G modes Using 2.5G speed relies on the SerDes lanes being configured accordingly. The lanes have to be reconfigured to switch between 1G and 2.5G, and for now only the bootloader does this configuration. In the case we add a Comphy driver to handle switching the lanes dynamically, it's better for now to stick with supporting only 1G and add advertisement for 2.5G once we really are capable of handling both speeds without problem. Since the interface mode is initially taken from the DT, we want to make sure that adding comphy support won't break boards that don't update their dtb. Fixes: `da58a931f2` ("net: mvneta: Add support for 2500Mbps SGMII") Reported-by: Andrew Lunn <andrew@lunn.ch> Reported-by: Russell King <linux@armlinux.org.uk> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-16 19:23:45 -08:00
Andrew Morton	a97b956533	drivers/net/ethernet/qlogic/qed/qed_rdma.h: fix typo Add missing semicolon. Fixes: `291d57f67d` ("qed: Fix rdma_info structure allocation") Cc: Michal Kalderon <michal.kalderon@cavium.com> Cc: Denis Bolotin <denis.bolotin@cavium.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 16:21:00 -08:00
Aya Levin	a463146e67	net/mlx4: Fix UBSAN warning of signed integer overflow UBSAN: Undefined behavior in drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:626:29 signed integer overflow: 1802201963 + 1802201963 cannot be represented in type 'int' The union of res_reserved and res_port_rsvd[MLX4_MAX_PORTS] monitors granting of reserved resources. The grant operation is calculated and protected, thus both members of the union cannot be negative. Changed type of res_reserved and of res_port_rsvd[MLX4_MAX_PORTS] from signed int to unsigned int, allowing large value. Fixes: `5a0d0a6161` ("mlx4: Structures and init/teardown for VF resource quotas") Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 16:09:31 -08:00
Tariq Toukan	3ea7e7ea53	net/mlx4_core: Fix uninitialized variable compilation warning Initialize the uid variable to zero to avoid the compilation warning. Fixes: `7a89399ffa` ("net/mlx4: Add mlx4_bitmap zone allocator") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 16:09:31 -08:00
Jack Morgenstein	bd85fbc203	net/mlx4_core: Zero out lkey field in SW2HW_MPT fw command When re-registering a user mr, the mpt information for the existing mr when running SRIOV is obtained via the QUERY_MPT fw command. The returned information includes the mpt's lkey. This retrieved mpt information is used to move the mpt back to hardware ownership in the rereg flow (via the SW2HW_MPT fw command when running SRIOV). The fw API spec states that for SW2HW_MPT, the lkey field must be zero. Any ConnectX-3 PF driver which checks for strict spec adherence will return failure for SW2HW_MPT if the lkey field is not zero (although the fw in practice ignores this field for SW2HW_MPT). Thus, in order to conform to the fw API spec, set the lkey field to zero before invoking SW2HW_MPT when running SRIOV. Fixes: `e630664c83` ("mlx4_core: Add helper functions to support MR re-registration") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 16:09:30 -08:00
David S. Miller	7e18750cda	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2018-11-14 This series contains updates to i40e and virtchnl. Lance Roy updates i40e to use lockdep_assert_held() instead of spin_is_locked(), since it is better suited to check locking requirements. Jan improves the code readability in XDP by adding the use of a local variable. Provides protection on methods that create/modify/destroy VF's via locking mechanism to prevent unstable behaviour and potential kernel panics. Krzysztof adds a hardware capability flag to indicate whether firmware supports stopping the LLDP agent. Patryk replaces the use of strncpy() with strlcpy() to ensure the buffer is NULL terminated. Mitch fixes the issue of trying to start nway on devices that do not support auto-negotiation, by checking the autoneg state before attempting to restart nway. Alice updates virtchnl to keep the checks all together for ease of readability and consistency. Also fixed an "off by one" error in the number of traffic classes being calculated. Richard fixed VF port VLANs, where the priority bits were incorrectly set because the incorrect shift and mask bits were being used. Alan adds a bit to set and check if a timeout recovery is already pending to prevent overlapping transmit timeout recovery. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 15:05:11 -08:00
Jiri Pirko	c22291f7cf	mlxsw: spectrum: acl: Implement delta for ERP Allow ERP sharing for multiple mask. Do it by properly implementing delta_create() objagg object. Use the computed delta info for inserting rules in A-TCAM. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 14:43:43 -08:00
Jiri Pirko	c293ba3403	mlxsw: spectrum: acl: Push code related to num_ctcam_erps inc/dec into separate helpers Later on the same code is going to be needed for deltas as well. So push the procedures related to increment and decrement of num_ctcam_erps into separate helpers. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 14:43:43 -08:00
Jiri Pirko	59600844cf	mlxsw: spectrum: acl: Remove mlxsw_afk_encode() block range args and key/mask check Since two remaining users of mlxsw_afk_encode() do not specify block ranges to work on, remove the args. Also, key/mask is always non-NULL now, so skip the checks. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 14:43:43 -08:00
Jiri Pirko	b1ce60e621	mlxsw: spectrum: acl: Don't encode the key again in mlxsw_sp_acl_atcam_12kb_lkey_id_get() No need to do key encoding again in mlxsw_sp_acl_atcam_12kb_lkey_id_get(). Instead of that, introduce a new helper that would just clear unused blocks. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 14:43:43 -08:00
Jiri Pirko	3bc6f3858a	mlxsw: core_acl: Change order of args of ops->encode_block() Change order so it is aligned with the usual case where the "write_to" buffer comes as the first arg. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 14:43:43 -08:00
Jiri Pirko	d07cd66060	mlxsw: spectrum: acl: Pass key pointer to master_mask_set/clear The device requires that the master mask of each region will be composed from a logical OR between all the unmasked bits in the region. Currently, this is just a logical OR between all the eRPs used in the region, but the next patch is going to introduce delta bits support which need to be taken into account as well. Since the eRP does not include the delta bits, pass the key pointer to mlxsw_sp_acl_erp_master_mask_set/clear instead. Convert key->mask to the bitmap on fly. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 14:43:43 -08:00
Jiri Pirko	c71abd7d94	mlxsw: spectrum: acl_erp: Convert to use objagg for tracking ERPs Currently the ERPs are tracked internally in a hashtable. Benefit from the newly introduced objagg library and use it to track ERPs. At this point, there is no nesting of objects done, as the delta_create callback always returns -EOPNOTSUPP. On the way, add "mask" into ERP mask get and set functions and struct names. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 14:43:43 -08:00
David S. Miller	f0739e6517	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2018-11-13 This series contains updates to the ice driver only. Brett cleans up debug print messages by removing useless or duplicate messages, and makes sure we assign the hardware head pointer to head instead of the software head pointer. Resolved an issue when disabling SRIOV we were trying to stop queues multiple times, so make sure we disable SRIOV before stopping transmit and receive queues for VF. Tony fixes a potential NULL pointer dereference during a VF reset. Anirudh resolves an issue where we were releasing the VSI before removing the VSI scheduler node, which was resulting in an error "Failed to set LAN Tx queue context, error: -1". Also fixed the guaranteed number of VSIs available and used by discovering the device capabilities to determine the 'guar_num_vsi' per function, rather than always using the theoretical max number of VSIs every time. Dave avoids a deadlock by nesting RTNL locking, so added a boolean to determine if the RTNL lock is already held. Lev fixes bad mask values which would break compilation. Piotr increases the receive queue disable timeout since it can take additional time to finish all pending queue requests. Usha resolves an issue of VLAN priority tagged traffic not appearing on all traffic classes, which was causing ETS bandwidth shaping to not work as expected. Henry fixes the reset path to cleanup the old scheduler tree before rebuilding it. Md Fahad removes an unnecessary check which was causing a driver load error on platforms with more than 128 cores. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 11:23:30 -08:00
Ganesh Goudar	ebcd210e93	cxgb4: fix thermal zone build error With CONFIG_THERMAL=m and cxgb4 built-in, the build fails. Commit `e70a57fa59` ("cxgb4: fix thermal configuration dependencies") tries to fix it, but when cxgb4i is made built-in the build fails again. Use IS_REACHABLE instead of IS_ENABLED to fix the issue. Fixes: `e70a57fa59` (cxgb4: fix thermal configuration dependencies) Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 09:49:07 -08:00
Peng Li	a6d53b97a2	net: hns3: Adds GRO params to SKB for the stack When HW GRO is enabled, the protocol stack will not do GRO again, so the driver should add GRO params to the skb for the protocol stack. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 09:44:46 -08:00
Peng Li	81ae0e0491	net: hns3: Add skb chain when num of RX buf exceeds MAX_SKB_FRAGS MAX_SKB_FRAGS in the protocol stack is 17 when PAGE_SIZE is 4K. If HW GRO is enabled, the hardware may merge small packets, and the number of RX buffers may exceed MAX_SKB_FRAGS. So the driver adds an skb chain when the number of RX buffers exceeds MAX_SKB_FRAGS. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 09:44:46 -08:00
Peng Li	5c9f6b3935	net: hns3: Add support for ethtool -K to enable/disable HW GRO This patch adds support of ethtool -K to enable/disable hardware GRO in HNS3 PF/VF driver. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 09:44:46 -08:00
Peng Li	e559709505	net: hns3: Add handling of GRO Pkts not fully RX'ed in NAPI poll The "FE bit" in the descriptor marks the last descriptor of a packet. When HW GRO is enabled, HW writes data to the ring for every packet/buffer, so there is a greater probability that the driver handles a descriptor before HW has set the "FE bit". When the driver handles the packet and HW has still not set the "FE bit", the driver stores skb and bd_num in the rx ring and continues to use them in the next NAPI poll. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 09:44:46 -08:00
Peng Li	b26a6fea22	net: hns3: Enable HW GRO for Rev B(=0x21) HNS3 hardware HNS3 hardware Revision B(=0x21) supports Hardware GRO feature. This patch enables this feature in the HNS3 PF/VF driver. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 09:44:46 -08:00
Vasundhara Volam	8dc5ae2d48	bnxt_en: Fix filling time in bnxt_fill_coredump_record() Fix the year and month offset while storing it in bnxt_fill_coredump_record(). Fixes: `6c5657d085` ("bnxt_en: Add support for ethtool get dump.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 09:37:15 -08:00
Michael Chan	83eb5c5cff	bnxt_en: Add software "missed_irqs" counter. To keep track of the number of times the workaround code for 57500 A0 has been triggered. This is a per NQ counter. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 09:37:15 -08:00
Michael Chan	ffd7762170	bnxt_en: Workaround occasional TX timeout on 57500 A0. Hardware can sometimes not generate NQ MSIX with a single pending CP ring entry. This seems to always happen at the last entry of the CP ring before it wraps. Add logic to check all the CP rings for pending entries without the CP ring consumer index advancing. Calling HWRM_DBG_RING_INFO_GET to read the context of the CP ring will flush out the NQ entry and MSIX. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 09:37:15 -08:00
Michael Chan	addd4df6d7	bnxt_en: Disable RDMA support on the 57500 chips. There is no RDMA support on 57500 chips yet, so prevent bnxt_re from registering on these chips. There is intermittent failure if bnxt_re is allowed to register and proceed with RDMA operations. Fixes: `1ab968d2f1` ("bnxt_en: Add PCI ID for BCM57508 device.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 09:37:15 -08:00
Michael Chan	d19819297d	bnxt_en: Fix rx_l4_csum_errors counter on 57500 devices. The software counter structure is defined in both the CP ring's structure and the NQ ring's structure on the new devices. The legacy code adds the counter to the CP ring's structure and the counter won't get displayed since the ethtool code is looking at the NQ ring's structure. Since all other counters are contained in the NQ ring's structure, it makes more sense to count rx_l4_csum_errors in the NQ. Fixes: `50e3ab7836` ("bnxt_en: Allocate completion ring structures for 57500 series chips.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 09:37:15 -08:00
Michael Chan	6ba990384e	bnxt_en: Fix RSS context allocation. Recent commit has added the reservation of RSS context. This requires bnxt_hwrm_vnic_qcaps() to be called before allocating any RSS contexts. The bnxt_hwrm_vnic_qcaps() call sets up proper flags that will determine how many RSS contexts to allocate to support NTUPLE. This causes a regression that too many RSS contexts are being reserved and causing resource shortage when enabling many VFs. Fix it by calling bnxt_hwrm_vnic_qcaps() earlier. Fixes: `41e8d79837` ("bnxt_en: Modify the ring reservation functions for 57500 series chips.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 09:37:15 -08:00
Alan Brady	d5585b7b68	i40e: prevent overlapping tx_timeout recover If a TX hang occurs, we attempt to recover by incrementally resetting. If we're starved for CPU time, it's possible the reset doesn't actually complete (or even fire) before another tx_timeout fires causing us to fly through the different resets without actually doing them. This adds a bit to set and check if a timeout recovery is already pending and, if so, bail out of tx_timeout. The bit will get cleared at the end of i40e_rebuild when reset is complete. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-14 10:56:34 -08:00
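The "set and check a pending bit" guard described in d5585b7b68 can be sketched in userspace C as follows. This is only an illustration of the pattern using C11 atomics; the names `recovery_pending`, `on_tx_timeout()` and `on_rebuild_done()` are hypothetical and are not taken from the i40e driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state: set while a timeout recovery is still pending. */
static atomic_flag recovery_pending = ATOMIC_FLAG_INIT;

/* Demo-only counter of how many resets were actually scheduled. */
static int resets_scheduled;

/* Returns true if this call scheduled a recovery, false if it bailed out
 * because an earlier recovery has not completed yet. */
static bool on_tx_timeout(void)
{
    /* Atomically set the flag and fetch its previous value. */
    if (atomic_flag_test_and_set(&recovery_pending))
        return false;
    resets_scheduled++;
    return true;
}

/* Called once the rebuild that follows a reset has finished. */
static void on_rebuild_done(void)
{
    assert(resets_scheduled > 0); /* a rebuild only follows a scheduled reset */
    atomic_flag_clear(&recovery_pending);
}
```

In this sketch a second `on_tx_timeout()` call returns false until `on_rebuild_done()` clears the flag, so repeated timeouts cannot step through several reset levels before the first one has run.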
Mitch Williams	7cd8eb0861	i40e: suppress bogus error message The i40e driver complains about unprivileged VFs trying to configure promiscuous mode each time a VF reset occurs. This isn't the fault of the poor VF driver - the PF driver itself is making the request. To fix this, skip the privilege check if the request is to disable all promiscuous activity. This gets rid of the bogus message, but doesn't affect privilege checks, since we really only care if the unprivileged VF is trying to enable promiscuous mode. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-14 10:56:34 -08:00
Richard Rodriguez	211257a499	i40e: Use correct shift for VLAN priority When using port VLAN, for VFs, and setting priority bits, the device was sending out incorrect priority bits, and also setting the CFI bit incorrectly. To fix this, changed shift and mask bit definition for this function, to use the correct ones. Signed-off-by: Richard Rodriguez <richard.rodriguez@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-14 10:56:33 -08:00
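For context on 211257a499, the 802.1Q tag control field keeps the priority in bits 15..13, the CFI/DEI bit in bit 12, and the VLAN ID in bits 11..0. The sketch below packs and unpacks that layout; the macro and function names are hypothetical and are not the definitions the i40e driver uses.

```c
#include <assert.h>
#include <stdint.h>

/* 802.1Q TCI layout: priority in bits 15..13, CFI/DEI in bit 12,
 * VLAN ID in bits 11..0. Names are illustrative only. */
#define TCI_PRIO_SHIFT 13
#define TCI_PRIO_MASK  0xE000u
#define TCI_CFI_MASK   0x1000u
#define TCI_VID_MASK   0x0FFFu

/* Pack a VLAN ID and a priority into a TCI value (CFI left clear). */
static uint16_t tci_pack(uint16_t vid, uint8_t prio)
{
    assert(vid <= TCI_VID_MASK && prio <= 7);
    return (uint16_t)((vid & TCI_VID_MASK) |
                      ((uint16_t)prio << TCI_PRIO_SHIFT));
}

/* Extract the priority bits from a TCI value. */
static uint8_t tci_prio(uint16_t tci)
{
    return (uint8_t)((tci & TCI_PRIO_MASK) >> TCI_PRIO_SHIFT);
}
```

As a purely illustrative case (not necessarily the value the driver used before the fix), shifting priority 3 by 12 instead of 13 would give 0x3000, which sets the CFI bit and encodes priority 1.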
Jacob Keller	61bfb06005	i40e: always set ks->base.speed in i40e_get_settings_link_up In i40e_get_settings_link_up, set ks->base.speed to SPEED_UNKNOWN in the case where we don't know the link speed. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-14 10:56:33 -08:00
Mitch Williams	7c3758f783	i40e: don't restart nway if autoneg not supported On link types that do not support autoneg, we cannot attempt to restart nway negotiation. This results in a dead link that requires a power cycle to remedy. Fix this by saving off the autoneg state and checking this value before we try to restart nway. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-14 10:56:33 -08:00
Patryk Małek	5734fe8748	i40e: Allow disabling FW LLDP on X722 devices This patch allows disabling FW LLDP agent on X722 devices. It also changes a source of information for this feature from pf->hw_features to pf->hw.flags which are set in i40e_init_adminq. Signed-off-by: Patryk Małek <patryk.malek@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-14 10:56:33 -08:00
Alice Michael	c95cb7b25f	i40e: update driver version The version numbers have not been kept up to date and this is an effort to amend that. Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-14 10:56:33 -08:00
Jan Sokolowski	f5a7b21b24	i40e: Protect access to VF control methods A scenario has been found in which simultaneous addition/removal and modification of VF's might cause unstable behaviour, up to and including kernel panics. Protect the methods that create/modify/destroy VF's by locking them behind an atomically set bit in PF status bitfield. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-14 10:56:33 -08:00
Patryk Małek	4ff2d85403	i40e: Replace strncpy with strlcpy to ensure null termination Using strncpy allows destination buffer to be not null terminated after the copying takes place. strlcpy ensures that's not the case by explicitly setting last element in the buffer as '\0'. Signed-off-by: Patryk Małek <patryk.malek@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-14 10:56:33 -08:00
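The termination guarantee that 4ff2d85403 relies on can be sketched in portable C. Because strlcpy() is not available in every C library, the helper below is a hypothetical stand-in named `copy_terminated()` that mimics the documented behaviour: copy at most size - 1 bytes, always write a terminating '\0', and return the length of the source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy(): copies at most size - 1 bytes and
 * always terminates dst when size > 0. Returns strlen(src) so the caller
 * can detect truncation (return value >= size). */
static size_t copy_terminated(char *dst, const char *src, size_t size)
{
    assert(dst != NULL && src != NULL);

    size_t len = strlen(src);

    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0'; /* the step strncpy() skips when src is too long */
    }
    return len;
}
```

By contrast, strncpy() with a source at least as long as the buffer fills every byte and leaves no terminator, which is the case the commit guards against.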
Krzysztof Galazka	de10933e37	i40e: Add capability flag for stopping FW LLDP Add HW capability flag to indicate that firmware supports stopping LLDP agent. This feature has been added in FW API 1.7 for XL710 devices and 1.6 for X722. Also raise expected minor version number for X722 FW API to 6. Signed-off-by: Krzysztof Galazka <krzysztof.galazka@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-14 10:56:33 -08:00
Jan Sokolowski	8554768c2c	i40e: Use a local variable for readability Use a local variable to make the code a bit more readable. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-14 10:56:33 -08:00
Lance Roy	6a9a5ec10e	i40e: Replace spin_is_locked() with lockdep lockdep_assert_held() is better suited to checking locking requirements, since it won't get confused when someone else holds the lock. This is also a step towards possibly removing spin_is_locked(). Signed-off-by: Lance Roy <ldr709@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-14 10:56:33 -08:00
Jakub Kicinski	bd3b5d462a	nfp: abm: restructure Qdisc handling In preparation of handling more Qdisc types switch to a different offload strategy. We have now recreated the Qdisc hierarchy in the driver. Every time the hierarchy changes parse it, and update the configuration of the HW accordingly. While at it drop the support of pretending that we can instantiate a single queue on a multi-queue device in HW/FW. MQ is now required, and each queue will have its own instance of RED. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:28 -08:00
Jakub Kicinski	52db4eaca5	nfp: abm: save RED's parameters Use the new driver Qdisc structure to keep track of parameters of RED Qdiscs. This way as the Qdisc moves around in the hierarchy we will be able to configure the HW appropriately. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:28 -08:00
Jakub Kicinski	6c5dbda0d4	nfp: abm: reset RED's child based on limit RED qdisc will replace its child Qdisc with a new FIFO queue if it is reconfigured and the limit parameter is not 0. This means that when it's created with limit of 0 it will have no FIFO, and all packets will be dropped. If it's changed and limit is specified it will lose its existing child (implicit graft). Make sure we mark RED Qdisc child as NFP_QDISC_UNTRACKED if it's not the expected FIFO. nfp_abm_qdisc_replace() will return 1 if Qdisc already existed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:28 -08:00
Jakub Kicinski	6b8417b7e6	nfp: abm: build full Qdisc hierarchy based on graft notifications Using graft notifications recreate in the driver the full Qdisc hierarchy. Keep track of how many times each Qdisc is attached to the hierarchy to make sure we don't offload Qdiscs which are attached multiple times (device queues can't be shared). For graft events of Qdiscs we don't know to exist, mark the child as invalid/untracked. Note that MQ Qdisc doesn't send destruction events reliably when device is dismantled, so we need to manually clean out the children otherwise we'd think Qdiscs which are still in use are getting freed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:28 -08:00
Jakub Kicinski	aee7539c58	nfp: abm: allocate Qdisc child table To keep track of Qdisc hierarchy allocate a table for children for each Qdisc. RED Qdisc can only have one child. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:27 -08:00
Jakub Kicinski	1853125889	nfp: abm: remember which Qdisc is root Keep track of which Qdisc is currently root. We need to implement TC_SETUP_ROOT_QDISC handling, and for completeness also clear the root Qdisc pointer when it's freed. TC_SETUP_ROOT_QDISC isn't always sent when device is dismantled. Remembering the root Qdisc will allow us to build the entire hierarchy in following patches. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:27 -08:00
Jakub Kicinski	4f5681d088	nfp: abm: track all offload-enabled qdiscs Allocate an object corresponding to any offloaded qdisc we are informed about by the kernel. Not only the qdiscs we have a chance of offloading. The count of created objects will be used to decide whether the ethtool TC offload can be disabled, since otherwise we may miss destroy commands. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:27 -08:00
Jakub Kicinski	6666f545e9	nfp: abm: keep track of all RED thresholds Instead of writing the threshold out when Qdisc is configured and not remembering it move to a scheme where we remember all thresholds. When configuration changes parse the offloaded Qdiscs and set thresholds appropriately. This will help future extensions. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:27 -08:00
Jakub Kicinski	08990494e5	nfp: abm: rename qdiscs -> red_qdiscs Rename qdiscs member to red_qdiscs. One of following patches will use the name qdiscs for tracking all qdisc types. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:27 -08:00
Dmitry Bogdanov	7975d2aff5	net: aquantia: add support of rx-vlan-filter offload Since it uses the same NIC table as the rx flow vlan filter, the rx-flow vlan filter accepts only vlans that are present on the interface when rx-vlan-filter is on. Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:48:37 -08:00
Dmitry Bogdanov	9a8cac4b4d	net: aquantia: add ethertype and PCP to rx flow filters L2 EtherType filters allow filtering packets by the EtherType field, or by both the EtherType and User Priority (PCP) fields of 802.1Q. UserPriority (vlan) parameter must be accompanied by mask 0x1FFF. That is to distinguish VLAN filter from L2 Ethertype filter with UserPriority since both User Priority and VLAN ID are passed in the same 'vlan' parameter. Example: To add a filter that directs IP4 packets of priority 3 to queue 3: ethtool -N <ethX> flow-type ether proto 0x800 vlan 0x600 m 0x1FFF \ action 3 loc 16 Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:48:37 -08:00
Dmitry Bogdanov	54bcb3d162	net: aquantia: add vlan id to rx flow filters The VLAN filter (VLAN id) is compared against 16 filters. VLAN id must be accompanied by mask 0xF000. That is to distinguish VLAN filter from L2 Ethertype filter with UserPriority since both User Priority and VLAN ID are passed in the same 'vlan' parameter. Flow type may be any as it is not matched for VLAN filter. Due to fixed order of the rules in the NIC, the location 0-15 are reserved for vlan filters. Example: To add a rule that directs packets from VLAN 2001 to queue 5: ethtool -N <ethX> flow-type ip4 vlan 2001 m 0xF000 action 5 loc 0 Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:48:37 -08:00
Dmitry Bogdanov	a6ed6f2269	net: aquantia: add support of L3/L4 ntuple filters Add support of L3/L4 5-tuple {protocol, src-ip, dst-ip, src-port, dst-port} filters. Mask is not supported. Src-port and dst-port are only compared for TCP/UDP/SCTP packets. Both IPv4 and IPv6 are supported. The supported actions are the drop and the queue assignment. Due to fixed order of the rules in the NIC, the location 32-39 are reserved for L3/L4 5-tuple filters. The locations 32 and 36 are reserved for IPv6 filters. Examples: sudo ethtool -N eth0 flow-type ip6 src-ip 2001:db8:0:f101::2 \ dst-ip 2001:db8:0:f101::5 action -1 loc 36 sudo ethtool -N eth0 flow-type udp4 src-ip 10.0.0.4 \ dst-ip 10.0.0.7 src-port 2000 dst-port 2001 action 2 loc 32 Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:48:37 -08:00
Dmitry Bogdanov	8d0bcb012f	net: aquantia: add infrastructure for ntuple rules Add infrastructure to support ntuple filter configuration. Add rule, remove rule, reapply on interface up. Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:48:37 -08:00
Dmitry Bogdanov	23e7a718a4	net: aquantia: add rx-flow filter definitions Add missing register definitions and the functions accessing them related to rx-flow filters. Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:48:37 -08:00
Ivan Khoronzhuk	1ebb2446c3	net: ethernet: ti: cpsw: allow vlan tagged packets to be timestamped Allow vlan tagged packets to be timestamped, as there are no restrictions on this. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-13 16:29:59 -08:00
Ivan Khoronzhuk	a942312034	net: ethernet: ti: cpts: move enable/disable flags outside of cpts module Each slave has its own receive timestamp filter. But cpts rx/tx timestamp enable flags are used to allow ts retrieval only for one user. This limitation causes data path redundancy and setting overlap if cpsw module is in dual-mac mode for instance. If rx ts is enabled only for one port - the second interface must expect every incoming packet to be PTP packet w/o absolutely any reason, and if it's PTP - do unneeded stuff, as rx filter for second port is not set and cpts fifo is not supposed to contain appropriate ts event. That's not correct. So, to fix control overlap and avoid redundant CPU cycles, the patch splits rx/tx ts enable flags between network devices. After the patch, PTP timestamping still should be used for only one port (or PTP id counter has to be different for both ports as cpts IP is common). Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-13 16:29:59 -08:00
Ivan Khoronzhuk	f19dcd5f11	net: ethernet: ti: cpts: purge staled skbs from txq The overflow event runs every 1 jiffy if the txq is not empty, but the txq can be emptied completely only if the next tx event consumes the skb or deletes a stale skb from the txq. In the case of a stale skb, which can happen for some unpredictable reason (the ts event was lost or timed out), the overflow event can keep firing for quite a long time, consuming CPU for no reason, before the next tx event happens. To avoid this, purge the txq before increasing the overflow event rate. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-13 16:29:59 -08:00
Ivan Khoronzhuk	d0e14c4d9b	net: ethernet: ti: cpts: correct debug for expired txq skb The msgtype and seqid belong to the event used for comparison, not to the stale txq skb. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-13 16:29:59 -08:00
Md Fahad Iqbal Polash	ef878d6086	ice: Remove ICE_MAX_TXQ_PER_TXQG check when configuring Tx queue This patch removes the condition checking of VSI TX queue number to ICE_MAX_TXQ_PER_TXQG. This is an unnecessary check and causes a driver load error on hosts that have more than 128 cores. Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Henry Tieman	47e3e53cea	ice: Destroy scheduler tree in reset path The scheduler tree is always rebuilt during reset. The existing code adds new scheduler nodes for queues but may not clean up earlier nodes. This patch removes the old scheduler tree during reset before it is rebuilt. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Usha Ketineni	c5a2a4a388	ice: Fix to make VLAN priority tagged traffic to appear on all TCs This patch includes below changes to resolve the issue of ETS bandwidth shaping to work. 1. Allocation of Tx queues is accounted for based on the enabled TC's in ice_vsi_setup_q_map() and enabled the Tx queues on those TC's via ice_vsi_cfg_txqs() 2. Get the mapped netdev TC # for the user priority and set the priority to TC mapping for the VSI. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Brett Creeley	99fc1057b4	ice: Call pci_disable_sriov before stopping queues for VF Previous to this commit the driver was immediately stopping Tx/Rx queues when doing the following "echo 0 > sriov_numvfs" and then it was calling pci_disable_sriov if the VFs are not assigned. This was causing the VIRTCHNL_OP_DISABLE_QUEUES to fail because it was trying to stop the queues for a second time. Fix this by calling pci_disable_sriov before stopping the Tx/Rx queues. This allows the VIRTCHNL_OP_DISABLE_QUEUES to get processed before the driver tries to stop the Rx/Tx queues in ice_free_vfs. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Piotr Raczynski	7b8ff0f9cc	ice: Increase Rx queue disable timeout With much traffic coming into the port, Rx queue disable procedure can take more time until all pending queue requests on PCIe finish. Reuse ICE_Q_WAIT_MAX_RETRY macro and increase the delay itself. Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Lev Faerman	6263e811f4	ice: Fix NVM mask defines Fixes bad masks that would break compilation when evaluated. Signed-off-by: Lev Faerman <lev.faerman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Dave Ertman	d09e2693b6	ice: Avoid nested RTNL locking in ice_dis_vsi ice_dis_vsi() performs an rtnl_lock() if it detects a netdev that is running on the VSI. In cases where the RTNL lock has already been acquired, a deadlock results. Add a boolean to pass to ice_dis_vsi to tell it if the RTNL lock is already held. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Anirudh Venkataramanan	995c90f2de	ice: Calculate guaranteed VSIs per function and use it Currently we are setting the guar_num_vsi to equal to ICE_MAX_VSI which is the device limit of 768. This is incorrect and could have unintended consequences. To fix this use the valid_function's 8-bit bitmap returned from discovering device capabilities to determine the guar_num_vsi per function. guar_num_vsi value is then passed on to pf->num_alloc_vsi. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Anirudh Venkataramanan	10e03a22de	ice: Remove node before releasing VSI Before releasing the VSI, remove the VSI scheduler node. If not, the node is left in the scheduler tree and, on subsequent load, the scheduler tree contains the node so it does not set it in vsi_ctx. This, later, causes the node to not be found in ice_sched_get_free_qparent which leads to a "Failed to set LAN Tx queue context, error: -1". To remove the scheduler node, this patch introduces ice_rm_vsi_lan_cfg and related helpers. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:26 -08:00
Tony Nguyen	b354e98f49	ice: Check for q_vector when stopping rings There is a gap in time between a VF reset, which sets the q_vector to NULL, and the VF requesting mapping of the q_vectors. If ice_vsi_stop_tx_rings() is called during this time, a NULL pointer dereference is encountered. Add a check in ice_vsi_stop_tx_rings() to ensure the q_vector is set to avoid this situation from occurring. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:25 -08:00
Brett Creeley	807bc98d31	ice: Fix debug print in ice_tx_timeout Currently the debug print in ice_tx_timeout is printing useless and duplicate values. First, head is being assigned to tx_ring->next_to_clean and we are printing both of those values, but naming them HWB and NTC respectively. Also, reading tail always returns 0 so remove that as well. Instead of assigning the SW head (NTC) read to head, use the actual head register and change the debug print to note that this is HW_HEAD. Also reduce the scope of a couple variables. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-13 09:09:25 -08:00
Denis Bolotin	ed4eac20dc	qed: Fix reading wrong value in loop condition The value of "sb_index" is written by the hardware. Reading its value and writing it to "index" must finish before checking the loop condition. Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-13 08:51:16 -08:00
Michal Kalderon	291d57f67d	qed: Fix rdma_info structure allocation Certain flows need to access the rdma-info structure, for example dcbx update flows. In some cases there can be a race between the allocation or deallocation of the structure which was done in roce start / roce stop and an asynchronous dcbx event that tries to access the structure. For this reason, we move the allocation of the rdma_info structure to be similar to the iscsi/fcoe info structures which are allocated during device setup. We add a new field of "active" to the struct to define whether roce has already been started or not, and this is checked instead of whether the pointer to the info structure. Fixes: `51ff17251c` ("qed: Add support for RoCE hw init") Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-13 08:51:16 -08:00
Denis Bolotin	e90202ed1c	qed: Fix overriding offload_tc by protocols without APP TLV The TC received from APP TLV is stored in offload_tc, and should not be set by protocols which did not receive an APP TLV. Fixed the condition when overriding the offload_tc. Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-13 08:51:16 -08:00
Denis Bolotin	9aaa4e8ba1	qed: Fix PTT leak in qed_drain() Release PTT before entering error flow. Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-13 08:51:16 -08:00
Sudarsana Reddy Kalluru	77e461d14e	bnx2x: Assign unique DMAE channel number for FW DMAE transactions. Driver assigns DMAE channel 0 for FW as part of START_RAMROD command. FW uses this channel for DMAE operations (e.g., TIME_SYNC implementation). Driver also uses the same channel 0 for DMAE operations for some of the PFs (e.g., PF0 on Port0). This could lead to concurrent access to the DMAE channel by FW and driver which is not legal. Hence need to assign unique DMAE id for FW. Currently following DMAE channels are used by the clients, MFW - OCBB/OCSD functionality uses DMAE channel 14/15 Driver 0-3 and 8-11 (for PF dmae operations) 4 and 12 (for stats requests) Assigning unique dmae_id '13' to the FW. Changes from previous version: ------------------------------ v2: Incorporated the review comments. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-12 08:54:12 -08:00
David S. Miller	2b9b7502df	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-11-11 17:57:54 -08:00
Heiner Kallweit	9206eb0bc5	PCI: add USR vendor id and use it in r8169 and w6692 driver The PCI vendor id of U.S. Robotics isn't defined in pci_ids.h so far, only ISDN driver w6692 has a private definition. Move the definition to pci_ids.h and use it in the r8169 driver too. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 14:00:05 -08:00
Andrew Lunn	3c1bcc8614	net: ethernet: Convert phydev advertize and supported from u32 to link mode There are a few MAC/PHYs combinations which now support > 1Gbps. These may need to make use of link modes with bits > 31. Thus their supported PHY features or advertised features cannot be implemented using the current bitmap in a u32. Convert to using a linkmode bitmap, which can support all the currently devices link modes, and is future proof as more modes are added. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 10:10:01 -08:00
John Hurley	d4b69bad61	nfp: flower: remove unnecessary code in flow lookup Recent changes to NFP mean that stats updates from fw to driver no longer require a flow lookup and (because egdev offload has been removed) the ingress netdev for a lookup is now always known. Remove obsolete code in a flow lookup that matches on host context and that allows for a netdev to be NULL. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 09:54:53 -08:00
John Hurley	4f63fde3fc	nfp: flower: remove TC egdev offloads Previously, only tunnel decap rules required egdev registration for offload in NFP. These are now supported via indirect TC block callbacks. Remove the egdev code from NFP. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 09:54:53 -08:00
John Hurley	3166dd07a9	nfp: flower: offload tunnel decap rules via indirect TC blocks Previously, TC block tunnel decap rules were only offloaded when a callback was triggered through registration of the rules egress device. This meant that the driver had no access to the ingress netdev and so could not verify it was the same tunnel type that the rule implied. Register tunnel devices for indirect TC block offloads in NFP, giving access to new rules based on the ingress device rather than egress. Use this to verify the netdev type of VXLAN and Geneve based rules and offload the rules to HW if applicable. Tunnel registration is done via a netdev notifier. On notifier registration, this is triggered for already existing netdevs. This means that NFP can register for offloads from devices that exist before it is loaded (filter rules will be replayed from the TC core). Similarly, on notifier unregister, a call is triggered for each currently active netdev. This allows the driver to unregister any indirect block callbacks that may still be active. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 09:54:53 -08:00
John Hurley	65b7970edf	nfp: flower: increase scope of netdev checking functions Both the actions and tunnel_conf files contain local functions that check the type of an input netdev. In preparation for re-use with tunnel offload via indirect blocks, move these to static inline functions in a header file. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 09:54:53 -08:00
John Hurley	7885b4fc8d	nfp: flower: allow non repr netdev offload Previously the offload functions in NFP assumed that the ingress (or egress) netdev passed to them was an nfp repr. Modify the driver to permit the passing of non repr netdevs as the ingress device for an offload rule candidate. This may include devices such as tunnels. The driver should then base its offload decision on a combination of ingress device and egress port for a rule. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 09:54:53 -08:00
Quentin Monnet	16a8cb5cff	bpf: do not pass netdev to translate() and prepare() offload callbacks The kernel functions to prepare verifier and translate for offloaded program retrieve "offload" from "prog", and "netdev" from "offload". Then both "prog" and "netdev" are passed to the callbacks. Simplify this by letting the drivers retrieve the net device themselves from the offload object attached to prog - if they need it at all. There is currently no need to pass the netdev as an argument to those functions. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:54 -08:00
Quentin Monnet	a40a26322a	bpf: pass prog instead of env to bpf_prog_offload_verifier_prep() Function bpf_prog_offload_verifier_prep(), called from the kernel BPF verifier to run a driver-specific callback for preparing for the verification step for offloaded programs, takes a pointer to a struct bpf_verifier_env object. However, no driver callback needs the whole structure at this time: the two drivers supporting this, nfp and netdevsim, only need a pointer to the struct bpf_prog instance held by env. Update the callback accordingly, on kernel side and in these two drivers. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:54 -08:00
Quentin Monnet	eb9119471e	bpf: pass destroy() as a callback and remove its ndo_bpf subcommand As part of the transition from ndo_bpf() to callbacks attached to struct bpf_offload_dev for some of the eBPF offload operations, move the functions related to program destruction to the struct and remove the subcommand that was used to call them through the NDO. Remove function __bpf_offload_ndo(), which is no longer used. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:54 -08:00
Quentin Monnet	b07ade27e9	bpf: pass translate() as a callback and remove its ndo_bpf subcommand As part of the transition from ndo_bpf() to callbacks attached to struct bpf_offload_dev for some of the eBPF offload operations, move the functions related to code translation to the struct and remove the subcommand that was used to call them through the NDO. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:54 -08:00
Quentin Monnet	00db12c3d1	bpf: call verifier_prep from its callback in struct bpf_offload_dev In a way similar to the change previously brought to the verify_insn hook and to the finalize callback, switch to the newly added ops in struct bpf_prog_offload for calling the functions used to prepare driver verifiers. Since the dev_ops pointer in struct bpf_prog_offload is no longer used by any callback, we can now remove it from struct bpf_prog_offload. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:54 -08:00
Quentin Monnet	1385d755cf	bpf: pass a struct with offload callbacks to bpf_offload_dev_create() For passing device functions for offloaded eBPF programs, there used to be no place where to store the pointer without making the non-offloaded programs pay a memory price. As a consequence, three functions were called with ndo_bpf() through specific commands. Now that we have struct bpf_offload_dev, and since none of those operations rely on RTNL, we can turn these three commands into hooks inside the struct bpf_prog_offload_ops, and pass them as part of bpf_offload_dev_create(). This commit effectively passes a pointer to the struct to bpf_offload_dev_create(). We temporarily have two struct bpf_prog_offload_ops instances, one under offdev->ops and one under offload->dev_ops. The next patches will make the transition towards the former, so that offload->dev_ops can be removed, and callbacks relying on ndo_bpf() added to offdev->ops as well. While at it, rename "nfp_bpf_analyzer_ops" as "nfp_bpf_dev_ops" (and similarly for netdevsim). Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:53 -08:00
Quentin Monnet	1da6f57338	nfp: bpf: move nfp_bpf_analyzer_ops from verifier.c to offload.c We are about to add several new callbacks to the struct, all of them defined in offload.c. Move the struct bpf_prog_offload_ops object in that file. As a consequence, nfp_verify_insn() and nfp_finalize() can no longer be static. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-10 15:39:53 -08:00
Alexandre Belloni	fbd1d52453	net: mvneta: correct typo The reserved variable should be named reserved1. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 20:10:13 -08:00
Ioana Ciornei	5500598abb	dpaa2-ptp: defer probe when portal allocation failed The fsl_mc_portal_allocate can fail when the requested MC portals are not yet probed by the fsl_mc_allocator. In this situation, the driver should defer the probe. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 20:08:58 -08:00
Ioana Ciornei	d7f5a9d89a	dpaa2-eth: defer probe on object allocate The fsl_mc_object_allocate function can fail because not all allocatable objects are probed by the fsl_mc_allocator at the call time. Defer the dpaa2-eth probe when this happens. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 20:08:58 -08:00
Jakub Kicinski	560f1ba4d8	nfp: use the new __netdev_tx_sent_queue() BQL optimisation __netdev_tx_sent_queue() was added in commit e59020abf0f ("net: bql: add __netdev_tx_sent_queue()") and allows for better GSO performance. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 19:49:00 -08:00
Subash Abhinov Kasiviswanathan	d02854dc19	net: qualcomm: rmnet: Fix incorrect assignment of real_dev A null dereference was observed when a sysctl was being set from userspace and rmnet was stuck trying to complete some actions in the NETDEV_REGISTER callback. This is because the real_dev is set only after the device registration handler completes. sysctl call stack - <6> Unable to handle kernel NULL pointer dereference at virtual address 00000108 <2> pc : rmnet_vnd_get_iflink+0x1c/0x28 <2> lr : dev_get_iflink+0x2c/0x40 <2> rmnet_vnd_get_iflink+0x1c/0x28 <2> inet6_fill_ifinfo+0x15c/0x234 <2> inet6_ifinfo_notify+0x68/0xd4 <2> ndisc_ifinfo_sysctl_change+0x1b8/0x234 <2> proc_sys_call_handler+0xac/0x100 <2> proc_sys_write+0x3c/0x4c <2> __vfs_write+0x54/0x14c <2> vfs_write+0xcc/0x188 <2> SyS_write+0x60/0xc0 <2> el0_svc_naked+0x34/0x38 device register call stack - <2> notifier_call_chain+0x84/0xbc <2> raw_notifier_call_chain+0x38/0x48 <2> call_netdevice_notifiers_info+0x40/0x70 <2> call_netdevice_notifiers+0x38/0x60 <2> register_netdevice+0x29c/0x3d8 <2> rmnet_vnd_newlink+0x68/0xe8 <2> rmnet_newlink+0xa0/0x160 <2> rtnl_newlink+0x57c/0x6c8 <2> rtnetlink_rcv_msg+0x1dc/0x328 <2> netlink_rcv_skb+0xac/0x118 <2> rtnetlink_rcv+0x24/0x30 <2> netlink_unicast+0x158/0x1f0 <2> netlink_sendmsg+0x32c/0x338 <2> sock_sendmsg+0x44/0x60 <2> SyS_sendto+0x150/0x1ac <2> el0_svc_naked+0x34/0x38 Fixes: `b752eff5be` ("net: qualcomm: rmnet: Implement ndo_get_iflink") Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 19:45:48 -08:00
Miroslav Lichvar	6fe42e228d	tg3: extend PTP gettime function to read system clock This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 19:43:51 -08:00
Miroslav Lichvar	018ed23ddc	ixgbe: extend PTP gettime function to read system clock This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 19:43:51 -08:00
Miroslav Lichvar	cff8ba28db	igb: extend PTP gettime function to read system clock This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 19:43:51 -08:00
Miroslav Lichvar	98942d7053	e1000e: extend PTP gettime function to read system clock This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 19:43:51 -08:00
Huazhong Tan	6ff3cf0783	net: hns3: add PCIe FLR support for VF This patch implements the .reset_prepare and .reset_done ops from pci framework to support the VF FLR. This patch uses hclgevf_set_def_reset_request() and hclgevf_reset_event() to handle FLR, so when hdev->default_reset_request is non zero, it means there is some reset requested by hclgevf_set_def_reset_request() that needs to be processed. Also get the hdev from the ae_dev because hclgevf_reset_event is called with handle being NULL. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 16:47:35 -08:00
Huazhong Tan	862d969a3a	net: hns3: do VF's pci re-initialization while PF doing FLR While doing PF FLR, VF's PCIe configuration space will be cleared, so the pci and vector of VF should be re-initialized in the VF's reset process while PF doing FLR. Also, this patch fixes some memory not freed problem when pci re-initialization is done during reset process. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 16:47:35 -08:00
Huazhong Tan	6b9a97ee43	net: hns3: add PCIe FLR support for PF This patch implements the .reset_prepare and .reset_done ops from pci framework to support the PF FLR. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 16:47:35 -08:00
Huazhong Tan	6dd22bbc77	net: hns3: implement the IMP reset processing for PF The current code only print the prompt message after receiving the IMP reset interrupt and does not perform the corresponding driver reset operation. This patch implements the missing IMP reset handling in the driver. 1. The driver sets the HCLGE_STATE_CMD_DISABLE to stop sending command after receiving the IMP reset interrupt. 2. The driver needs to notify the hardware to reload the IMP firmware. 3. The IMP firmware reloading makes the reset time of hardware longer, so it is necessary to extend the driver's waiting time to wait for the hardware reset to complete. 4. In hclge_check_event_cause, IMP reset event should have higher priority than other events. Also, after clearing HCLGE_STATE_CMD_DISABLE in the hclge_cmd_init(), it needs to check whether there is a pending reset, if so, just set the HCLGE_STATE_CMD_DISABLE back and return. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 16:47:35 -08:00
Huazhong Tan	ff0699e04b	net: hns3: stop napi polling when HNS3_NIC_STATE_DOWN is set When calling napi_disable during reset down process, if NAPIF_STATE_MISSED is set, napi_complete will call __napi_schedule to do the polling again. So this patch uses HNS3_NIC_STATE_DOWN to ensure the polling is not scheduled again. Also, when napi_complete returns true, it means polling is scheduled again, it is not necessary to enable the interrupt. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 16:47:35 -08:00
Huazhong Tan	6a5f6fa382	net: hns3: add error handler for hclgevf_reset() Since hclgevf_reset() may fail for some reasons, so it needs an error handler to deal with it. When VF reset failed, VF can only be restored by a higher level reset asserted by PF. So, it needs to reinitialize its command queue, then it can respond to higher level reset. Also, this patch adds error logging in the hclgevf_notify_client(). Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 16:47:35 -08:00
Huazhong Tan	ef5f8e507e	net: hns3: stop handling command queue while resetting VF According to hardware's description, after the reset occurs, the driver needs to re-initialize the command queue before sending and receiving any commands. Therefore, the VF's driver needs to identify the command queue needs to re-initialize with HCLGEVF_STATE_CMD_DISABLE, and does not allow sending or receiving commands before the re-initialization. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 16:47:35 -08:00
Huazhong Tan	b90fcc5bd9	net: hns3: add reset handling for VF when doing Core/Global/IMP reset When a Core/Global/IMP reset occurs, the hardware sets the reset status register of all PF/VF and reports a reset interrupt to all PF/VF and firmware. When receiving the reset interrupt: 1. The firmware will wait for 100 ms before resetting the hardware and clear the reset status register of all PF when hardware reset is done. 2. The PF/VF driver needs to down the netdev within 100 ms and then wait for hardware reset to finish. 3. After firmware clearing the reset status register of all PF, the PF driver reinitializes the hardware and clear the reset status register of it's VF. 4. After PF driver clearing the reset status register of VF, the VF driver reinitializes the hardware. This patch mainly add handling for the step 4. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 16:47:34 -08:00
Huazhong Tan	aa5c4f175b	net: hns3: add reset handling for VF when doing PF reset When PF performs a function reset, the hardware will reset both PF and all the VF belong to this PF. Hence, both PF's driver and VF's driver need to perform corresponding reset operations. Before PF driver asserting function reset to hardware, it firstly set up VF's hardware reset status, and inform the VF driver with HNAE3_VF_PF_FUNC_RESET, then VF driver sets this reset type to reset_pending and schedule reset task to stop IO and waits for the hardware reset status to clear. When PF driver has reinitialized the hardware and is ready to process mailbox from VF, PF driver clears VF's hardware reset status for VF to continue its reset process. Also, this patch uses readl_poll_timeout to simplify the hardware reset status waiting. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 16:47:34 -08:00
Huazhong Tan	dea846e85a	net: hns3: adjust VF's reset process Currently when VF need to reset itself, it will send a cmd to PF, after receiving the VF reset request, PF sends a cmd to inform VF to enter the reset process and send a cmd to firmware to do the actual reset for the VF, it is possible that firmware has reset the VF, but VF has not entered the reset process, which may cause IO not stopped problem when firmware is resetting VF. This patch fixes it by adjusting the VF reset process, when VF need to reset itself, it will enter the reset process first, and it will tell the PF to send cmd to firmware to reset itself. Add member reset_pending to struct hclgevf_dev, which indicates that there is reset event need to be processed by the VF's reset task, and the VF's reset task chooses the highest-level one and clears other low-level one when it processes reset_pending. hclge_inform_reset_assert_to_vf function is unused now, but it will be used to support the PF reset with VF working, so declare it in the header file. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 16:47:34 -08:00
Huazhong Tan	9c6f708577	net: hns3: add reset_hdev to reinit the hdev in VF's reset process When doing reset, the reset handling function only need to reinitialize hardware, it makes sense to add a function to do that job. Also the error handling of hclgevf_init_hdev is different when it is used in reset process. This patch adds reset_hdev to reinitialize hardware when resetting. Also, this patch removes the hclgevf_dev_ongoing_full_reset because it is unused now. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 16:47:34 -08:00
Dmitry Bogdanov	bbb67a44ba	net: aquantia: allow rx checksum offload configuration RX Checksum offloads could not be configured and ignored netdev features flag for checksumming. Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 15:38:10 -08:00
Dmitry Bogdanov	ad703c2b91	net: aquantia: invalid checksumm offload implementation Packets with marked invalid IP/UDP/TCP checksums were considered as good by the driver. The error was in a logic, processing offload bits in RX descriptor. Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 15:38:10 -08:00
Igor Russkikh	bfaa9f8553	net: aquantia: fixed enable unicast on 32 macvlan Fixed a condition mistake due to which macvlans unicast item number 32 was not added in the unicast filter. The consequence is that when exactly 32 macvlans are created on NIC, the last created macvlan receives no traffic because its MAC was not registered in HW. Fixes: `94b3b54230` ("net: aquantia: vlan unicast address list correct handling") Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Tested-by: Nikita Danilov <nikita.danilov@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 15:38:10 -08:00
Dmitry Bogdanov	7a1bb49461	net: aquantia: fix potential IOMMU fault after driver unbind IOMMU fault may occur on unbind/bind or if_down/if_up sequence. Although driver disables the rings on down, this is not enough. Due to internal HW design, during subsequent initialization NIC sometimes may reuse RX descriptors cache and write to the host memory from the descriptor cache. That gets caught by IOMMU on host. This patch invalidates the descriptor cache in NIC on interface down to prevent writing to the cached descriptors and to the memory pointed in those descriptors. Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 15:38:10 -08:00
Igor Russkikh	35e8e8b45d	net: aquantia: synchronized flow control between mac/phy Flow control statuses were not synchronized between blocks, that caused packets/link drop on some corner cases, when MAC sent PFC although Phy was not expecting these to come. Driver should readout the negotiated FC from phy and configure RX block accordingly. This is done on each link change event with information from FW. Fixes: `288551de45` ("net: aquantia: Implement rx/tx flow control ethtools callback") Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 15:38:10 -08:00
Arjun Vynipadath	40c4b1e9b6	cxgb4vf: free mac_hlist properly The locally maintained list for tracking hash mac table was not freed during driver remove. Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 15:18:48 -08:00
Arjun Vynipadath	24357e06ba	cxgb4vf: fix memleak in mac_hlist initialization mac_hlist was initialized during adapter_up, which will be called every time a vf device is first brought up, or every time when device is brought up again after bringing all devices down. This means our state of previous list is lost, causing a memleak if entries are present in the list. To fix that, move list init to the condition that performs initial one time adapter setup. Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 15:18:22 -08:00
Arjun Vynipadath	2a8d84bf51	cxgb4: free mac_hlist properly The locally maintained list for tracking hash mac table was not freed during driver remove. Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-09 15:18:00 -08:00
Jiong Wang	cf599f5031	nfp: bpf: relax prog rejection through max_pkt_offset NFP is refusing to offload programs whenever the MTU is set to a value larger than the max packet bytes that fits in NFP Cluster Target Memory (CTM). However, an eBPF program doesn't always need to access the whole packet data. The verifier has always calculated the maximum direct packet access (DPA) offset and kept it in max_pkt_offset inside the prog auxiliary information. This patch relaxes prog rejection based on max_pkt_offset. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-11-09 09:16:32 +01:00
Jakub Kicinski	6e5a716f42	nfp: abm: refuse RED offload with harddrop set RED Qdisc will now inform the drivers about the state of the harddrop flag. Refuse to offload in case harddrop is set. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:48:01 -08:00
Jakub Kicinski	cae5f48e32	nfp: abm: don't set negative threshold Turns out the threshold value is used in signed compares in the FW, so we should avoid setting the top bit. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:48:01 -08:00
Jakub Kicinski	032748acf6	nfp: abm: provide more precise info about offload parameter validation Improve log messages printed when RED can't be offloaded because of Qdisc parameters. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:48:01 -08:00
Jakub Kicinski	83ec8857a0	nfp: parse vNIC TLV capabilities at alloc time In certain cases initialization logic which follows allocation of the vNIC structure may want to validate the capabilities of that vNIC. This is easy before vNIC is initialized for normal capabilities which are at fixed offsets in control memory, easy to locate and read, but poses a challenge if the capabilities are in form of TLVs. Parse the TLVs early on so other code can just access parsed info, instead of having to do the parsing by itself. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:48:00 -08:00
Jakub Kicinski	e38f5d11b9	nfp: pass ctrl_bar pointer to nfp_net_alloc Move setting ctrl_bar pointer to the nfp_net_alloc function, to make sure we can parse capabilities early in the following patch. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:48:00 -08:00
Jakub Kicinski	47330f9bdf	nfp: abm: split qdisc offload code into a separate file The Qdisc offload code is logically separate, and we will soon do significant surgery on it to support more Qdiscs, so move it to a separate file. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:48:00 -08:00
Michał Mirosław	3149a2711b	sky2: use __vlan_hwaccel helpers Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:45:04 -08:00
Michał Mirosław	4b17f9fe48	mlx4: use __vlan_hwaccel helpers Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:45:04 -08:00
Michał Mirosław	c4062f89c5	benet: use __vlan_hwaccel helpers Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:45:04 -08:00
Ivan Khoronzhuk	00fe471205	net: ethernet: ti: cpsw: fix vlan configuration while down/up The vlan configuration is not restored after an interface down/up sequence (if dual-emac - both interfaces). Tested on am572x EVM. Steps to check: ~# ip link add link eth1 name eth1.100 type vlan id 100 ~# ifconfig eth0 down ~# ifconfig eth1 down Try to remove vid and observe warning: ~# ip link del eth1.100 [ 739.526757] net eth1: removing vlanid 100 from vlan filter [ 739.533322] failed to kill vid 0081/100 for device eth1 This patch fixes it, restoring only vlan ALE entries and all other unicast/multicast entries are restored by system calling rx_mode ndo. Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:30:58 -08:00
Ivan Khoronzhuk	15180eca56	net: ethernet: ti: cpsw: fix vlan mcast At this moment, mcast addresses are added for real device only (reserved vlans for dual-emac mode), even if a mcast address was added for some vlan only, thus ALE doesn't have corresponding vlan mcast entries after vlan socket joined multicast group. So ALE drops vlan frames with mcast addresses intended for vlans and potentially can receive mcast frames for base ndev. That's not correct. So, fix it by creating only vlan/mcast entries as requested. Patch doesn't use any additional lists and is based on device mc address list and cpsw ALE table entries. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:30:58 -08:00
Edward Cree	29e1220717	sfc: use the new __netdev_tx_sent_queue BQL optimisation As added in `3e59020abf` ("net: bql: add __netdev_tx_sent_queue()"), which see for performance rationale. Signed-off-by: Edward Cree <ecree@solarflare.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:01:29 -08:00
Michał Mirosław	f4f9a5e6cc	gianfar: remove use of VLAN_TAG_PRESENT Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:49:32 -08:00
Michał Mirosław	f723a1a293	cnic: remove use of VLAN_TAG_PRESENT This just removes VLAN_TAG_PRESENT use. VLAN TCI=0 special meaning is deeply embedded in the driver code and so is left as is. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:49:31 -08:00
Thor Thayer	8137b6ef0c	net: stmmac: Fix RX packet size > 8191 Ping problems with packets > 8191 as shown: PING 192.168.1.99 (192.168.1.99) 8150(8178) bytes of data. 8158 bytes from 192.168.1.99: icmp_seq=1 ttl=64 time=0.669 ms wrong data byte 8144 should be 0xd0 but was 0x0 16 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f %< ---------------snip-------------------------------------- 8112 b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf 8144 0 0 0 0 d0 d1 ^^^^^^^ Notice the 4 bytes of 0 before the expected byte of d0. Databook notes that the RX buffer must be a multiple of 4/8/16 bytes [1]. Update the DMA Buffer size define to 8188 instead of 8192. Remove the -1 from the RX buffer size allocations and use the new DMA Buffer size directly. [1] Synopsys DesignWare Cores Ethernet MAC Universal v3.70a [section 8.4.2 - Table 8-24] Tested on SoCFPGA Stratix10 with ping sweep from 100 to 8300 byte packets. Fixes: `286a837217` ("stmmac: add CHAINED descriptor mode support (V4)") Suggested-by: Jose Abreu <jose.abreu@synopsys.com> Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:47:44 -08:00
Ilias Apalodimas	0d404a6128	net: socionext: refactor netsec_alloc_dring() return -ENOMEM directly instead of assigning it in a variable Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:42:41 -08:00
Ilias Apalodimas	4acb20b462	net: socionext: different approach on DMA Current driver dynamically allocates an skb and maps it as DMA Rx buffer. In order to prepare for upcoming XDP changes, let's introduce a different allocation scheme. Buffers are allocated dynamically and mapped into hardware. During the Rx operation the driver uses build_skb() to produce the necessary buffers for the network stack. This change increases performance ~15% on 64b packets with smmu disabled and ~5% with smmu enabled Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:42:41 -08:00
Stefan Wahren	026b907d58	net: qca_spi: Add available buffer space verification Interferences on the SPI line could distort the response of available buffer space. So at least we should check that the response doesn't exceed the maximum available buffer space. In error case increase a new error counter and retry it later. This behavior avoids buffer errors in the QCA7000, which results in an unnecessary chip reset including packet loss. Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:41:01 -08:00
Sagiv Ozeri	fa5c448d98	qed: Fix potential memory corruption A stuck ramrod should be deleted from the completion_pending list, otherwise it will be added again in the future and corrupt the list. Return error value to inform that ramrod is stuck and should be deleted. Signed-off-by: Sagiv Ozeri <sagiv.ozeri@cavium.com> Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:38:19 -08:00
Denis Bolotin	fb5e7438e7	qed: Fix SPQ entries not returned to pool in error flows qed_sp_destroy_request() API was added for SPQ users that need to free/return the entry they acquired in their error flows. Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:38:19 -08:00
Denis Bolotin	2632f22ebd	qed: Fix blocking/unlimited SPQ entries leak When there are no SPQ entries left in the free_pool, new entries are allocated and are added to the unlimited list. When an entry in the pool is available, the content is copied from the original entry, and the new entry is sent to the device. qed_spq_post() is not aware of that, so the additional entry is stored in the original entry as p_post_ent, which can later be returned to the pool. Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:38:18 -08:00
Denis Bolotin	39477551df	qed: Fix memory/entry leak in qed_init_sp_request() Free the allocated SPQ entry or return the acquired SPQ entry to the free list in error flows. Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:38:18 -08:00
Colin Ian King	141b95d551	net: hns3: fix spelling mistake, "assertting" -> "asserting" Trivial fix to spelling mistake in dev_err error message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:07:56 -08:00
Ganesh Goudar	6d444c4efc	cxgb4: Add new T6 PCI device ids 0x608a Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:05:20 -08:00
Huazhong Tan	e12c225258	net: hns3: bugfix for not checking return value hns3_reset_notify_init_enet() only return error early if the return value of hns3_restore_vlan() is not 0. This patch adds checking for the return value of hns3_restore_vlan. Fixes: `7fa6be4fd2` ("net: hns3: fix incorrect return value/type of some functions") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 16:23:49 -08:00
YueHaibing	0db55093b5	net: bcmgenet: return correct value 'ret' from bcmgenet_power_down Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/broadcom/genet/bcmgenet.c: In function 'bcmgenet_power_down': drivers/net/ethernet/broadcom/genet/bcmgenet.c:1136:6: warning: variable 'ret' set but not used [-Wunused-but-set-variable] bcmgenet_power_down should return 'ret' instead of 0. Fixes: `ca8cf34190` ("net: bcmgenet: propagate errors from bcmgenet_power_down") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 16:21:41 -08:00
David S. Miller	5867b33014	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-11-07 This series contains updates to almost all of the Intel wired LAN drivers. Lance Roy replaces a spin lock with lockdep_assert_held() for igbvf driver in move toward trying to remove spin_is_locked(). Colin Ian King fixes a potential null pointer dereference by adding a check in ixgbe. Also fixed the igc driver by properly assigning the return error code of a function call, so that we can properly check it. Shannon Nelson updates the ixgbe driver to not block IPsec offload when in VEPA mode, in VEB mode, IPsec offload is still blocked because the device drops packets into a black hole. Jake adds support for software timestamping for packets sent over ixgbevf. Also modifies i40e, iavf, igb, igc, and ixgbe to delay calling skb_tx_timestamp() to the latest point possible, which is just prior to notifying the hardware of the new Tx packet. Todd adds the new WoL filter flag so that we properly report that we do not support this new feature. YueHaibing from Huawei fixes the igc driver by cleaning up variables that are not "really" used. Dan Carpenter cleans up igc whitespace issues. Miroslav Lichvar fixes e1000e for potential underflow issue in the timecounter, so modify the driver to use timecounter_cyc2time() to allow non-monotonic SYSTIM readings. Sasha provides additional igc cleanups based on community feedback. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 23:07:04 -08:00
John Hurley	e963e1097a	nfp: flower: include geneve as supported offload tunnel type Offload of geneve decap rules is supported in NFP. Include geneve in the check for supported types. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 23:00:23 -08:00
John Hurley	83f27d027d	nfp: flower: use geneve and vxlan helpers Make use of the recently added VXLAN and geneve helper functions to determine the type of the netdev from its rtnl_link_ops. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 23:00:23 -08:00
Edward Cree	cea0604d3f	sfc: add missing NVRAM partition types for EF10 Expose the MUM/SUC Firmware, UEFI Expansion ROM and MC Status partitions of the NIC's NVRAM as MTDs if found on the NIC. The first two are needed in order to properly update them when performing firmware updates; the MC Status partition is used to determine whether a signed firmware image was accepted or rejected by a Secure NIC. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 22:58:00 -08:00
Michał Mirosław	b25ddb00bc	qlcnic: remove assumption that vlan_tci != 0 VLAN.TCI == 0 is perfectly valid (802.1p), so allow it to be accelerated. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 22:37:55 -08:00
Michał Mirosław	e84b47941e	ibmvnic: fix accelerated VLAN handling Don't request tag insertion when it isn't present in outgoing skb. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 22:36:21 -08:00
YueHaibing	f601a85bd7	net: hns3: Remove set but not used variable 'reset_level' Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c: In function 'hclge_log_and_clear_ppp_error': drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c:821:24: warning: variable 'reset_level' set but not used [-Wunused-but-set-variable] enum hnae3_reset_type reset_level = HNAE3_NONE_RESET; It was never used since its introduction in commit `01865a50d7` ("net: hns3: Add enable and process hw errors of TM scheduler") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:46:37 -08:00
Jakub Kicinski	0c665e2bf4	nfp: flower: use the common netdev notifier Use driver's common notifier for LAG and tunnel configuration. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:22 -08:00
Jakub Kicinski	3e33359040	nfp: register a notifier handler in a central location for the device Code interested in networking events registers its own notifier handlers. Create one device-wide notifier instance. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:22 -08:00
Jakub Kicinski	659bb404eb	nfp: flower: make nfp_fl_lag_changels_event() void nfp_fl_lag_changels_event() never fails, and therefore we would never return NOTIFY_BAD for NETDEV_CHANGELOWERSTATE. Make this clearer by changing nfp_fl_lag_changels_event()'s return type to void. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:22 -08:00
Jakub Kicinski	a558c982a8	nfp: flower: don't try to nack device unregister events Returning an error from a notifier means we want to veto the change. We shouldn't veto NETDEV_UNREGISTER just because we couldn't find the tracking info for given master. I can't seem to find a way to trigger this unless we have some other bug, so it's probably not fix-worthy. While at it move the checking if the netdev really is of interest into the handling functions, like we do for other events. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:22 -08:00
Jakub Kicinski	e50bfdf74d	nfp: flower: remove unnecessary iteration over devices For flower tunnel offloads FW has to be informed about MAC addresses of tunnel devices. We use a netdev notifier to keep track of these addresses. Remove unnecessary loop over netdevices after notifier is registered. The intention of the loop was to catch devices which already existed on the system before nfp driver got loaded, but netdev notifier will replay NETDEV_REGISTER events. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:22 -08:00
Pieter Jansen van Vuuren	4234d62c27	nfp: flower: add ipv6 set flow label and hop limit offload Add ipv6 set flow label and hop limit action offload. Since pedit sets headers per 4 byte word, we need to ensure that setting either version, priority, payload_len or nexthdr does not get offloaded. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:21 -08:00
Pieter Jansen van Vuuren	a3c6b063fe	nfp: flower: add ipv4 set ttl and tos offload Add ipv4 set ttl and tos action offload. Since pedit sets headers per 4 byte word, we need to ensure that setting either version, ihl, protocol, total length or checksum does not get offloaded. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:45:21 -08:00
Huazhong Tan	8b0195a305	net: hns3: fix for cmd queue memory not freed problem during reset It is not necessary to reallocate the descriptor and remap the descriptor memory in the reset process, otherwise the memory may not be freed. Also, this patch initializes the cmd queue's spinlocks in hclgevf_alloc_cmd_queue, and takes the spinlocks when reinitializing the cmd queue's registers. Fixes: `fedd0c15d2` ("net: hns3: Add HNS3 VF IMP(Integrated Management Proc) cmd interface") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:42:18 -08:00
Huazhong Tan	65e41e7e68	net: hns3: add error handler for hclge_reset() When hclge_reset() is called, it may fail for several reasons. For example, a higher-level reset event occurs, memory allocation failure, hardware reset timeout, etc. Therefore, it is necessary to add corresponding error handling for these situations. 1. A high-level reset is required due to a high-level reset failure. 2. For memory allocation failure, a high-level reset is initiated by the timer to recover. The reason for using the timer is to prevent this new high-level reset from interrupting the reset process of other pf/vf; 3. For the case of hardware reset timeout, reschedule the reset task to wait for the hardware to complete the reset. For memory allocation failure and reset timeouts, in order to prevent an infinite number of scheduled reset tasks, the number of error recovery attempts needs to be limited. This patch also adds some reset related debug log printing. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:42:18 -08:00
Huazhong Tan	f403a84fb2	net: hns3: call roce's reset notify callback when resetting While resetting, roce should do its uninitialization part before nic's, and do its initialization part after nic's. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:42:18 -08:00
Huazhong Tan	35d93a3004	net: hns3: adjust the process of PF reset When doing PF reset, the driver needs to do some preparatory work before asserting PF reset. Since when hardware is resetting, it is necessary to stop tx/rx queue, clear hardware table, etc, otherwise hardware may run into unrecoverable state if there is still IO running when the hardware is resetting. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:42:18 -08:00
Huazhong Tan	0742ed7c24	net: hns3: move some reset information from hnae3_handle into hclge_dev/hclgevf_dev Saving reset related information in the hclge_dev/hclgevf_dev structure is more suitable than the hnae3_handle, since hardware related information is kept in these two structures. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:42:18 -08:00
Huazhong Tan	7cea834d94	net: hns3: ignore new coming low-level reset while doing high-level reset When processing a higher level reset, the pending lower level reset does not have to be processed anymore, because the higher level reset is the superset of the lower level reset. Therefore, when processing a higher level reset, the request of lower level reset needs to be cleared. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:42:17 -08:00
Huazhong Tan	257e4f2994	net: hns3: use HNS3_NIC_STATE_RESETTING to indicate resetting While hclge is going to reset, it will notify its client with HNAE3_DOWN_CLIENT, so this client should get into a resetting status from this moment, other operations from the stack need to be blocked as well. And when the reset is finished, the client will be notified with HNAE3_UP_CLIENT, so this is the end of the resetting status. This patch uses HNS3_NIC_STATE_RESETTING flag to implement that, and adds hns3_nic_resetting() to indicate which operation is not allowed. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:42:17 -08:00
Huazhong Tan	8df0fa9168	net: hns3: enable/disable ring in the enet while doing UP/DOWN While hardware gets into reset status, the firmware will not respond to driver's command request, which may cause ring not disabled problem during reset process. So this patch uses register instead of command to enable/disable the ring in the enet while doing UP/DOWN operation. Also, HNS3_RING_RX_VM_REG is previously unused, so change it to the correct meaning, and add a wrapper function for readl(). Fixes: `46a3df9f97` ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:42:17 -08:00
Huazhong Tan	7edff5339a	net: hns3: adjust the location of clearing the table when doing reset When doing a function reset, the hardware table should be cleared before the hardware reset. In current code, this clearing is done in hns3_reset_notify_uninit_enet, but it is too late, because the hardware reset is already done, hns3_reset_notify_down_enet is more suitable to do that. Fixes: `bb6b94a896` ("net: hns3: Add reset interface implementation in client") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:42:17 -08:00
Huazhong Tan	4d60291b6b	net: hns3: provide some interface & information for the client The client needs to know if the hardware is resetting when loading or unloading itself, because client may abort the loading process or wait for the reset process to finish when unloading if hardware is resetting. So this patch provides these interfaces to do it. 1. get_hw_reset_stat, the reset status of hardware. 2. ae_dev_resetting, whether reset task is scheduling. 3. ae_dev_reset_cnt, how many resets have been done. Also, the RoCE client needs some field in the hnae3_roce_private_info to save its state, and process_hw_error interface in the hnae3_client_ops to process hardware errors. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:42:17 -08:00
Huazhong Tan	720bd5837e	net: hns3: add set_default_reset_request in the hnae3_ae_ops Currently, when reset_event is called because of tx timeout, it will upgrade the reset level (For PF, HNAE3_FUNC_RESET -> HNAE3_CORE_RESET -> HNAE3_GLOBAL_RESET) if the time between the new reset and last reset is within 20 secs, or restore the reset level to HNAE3_FUNC_RESET if the time between the new reset and last reset is over 20 secs. There is requirement that the caller needs to decide the reset level when triggering a reset, for example, RAS recovery. So this patch adds the set_default_reset_request to meet this requirement. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:42:17 -08:00
Huazhong Tan	814da63c55	net: hns3: use HNS3_NIC_STATE_INITED to indicate the initialization state of enet Besides module_init and module_exit, the process of reset will also uninitialize and initialize the enet client. When the reset process fails with the enet client uninitialized, the module_exit does not need to uninitialize the enet client, otherwise it may cause a double uninitialization problem. So we need the HNS3_NIC_STATE_INITED flag to indicate whether the enet client is initialized. Also HNS3_NIC_STATE_REINITING is previously unused, so change it to HNS3_NIC_STATE_INITED. Fixes: `bb6b94a896` ("net: hns3: Add reset interface implementation in client") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-07 11:42:17 -08:00
Jacob Keller	d5596fd467	i40e: enable NETIF_F_NTUPLE and NETIF_F_HW_TC at driver load The assignment of the feature flag NETIF_F_NTUPLE and NETIF_F_HW_TC occurs prior to the initial setup of the local hw_features variable. This means the features are set as user-changeable, but are not set in the currently active feature list. This results in the features being disabled at the driver's initial load. Move the assignment after the initial assignment of hw_features, and assign to the local variable. This ensures that NETIF_F_NTUPLE and NETIF_F_HW_TC are marked as user-changeable, and also enables them by default when the driver loads. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 10:32:15 -08:00
Sasha Neftin	920664a8f7	igc: Clean up code Address few community comments. Remove unused code, will be added per demand. Remove blank lines and unneeded includes. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:47:01 -08:00
Miroslav Lichvar	e1f65b0d70	e1000e: allow non-monotonic SYSTIM readings It seems with some NICs supported by the e1000e driver a SYSTIM reading may occasionally be few microseconds before the previous reading and if enabled also pass e1000e_sanitize_systim() without reaching the maximum number of rereads, even if the function is modified to check three consecutive readings (i.e. it doesn't look like a double read error). This causes an underflow in the timecounter and the PHC time jumps hours ahead. This was observed on 82574, I217 and I219. The fastest way to reproduce it is to run a program that continuously calls the PTP_SYS_OFFSET ioctl on the PHC. Modify e1000e_phc_gettime() to use timecounter_cyc2time() instead of timecounter_read() in order to allow non-monotonic SYSTIM readings and prevent the PHC from jumping. Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:47:01 -08:00
Dan Carpenter	bb9089b668	igc: Tidy up some white space I just cleaned up a couple small white space issues. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:47:01 -08:00
Colin Ian King	14b21cec85	igc: fix error return handling from call to netif_set_real_num_tx_queues The call to netif_set_real_num_tx_queues is not assigning the error return to variable err even though the next line checks err for an error. Fix this by adding the missing err assignment. Detected by CoverityScan, CID#1474551 ("Logically dead code") Fixes: `3df25e4c1e` ("igc: Add interrupt support") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:47:01 -08:00
YueHaibing	84cfa53740	igc: Remove set but not used variable 'pci_using_dac' Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/intel/igc/igc_main.c: In function 'igc_probe': drivers/net/ethernet/intel/igc/igc_main.c:3535:11: warning: variable 'pci_using_dac' set but not used [-Wunused-but-set-variable] It has never been used since its introduction in commit `d89f88419f` ("igc: Add skeletal frame for Intel(R) 2.5G Ethernet Controller support") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:47:01 -08:00
YueHaibing	dda458d285	igc: Remove set but not used variables 'ctrl_ext, link_mode' Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/intel/igc/igc_base.c: In function 'igc_init_phy_params_base': drivers/net/ethernet/intel/igc/igc_base.c:240:6: warning: variable 'ctrl_ext' set but not used [-Wunused-but-set-variable] u32 ctrl_ext; drivers/net/ethernet/intel/igc/igc_base.c: In function 'igc_get_invariants_base': drivers/net/ethernet/intel/igc/igc_base.c:290:6: warning: variable 'link_mode' set but not used [-Wunused-but-set-variable] u32 link_mode = 0; They have never been used since their introduction in commit `c0071c7aa5` ("igc: Add HW initialization code") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:47:01 -08:00
Todd Fujinaka	540a152da7	i40e/ixgbe/igb: fail on new WoL flag setting WAKE_MAGICSECURE There's a new flag for setting WoL filters that is only enabled on one manufacturer's NICs, and it's not ours. Fail with EOPNOTSUPP. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:47:01 -08:00
Jacob Keller	a9e510589d	intel-ethernet: software timestamp skbs as late as possible Many of the Intel Ethernet drivers call skb_tx_timestamp() earlier than necessary. Move the calls to this function to the latest point possible, just prior to notifying hardware of the new Tx packet when we bump the tail register. This affects i40e, iavf, igb, igc, and ixgbe. The e100, e1000, e1000e, fm10k, and ice drivers already call the skb_tx_timestamp() function just prior to indicating the Tx packet to hardware, so they do not need to be changed. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:47:01 -08:00
Jacob Keller	9fc145fcb5	ixgbevf: add support for software timestamps Add a call to skb_tx_timestamp in the ixgbevf_tx_map function. This enables software timestamping for packets sent over this device driver. The call is placed just prior to when we notify hardware of the new packet, in order to software timestamp as close as possible to when the hardware will transmit. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:47:00 -08:00
Shannon Nelson	7fa57ca443	ixgbe: allow IPsec Tx offload in VEPA mode When it's possible that the PF might end up trying to send a packet to one of its own VFs, we have to forbid IPsec offload because the device drops the packets into a black hole. See commit `47b6f50077` ("ixgbe: disallow IPsec Tx offload when in SR-IOV mode") for more info. This really is only necessary when the device is in the default VEB mode. If instead the device is running in VEPA mode, the packets will go through the encryption engine and out the MAC/PHY as normal, and get "hairpinned" as needed by the switch. So let's not block IPsec offload when in VEPA mode. To get there with the ixgbe device, use the handy 'bridge' command: bridge link set dev eth1 hwmode vepa Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:47:00 -08:00
Colin Ian King	0db4a47c05	ixgbe: don't clear_bit on xdp_ring->state if xdp_ring is null There is an earlier check to see if xdp_ring is null when configuring the tx ring, so assuming that it can still be null, the clearing of the xdp_ring->state currently could end up with a null pointer dereference. Fix this by only clearing the bit if xdp_ring is not null. Detected by CoverityScan, CID#1473795 ("Dereference after null check") Fixes: `024aa5800f` ("ixgbe: added Rx/Tx ring disable/enable functions") Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:47:00 -08:00
Lance Roy	b86077207d	igbvf: Replace spin_is_locked() with lockdep lockdep_assert_held() is better suited to checking locking requirements, since it won't get confused when someone else holds the lock. This is also a step towards possibly removing spin_is_locked(). Signed-off-by: Lance Roy <ldr709@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:47:00 -08:00
Jacob Keller	ba766b8b99	i40e: restore NETIF_F_GSO_IPXIP[46] to netdev features Since commit `bacd75cfac` ("i40e/i40evf: Add capability exchange for outer checksum", 2017-04-06) the i40e driver has not reported support for IP-in-IP offloads. This likely occurred due to a bad rebase, as the commit extracts hw_enc_features into its own variable. As part of this change, it dropped the NETIF_F_GSO_IPXIP flags from the netdev->hw_enc_features. This was unfortunately not caught during code review. Fix this by adding back the missing feature flags. For reference, NETIF_F_GSO_IPXIP4 was added in commit `7e13318daa` ("net: define gso types for IPx over IPv4 and IPv6", 2016-05-20), replacing NETIF_F_GSO_IPIP and NETIF_F_GSO_SIT. NETIF_F_GSO_IPXIP6 was added in commit `bf2d1df395` ("intel: Add support for IPv6 IP-in-IP offload", 2016-05-20). Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:45:42 -08:00
Chinh T Cao	ffe498237b	ice: Change req_speeds to be u16 Since the req_speeds field in struct ice_link_status is a u8, req_speeds & ICE_AQ_LINK_SPEED_40GB always returns 0. This was caught by a coverity scan. Fix this by changing req_speeds to be u16. Reported-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-07 09:37:28 -08:00
Florian Fainelli	da106a140f	net: systemport: Unmap queues upon DSA unregister event Binding and unbinding the switch driver which creates the DSA slave network devices for which we set up inspection would lead to undesirable effects since we were not clearing the port/queue mapping to the SYSTEMPORT TX queue. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-06 15:39:48 -08:00
Florian Fainelli	25c4407046	net: systemport: Simplify queue mapping logic The use of a bitmap speeds up the finding of the first available queue to which we could start establishing the mapping for, but we still have to loop over all slave network devices to set them up. Simplify the logic to have a single loop, and use the fact that a correctly configured ring has inspect set to true. This will make things simpler to unwind during device unregistration. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-06 15:39:48 -08:00
Florian Fainelli	80f8dea876	net: systemport: Restore Broadcom tag match filters upon resume Some of the system suspend states that we support entirely wipe out the HW contents. If we had a Wake-on-LAN filter programmed prior to going into suspend, but we did not actually wake up from Wake-on-LAN and instead used a deeper suspend state, make sure we restore the CID number that we need to match against. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-06 15:05:22 -08:00
Miroslav Lichvar	4c9b658eea	igb: shorten maximum PHC timecounter update interval The timecounter needs to be updated at least once per ~550 seconds in order to prevent a 40-bit SYSTIM timestamp from being misinterpreted as an old timestamp. Since commit `500462a9de` ("timers: Switch to a non-cascading wheel"), scheduling of delayed work seems to be less accurate and a requested delay of 540 seconds may actually be longer than 550 seconds. Also, the PHC may be adjusted to run up to 6% faster than real time and the system clock up to 10% slower. Shorten the delay to 360 seconds to be sure the timecounter is updated in time. This fixes an issue with HW timestamps on 82580/I350/I354 being off by ~1100 seconds for a few seconds every ~9 minutes. Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:54:27 -08:00
Brett Creeley	d944b46992	ice: Fix the bytecount sent to netdev_tx_sent_queue Currently if the driver does a TSO offload the bytecount sent to netdev_tx_sent_queue will be incorrect. This is because in ice_tso we overwrite the initial value that we set in ice_tx_map. This creates a mismatch between the Tx and Tx clean flow. In the Tx clean flow we calculate the bytecount (called total_bytes) as we clean the descriptors so the value used in the Tx clean path is correct. Fix this by using += in ice_tso instead of =. This fixes the mismatch in bytecount mentioned above. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Brett Creeley	c585ea42ec	ice: Fix tx_timeout in PF driver Prior to this commit the driver was running into tx_timeouts when a queue was stressed enough. This was happening because the HW tail and SW tail (NTU) were incorrectly out of sync. Consequently this was causing the HW head to collide with the HW tail, which to the hardware means that all descriptors posted for Tx have been processed. Due to the Tx logic used in the driver SW tail and HW tail are allowed to be out of sync. This is done as an optimization because it allows the driver to write HW tail as infrequently as possible, while still updating the SW tail index to keep track. However, there are situations where this results in the tail never getting updated, resulting in Tx timeouts. Tx HW tail write condition: if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) writel(sw_tail, tx_ring->tail); An issue was found in the Tx logic that was causing the aforementioned condition for updating HW tail to never happen, causing tx_timeouts. In ice_xmit_frame_ring we calculate how many descriptors we need for the Tx transaction based on the skb the kernel hands us. This is then passed into ice_maybe_stop_tx along with some extra padding to determine if we have enough descriptors available for this transaction. If we don't then we return -EBUSY to the stack, otherwise we move on and eventually prepare the Tx descriptors accordingly in ice_tx_map and set next_to_watch. In ice_tx_map we make another call to ice_maybe_stop_tx with a value of MAX_SKB_FRAGS + 4. The key here is that this value is possibly less than the value we sent in the first call to ice_maybe_stop_tx in ice_xmit_frame_ring. Now, if the number of unused descriptors is between MAX_SKB_FRAGS + 4 and the value used in the first call to ice_maybe_stop_tx in ice_xmit_frame_ring then we do not update the HW tail because of the "Tx HW tail write condition" above. This is because in ice_maybe_stop_tx we return success from ice_maybe_stop_tx instead of calling __ice_maybe_stop_tx and subsequently calling netif_stop_subqueue, which sets the __QUEUE_STATE_DEV_XOFF bit. This bit is then checked in the "Tx HW tail write condition" by calling netif_xmit_stopped and subsequently updating HW tail if the aforementioned bit is set. In ice_clean_tx_irq, if next_to_watch is not NULL, we end up cleaning the descriptors that HW sets the DD bit on and we have the budget. The HW head will eventually run into the HW tail in response to the description in the paragraph above. The next time through ice_xmit_frame_ring we make the initial call to ice_maybe_stop_tx with another skb from the stack. This time we do not have enough descriptors available and we return NETDEV_TX_BUSY to the stack and end up setting next_to_watch to NULL. This is where we are stuck. In ice_clean_tx_irq we never clean anything because next_to_watch is always NULL and in ice_xmit_frame_ring we never update HW tail because we already return NETDEV_TX_BUSY to the stack and eventually we hit a tx_timeout. This issue was fixed by making sure that the second call to ice_maybe_stop_tx in ice_tx_map is passed a value that is >= the value that was used on the initial call to ice_maybe_stop_tx in ice_xmit_frame_ring. This was done by adding the following defines to make the logic more clear and to reduce the chance of mucking this up again: ICE_CACHE_LINE_BYTES 64 ICE_DESCS_PER_CACHE_LINE (ICE_CACHE_LINE_BYTES / sizeof(struct ice_tx_desc)) ICE_DESCS_FOR_CTX_DESC 1 ICE_DESCS_FOR_SKB_DATA_PTR 1 The ICE_CACHE_LINE_BYTES being 64 is an assumption being made so we don't have to figure this out on every pass through the Tx path. Instead I added a sanity check in ice_probe to verify cache line size and print a message if it's not 64 Bytes. This will make it easier to file issues if they are seen when the cache line size is not 64 Bytes when reading from the GLPCI_CNF2 register. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Dave Ertman	25525b69bb	ice: Fix napi delete calls for remove In the remove path, the vsi->netdev is being set to NULL before the call to free vectors. This is causing the netif_napi_del call to never be made. Add a call to ice_napi_del to the same location as the calls to unregister_netdev and just prior to them. This will use the reverse flow as the register and netif_napi_add calls. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Anirudh Venkataramanan	31082519c1	ice: Fix typo in error message Print should say "Enabling" instead of "Enaabling" Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Md Fahad Iqbal Polash	58297dd133	ice: Fix flags for port VLAN According to the spec, whenever insert PVID field is set, the VLAN driver insertion mode should be set to 01b which isn't done currently. Fix it. Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Anirudh Venkataramanan	9ecd25c268	ice: Remove duplicate addition of VLANs in replay path ice_restore_vlan and active_vlans were originally put in place to reprogram VLAN filters in the replay path. This is now done as part of the much broader VSI rebuild/replay framework. So remove both ice_restore_vlan and active_vlans Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Victor Raj	33e055fcc2	ice: Free VSI contexts during unload In the unload path, all VSIs are freed. Also free the related VSI contexts to prevent memory leaks. Signed-off-by: Victor Raj <victor.raj@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Akeem G Abodunrin	0f5d4c21a5	ice: Fix dead device link issue with flow control Setting Rx or Tx pause parameter currently results in link loss on the interface, requiring the platform/host to be cold power cycled. Fix it. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:47 -08:00
Anirudh Venkataramanan	afd9d4ab58	ice: Check for reset in progress during remove The remove path does not currently check to see if a reset is in progress before proceeding. This can cause a resource collision resulting in various types of errors. Check for reset in progress and wait for a reasonable amount of time before allowing the remove to progress. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:46 -08:00
Anirudh Venkataramanan	ce317dd9f8	ice: Set carrier state and start/stop queues in rebuild Set the carrier state post rebuild by querying the link status. Also start/stop queues based on link status. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-06 12:46:46 -08:00
Rasmus Villemoes	7131193157	net: alx: make alx_drv_name static alx_drv_name is not used outside main.c, so there's no reason for it to have external linkage. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-05 17:12:58 -08:00
Shalom Toledo	96801552f8	mlxsw: spectrum: Fix IP2ME CPU policer configuration The CPU policer used to police packets being trapped via a local route (IP2ME) was incorrectly configured to police based on bytes per second instead of packets per second. Change the policer to police based on packets per second and avoid packet loss under certain circumstances. Fixes: `9148e7cf73` ("mlxsw: spectrum: Add policers for trap groups") Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-03 19:31:42 -07:00
Arnd Bergmann	9261921052	qed: fix link config error handling gcc-8 notices that qed_mcp_get_transceiver_data() may fail to return a result to the caller: drivers/net/ethernet/qlogic/qed/qed_mcp.c: In function 'qed_mcp_trans_speed_mask': drivers/net/ethernet/qlogic/qed/qed_mcp.c:1955:2: error: 'transceiver_type' may be used uninitialized in this function [-Werror=maybe-uninitialized] When an error is returned by qed_mcp_get_transceiver_data(), we should propagate that to the caller of qed_mcp_trans_speed_mask() rather than continuing with uninitialized data. Fixes: `c56a8be7e7` ("qed: Add supported link and advertise link to display in ethtool.") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-03 19:27:33 -07:00
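
The interval arithmetic behind the igb timecounter commit listed above (4c9b658eea) can be checked with a short sketch. It is not driver code. It assumes the 40-bit SYSTIM counter advances 1 ns per tick, and it models clock drift in the simplest way: a system clock running 10% slow stretches the real delay, and a PHC running 6% fast stretches the time the counter sees. The function names are hypothetical, and timer-wheel inaccuracy, which the commit also mentions, is left out.

```python
# Sketch only: assumes a 40-bit SYSTIM counter with 1 ns per tick (assumption,
# not stated in the commit message) and a simple multiplicative drift model.

SYSTIM_BITS = 40
NS_PER_SEC = 1_000_000_000


def wrap_period_s(bits: int = SYSTIM_BITS) -> float:
    """Seconds until a counter of `bits` bits at 1 ns/tick wraps around."""
    return (1 << bits) / NS_PER_SEC


def max_safe_interval_s(bits: int = SYSTIM_BITS) -> float:
    """Half the wrap period: beyond this, a new timestamp can look older."""
    return wrap_period_s(bits) / 2


def worst_case_phc_elapsed_s(requested_delay_s: float,
                             phc_speedup: float = 0.06,
                             sysclk_slowdown: float = 0.10) -> float:
    """PHC time that may pass during a delay measured by a slow system clock."""
    real_elapsed = requested_delay_s / (1.0 - sysclk_slowdown)
    return real_elapsed * (1.0 + phc_speedup)


if __name__ == "__main__":
    limit = max_safe_interval_s()
    for delay in (540, 360):
        elapsed = worst_case_phc_elapsed_s(delay)
        verdict = "safe" if elapsed < limit else "too long"
        print(f"delay {delay} s: worst case {elapsed:.0f} s, "
              f"limit {limit:.0f} s, {verdict}")
```

Under these assumptions the counter wraps after roughly 1100 seconds, which matches the size of the timestamp error the commit reports, and the safe update interval is about 550 seconds. A 540-second delay can stretch past that limit in the worst case, while a 360-second delay stays below it.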

25805 Commits