linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-26 19:55:21 +07:00

Author	SHA1	Message	Date
Jacob Keller	40de1fad41	fm10k: split fm10k_reinit into two functions There are several flows in the driver which perform the similar function of tearing down software and restoring software to recover from certain errors or PCIe events, including: * fm10k_reinit * fm10k_suspend/resume * fm10k_io_error_detected/fm10k_io_resume In addition, we want to implement a .reset_notify() handler as well which will also perform a similar function. Rework how the driver codes reset and resume flows by separating out the reinit logic into two functions "fm10k_prepare_for_reset" and "fm10k_handle_reset". This first step will allow us to re-use this functionality in the similar blocks of code instead of re-coding the same sequence of events slightly differently. The end result should be more maintainable and correct, fixing several inconsistencies with the work flow. The new functions expect to take the rtnl_lock() themselves, and it does have the unfortunate side effect of having the reinit flow take then release then take the rtnl_lock. However, this minor downside is outweighed by the benefits of code reduction and reducing needless difference between these flows. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:13 -07:00
Jacob Keller	94877768cf	fm10k: wait for queues to drain if stop_hw() fails once It turns out that sometimes during a reset the Tx queues will be temporarily stuck longer than .stop_hw() expects. Work around this issue by attempting to .stop_hw() first. If it fails, wait a number of attempts until the Tx queues appear to be drained. After this, attempt stop_hw() again. This ensures that we avoid waiting if we don't need to, such as during the first initialization of a VF, and give the proper amount of time necessary to recover from most situations. It is possible that the hardware is actually stuck. For PFs, this is usually fixed by a datapath reset. Unfortunately the VF cannot request a similar reset for itself. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:12 -07:00
Jacob Keller	106ca42356	fm10k: only warn when stop_hw fails with FM10K_ERR_REQUESTS_PENDING When stop_hw() routine fails with FM10K_ERR_REQUESTS_PENDING, this indicates that the Tx or Rx queues did not shutdown within the time limit. Print a more suitable message at the dev_info level instead of dev_err. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:12 -07:00
Jacob Keller	34bad71c7c	fm10k: use actual hardware registers when checking for pending Tx Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:11 -07:00
Jacob Keller	892c9e0872	fm10k: perform data path reset even when switch is not ready A while ago, an additional check for the switch being ready was added to reset_hw. A recent refactor accidentally made this check return an error code on failure which caused fm10k_probe to fail when the switch wasn't brought up first. The original reasoning for the check was to prevent additional data path reset when the fabric wasn't ready yet. However, there isn't a compelling reason to keep the check, as the data path reset will restore hardware to a known good state. Remove the check and perform the data path reset regardless of the switch manager state. An alternative fix is to return FM10K_SUCCESS instead, and bypass the actual data path reset. This should be fine as we will perform a reset_hw once the switch is active. However, since data path reset will reset many parts of the hardware it seems better to just perform the reset regardless of switch state. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:11 -07:00
Jacob Keller	ce33624f37	fm10k: don't stop reset due to FM10K_ERR_REQUESTS_PENDING Don't report FM10K_ERR_REQUESTS_PENDING when we fail to disable queues within the timeout. This can occur due to a hardware Tx hang, or when the switch ethernet fabric is resetting while we are transmitting traffic. It can sometimes take up to 500ms before the Tx DMA engine gives up. Instead, just skip the DMA engine check and perform a data-path reset anyways. Add a statistic counter to keep track of the number of resets occurring while we have pending DMA on the rings. In order to prevent having to re-assign err to 0, re-order the last few items of the reset_hw_pf function so that we don't perform "return err" at the end. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:11 -07:00
Ngai-Mint Kwan	5e93cbadd3	fm10k: Reset mailbox global interrupts When a data path reset is initiated, write control to the PCIE_GMBX is yanked from the switch manager. The switch manager writes to this register to clear mailbox global interrupt bits as part of its mailbox interrupt handling routine. When the device recovers from the data path reset and these bits are not cleared, it will prevent future mailbox global interrupts from being triggered. Upon confirming that the device has exited from a data path reset, clear these bits to ensure the proper functioning of the mailbox global interrupt. Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:10 -07:00
Jacob Keller	9d73edee59	fm10k: prevent multiple threads updating statistics Also prevent updating stats while the interface is down. If we're already updating stats, just return doing nothing. When we take the device down, block stat updates until we come back up. This ensures that we avoid tearing down rings when we're updating statistics, and prevents updating statistics until we're up. We can't re-use the __FM10K_DOWN for this because it wouldn't prevent multiple threads from accessing statistics. Neither does it prevent the case where we start updating stats and then start going down in another thread. The fm10k_get_stats64 is exempt from this, because it has a completely different flow which does not suffer from the same issues as fm10k_update_stats might. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:10 -07:00
Jacob Keller	b624714bc9	fm10k: avoid possible null pointer dereference in fm10k_update_stats It's currently possible for fm10k_update_stats to be called during the window when we go down and the rings are removed. This can result in a null pointer dereference. In fm10k_get_stats64 we work around this by using ACCESS_ONCE and a null pointer check inside the loop. Use this same flow in the fm10k_update_stats to avoid the potential null pointer. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:09 -07:00
Jacob Keller	1b00c6c064	fm10k: no need to continue in fm10k_down if __FM10K_DOWN already set Return early from fm10k_down() when we are already down, since that means another thread is either already finished or has started going down, so shouldn't conflict with them. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-20 15:22:09 -07:00
Guilherme G. Piccoli	7f6c553902	i40e: use valid online CPU on q_vector initialization Currently, the q_vector initialization routine sets the affinity_mask of a q_vector based on v_idx value. Meaning a loop iterates on v_idx, which is an incremental value, and the cpumask is created based on this value. This is a problem in systems with multiple logical CPUs per core (like in SMT scenarios). If we disable some logical CPUs, by turning SMT off for example, we will end up with a sparse cpu_online_mask, i.e., only the first CPU in a core is online, and incremental filling in q_vector cpumask might lead to multiple offline CPUs being assigned to q_vectors. Example: if we have a system with 8 cores each one containing 8 logical CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is disabled, only the 1st CPU in each core remains online, so the cpu_online_mask in this case would have only 8 bits set, in a sparse way. In the general case, when SMT is off the cpu_online_mask has only C bits set: 0, N, 2N, ..., (C-1)*N where C == # of cores; N == # of logical CPUs per core. In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set. This patch changes the way q_vector's affinity_mask is created: it iterates on v_idx, but consumes the CPU index from the cpu_online_mask instead of just using the v_idx incremental value. No functional changes were introduced. Signed-off-by: Guilherme G Piccoli <gpiccoli@linux.vnet.ibm.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-14 23:39:12 -07:00
Paolo Abeni	4b732cd4bb	ixgbe: napi_poll must return the work done Currently the function ixgbe_poll() returns 0 when it completely cleans the rx rings, but this fouls budget accounting in core code. Fix this by returning the actual work done, capped to weight - 1, since the core doesn't allow returning the full budget when the driver modifies the napi status. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Venkatesh Srinivas <venkateshs@google.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-14 23:34:52 -07:00
Kiran Patil	f6bd09625b	i40e: enable VSI broadcast promiscuous mode instead of adding broadcast filter This patch sets VSI broadcast promiscuous mode during VSI add sequence and prevents adding MAC filter if specified MAC address is broadcast. Change-ID: Ia62251fca095bc449d0497fc44bec3a5a0136773 Signed-off-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-14 23:26:59 -07:00
Alexander Duyck	858296c878	i40e/i40evf: Fix i40e_rx_checksum There are a couple of issues I found in i40e_rx_checksum while doing some recent testing. As a result I have found the Rx checksum logic is pretty much broken and returning that the checksum is valid for tunnels in cases where it is not. First the inner types are not the correct values to use to test for if a tunnel is present or not. In addition the inner protocol types are not a bitmask as such performing an OR of the values doesn't make sense. I have instead changed the code so that the inner protocol types are used to determine if we report CHECKSUM_UNNECESSARY or not. For anything that does not end in UDP, TCP, or SCTP it doesn't make much sense to report a checksum offload since it won't contain a checksum anyway. This leaves us with the need to set the csum_level based on some value. For that purpose I am using the tunnel_type field. If the tunnel type is GRENAT or greater then this means we have a GRE or UDP tunnel with an inner header. In the case of GRE or UDP we will have a possible checksum present so for this reason it should be safe to set the csum_level to 1 to indicate that we are reporting the state of the inner header. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-07-14 23:17:45 -07:00
Sabrina Dubroca	e5de25dce9	drivers/net: fixup comments after "Future-proof tunnel offload handlers" Some comments weren't updated to reflect the renaming of ndo's and the change of arguments. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-11 13:42:11 -07:00
David S. Miller	30d0844bdc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/mellanox/mlx5/core/en.h drivers/net/ethernet/mellanox/mlx5/core/en_main.c drivers/net/usb/r8152.c All three conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-06 10:35:22 -07:00
David S. Miller	435c556cde	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2016-06-29 This series contains updates and fixes to e1000e, igb, ixgbe and fm10k. A true smorgasbord of changes. Jake cleans up some obscurity by not using the BIT() macro on bitshift operation and also fixed the calculated index when looping through the indir array. Fixes the issue with igb's workqueue item for overflow check from causing a surprise remove event. The ptp_flags variable is added to simplify the work of writing several complex MAC type checks in the PTP code while fixing the workqueue. Alex Duyck fixes the receive buffers alignment which should not be L1 cache aligned, but to 512 bytes instead. Denys Vlasenko prevents a division by zero which was reported under VMWare for e1000e. Amritha fixes an issue where filters in a child hash table must be cleared from the hardware before delete the filter links in ixgbe. Bhaktipriya Shridhar simply replaces the deprecated create_workqueue() with alloc_workqueue() for fm10k. Tony corrects ixgbe ethtool reporting to show x550 supports hardware timestamping of all packets. Emil fixes an issue where MAC-VLANs on the VF fail to pass traffic due to spoofed packets. Andrew Lunn increases performance on some systems where syncing a buffer for DMA is expensive. So rather than sync the whole 2K receive buffer, only synchronize the length of the frame. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-30 09:29:07 -04:00
David S. Miller	ee58b57100	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Several cases of overlapping changes, except the packet scheduler conflicts which deal with the addition of the free list parameter to qdisc_enqueue(). Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-30 05:03:36 -04:00
Andrew Lunn	64f2525ca4	igb: Only DMA sync frame length On some platforms, syncing a buffer for DMA is expensive. Rather than sync the whole 2K receive buffer, only synchronise the length of the frame, which will typically be the MTU, or a much smaller TCP ACK. For an IMX6Q, this gives around 6% increased TCP receive performance, which is cache operations bound and reduces CPU load for TCP transmit. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 13:59:24 -07:00
Emil Tantilov	581e0c7df9	ixgbe: fix spoofed packets with macvlans When setting spoofing, both VLAN and MAC need to be set together. This change resolves an issue where MAC-VLANs on the VF fail to pass traffic due to spoofed packets. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 13:06:31 -07:00
Tony Nguyen	918b89e77f	ixgbe: Correct reporting of timestamping for x550 Update ixgbe_ethtool_get_ts_info() to show that x550 supports hardware timestamping of all packets. Reported-by: Guy Harris <guy@alum.mit.edu> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 12:57:19 -07:00
Bhaktipriya Shridhar	0a38c17a21	fm10k: Remove create_workqueue alloc_workqueue replaces deprecated create_workqueue(). A dedicated workqueue has been used since the workitem (viz fm10k_service_task, which manages and runs other subtasks) is involved in normal device operation and requires forward progress under memory pressure. create_workqueue has been replaced with alloc_workqueue with max_active as 0 since there is no need for throttling the number of active work items. Since network devices may be used in memory reclaim path, WQ_MEM_RECLAIM has been set to guarantee forward progress. flush_workqueue is unnecessary since destroy_workqueue() itself calls drain_workqueue() which flushes repeatedly till the workqueue becomes empty. Hence the call to flush_workqueue() has been dropped. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 11:18:36 -07:00
Jacob Keller	8646f7b4cd	igb: call igb_ptp_suspend during suspend/resume cycle Properly stop the extra workqueue items and ensure that we resume cleanly. This is better than using igb_ptp_init and igb_ptp_stop since these functions destroy the PHC device, which will cause other problems if we do so. Since igb_ptp_reset now re-schedules the work-queue item we don't need an equivalent igb_ptp_resume in the resume workflow. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 11:14:31 -07:00
Jacob Keller	e3f2350de8	igb: implement igb_ptp_suspend Make igb_ptp_stop take advantage of this new function to reduce code duplication. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 11:00:22 -07:00
Jacob Keller	4f3ce71bb8	igb: re-use igb_ptp_reset in igb_ptp_init Modify igb_ptp_init to take advantage of igb_ptp_reset, and remove duplicated work that was occurring in both igb_ptp_reset and igb_ptp_init. In total, resetting the TSAUXC register, and resetting the system time both happen in igb_ptp_reset already. igb_ptp_reset now also takes care of starting the delayed work item for overflow checks, as well. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 10:56:11 -07:00
Jacob Keller	63737166a0	igb: introduce IGB_PTP_OVERFLOW_CHECK flag Don't continue to use complex MAC type checks for handling various cases where we have overflow check code. Make this code more obvious by introducing a flag which is enabled for hardware that needs these checks. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 10:51:34 -07:00
Jacob Keller	462f118882	igb: introduce ptp_flags variable and use it to replace IGB_FLAG_PTP Upcoming patches will introduce new PTP specific flags. To avoid cluttering the normal flags variable, introduce PTP specific "ptp_flags" variable for this purpose, and move IGB_FLAG_PTP to become IGB_PTP_ENABLED. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 10:48:07 -07:00
Amritha Nambiar	12746fd21e	ixgbe: Error handler for duplicate filter locations in hardware for cls_u32 offloads For u32 classifier filters, avoid overwriting existing filter in a hardware location without removing it first, to clean up inconsistencies due to duplicate values for filter location. Verified with the following filters: Create child hash tables: handle 1: u32 divisor 1 handle 2: u32 divisor 1 Link to the child hash table from parent hash table: handle 800:0:11 u32 ht 800: link 1: \ offset at 0 mask 0f00 shift 6 plus 0 eat \ match ip protocol 6 ff match ip dst 15.0.0.1/32 handle 800:0:12 u32 ht 800: link 2: \ offset at 0 mask 0f00 shift 6 plus 0 eat \ match ip protocol 17 ff match ip dst 16.0.0.1/32 Add filter into child hash table: handle 1:0:3 u32 ht 1: \ match tcp src 22 ffff action drop Add another filter to the same location: handle 2:0:3 u32 ht 2: \ match tcp src 33 ffff action drop Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 10:44:02 -07:00
Amritha Nambiar	1ecedc926b	ixgbe: Fix deleting link filters for cls_u32 offloads On deleting filters which are links to a child hash table, the filters in the child hash table must be cleared from the hardware if there is no link between the parent and child hash table. Verified with the following filters: Create a child hash table: handle 1: u32 divisor 1 Link to the child hash table from parent hash table: handle 800:0:10 u32 ht 800: link 1: \ offset at 0 mask 0f00 shift 6 plus 0 eat \ match ip protocol 6 ff match ip dst 15.0.0.1/32 Add filters into child hash table: handle 1:0:2 u32 ht 1: \ match tcp src 22 ffff action drop handle 1:0:3 u32 ht 1: \ match tcp src 33 ffff action drop Delete link filter from parent hash table: handle 800:0:10 u32 Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 10:05:24 -07:00
Denys Vlasenko	3d05b15b03	e1000e: prevent division by zero if TIMINCA is zero Users report that under VMWare, er32(TIMINCA) returns zero. This causes division by zero at init time as follows: ==> incvalue = er32(TIMINCA) & E1000_TIMINCA_INCVALUE_MASK; for (i = 0; i < E1000_MAX_82574_SYSTIM_REREADS; i++) { /* latch SYSTIMH on read of SYSTIML */ systim_next = (cycle_t)er32(SYSTIML); systim_next |= (cycle_t)er32(SYSTIMH) << 32; time_delta = systim_next - systim; temp = time_delta; ====> rem = do_div(temp, incvalue); This change makes kernel survive this, and users report that NIC does work after this change. Since on real hardware incvalue is never zero, this should not affect real hardware use case. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 10:00:22 -07:00
Jacob Keller	34875887f3	fm10k: fix incorrect index calculation in fm10k_write_reta The index calculated when looping through the indir array passed to fm10k_write_reta was incorrectly calculated as the first part i needs to be multiplied by 4. Fixes: 0cfea7a65738 ("fm10k: fix possible null pointer deref after kcalloc", 2016-04-13) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 09:53:36 -07:00
Alexander Duyck	fb5677aa26	fm10k: Align Rx buffers to 512B blocks While reviewing the i40e driver changes to support page based receive I realized that I had overlooked the fact that the fm10k hardware required a 512 byte alignment for Rx buffers. This patch is meant to address that by changing the alignment for Rx buffers to 512 bytes instead of allowing it to be L1 cache aligned. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 09:46:17 -07:00
Jacob Keller	124579de46	fm10k: don't use BIT() macro where the value isn't a bitmask The FM10K_MAX_DATA_PER_TXD is really just using a bitshift as a power of 2 operation in an efficient manner. We shouldn't represent this as a BIT() because that obscures the intention of the operation. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 09:38:59 -07:00
Xin Long	b3a3c5176c	ixgbevf: ixgbevf_write/read_posted_mbx should use IXGBE_ERR_MBX to initialize ret_val Now ixgbevf_write/read_posted_mbx use -IXGBE_ERR_MBX as the initial return value, but that is incorrect, because ixgbevf_vlan_rx_add_vid() checks err == IXGBE_ERR_MBX, where err is returned from mac.ops.set_vfta, and ixgbevf_set_vfta_vf returns the value from write/read_posted. So we should initialize err with IXGBE_ERR_MBX instead of -IXGBE_ERR_MBX. With this fix, the other functions that call it also work correctly, because they only care whether err is 0 or not. Signed-off-by: Xin Long <lucien.xin@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 09:18:06 -07:00
Jarod Wilson	838086414b	e1000e: keep Rx/Tx HW_VLAN_CTAG in sync The bit in the e1000 driver that mentions explicitly that the hardware has no support for separate RX/TX VLAN accel toggling rings true for e1000e as well, and thus both NETIF_F_HW_VLAN_CTAG_RX and NETIF_F_HW_VLAN_CTAG_TX need to be kept in sync. Revert a portion of commit `889ad45666` ("e1000e: keep VLAN interfaces functional after rxvlan off") since keeping the bits in sync resolves the original issue. Signed-off-by: Jarod Wilson <jarod@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-29 09:10:17 -07:00
Jarod Wilson	889ad45666	e1000e: keep VLAN interfaces functional after rxvlan off I've got a bug report about an e1000e interface, where a VLAN interface is set up on top of it: $ ip link add link ens1f0 name ens1f0.99 type vlan id 99 $ ip link set ens1f0 up $ ip link set ens1f0.99 up $ ip addr add 192.168.99.92 dev ens1f0.99 At this point, I can ping another host on vlan 99, ip 192.168.99.91. However, if I do the following: $ ethtool -K ens1f0 rxvlan off Then no traffic passes on ens1f0.99. It comes back if I toggle rxvlan on again. I'm not sure if this is actually intended behavior, or if there's a lack of software VLAN stripping fallback, or what, but things continue to work if I simply don't call e1000e_vlan_strip_disable() if there are active VLANs (plagiarizing a function from the e1000 driver here) on the interface. Also slipped a related-ish fix to the kerneldoc text for e1000e_vlan_strip_disable here... Signed-off-by: Jarod Wilson <jarod@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-29 07:39:48 -04:00
Neerav Parikh	85a1aab79c	i40e: Don't notify client(s) for DCB changes on all VSIs When an LLDP/DCBX change happens, the i40e driver code flow tried to notify the client(s) for each of the PF VSIs. This resulted in a kernel panic on the first VSI that didn't have any netdev associated with it. The DCB change notification to the client(s) should be done only once, for the PF/LAN VSI to which the client instances have been added. Also, move the notification call after the PF driver has made changes related to the updated DCB configuration. Signed-off-by: Neerav Parikh <neerav.parikh@intel.com> Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Tested-by: Ronald J Bynoe <ronald.j.bynoe@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 16:22:28 -07:00
Tushar Dave	a70e407f6d	i40e: Fix errors resulted while turning off TSO On systems with 128 CPUs, turning off TSO results in errors, i40e 0000:03:00.0: failed to get tracking for 1 vectors for VSI 400, err=-12 i40e 0000:03:00.0: Couldn't create FDir VSI i40e 0000:03:00.0: i40e_ptp_init: PTP not supported on eth0 i40e 0000:03:00.0: couldn't add VEB, err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_ENOENT i40e 0000:03:00.0: rebuild of switch failed: -1, will try to set up simple PF connection i40e 0000:03:00.0 eth0: adding 00:10:e0:8a:24:b6 vid=0 Enabling FD_SB without checking availability of MSI-X vector is the root cause. This change adds necessary check. Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 16:21:26 -07:00
Bimmy Pujari	0706195802	i40e/i40evf: Bump version from 1.5.16 to 1.6.4 Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 16:14:30 -07:00
Shannon Nelson	2d1de8283f	i40e: add VSI info to macaddr messages Since the macaddr add and delete happens asynchronously, error messages don't easily get associated to the actual request. Here we add a bit of information to the error messages to help determine the source of the error. Change-ID: Id2d6df5287141c3579677d72d8bd21122823d79f Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 16:10:58 -07:00
Mitch Williams	5bc160319f	i40e: set default VSI without a reset Remove the need for a reset when the device enters limited promiscuous mode. This was causing heartburn for people who were using VFs and bridging, since this would require all of the VFs to undergo a reset each time the PF changed its promiscuity. Change-ID: I0a83495c5e4d68112bbc7a7a076d20fa8dd3b61c Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 16:06:50 -07:00
Mitch Williams	63590b6129	i40evf: always activate correct MAC address filter Always add MAC address at the tail of the MAC filter list. Since the device's "real" MAC address is added first, it will always be at the beginning of the list. This prevents an issue where the "real" MAC filter might not get added if too many other filters are added before bringing the interface up. Change-ID: I34a8aeebeb0cb87a44b24118adc4176c7b943c1c Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 16:02:48 -07:00
Catherine Sullivan	7d64402f5a	i40e: Fix RSS to not be limited by the number of CPUs Limiting qcount to pf->num_lan_msix effectively limits the RSS queues to only use the number of CPUs, and ignore all other queues. We don't want to do this. If the user has changed the RSS settings to use more queues than CPUs, we want to trust they know what they are doing and let them. More importantly, if we tell them that is what we did, we want to actually do it and allow traffic into all of the queues we have allocated. This does not change the default setting to initially allocate only the number of CPUs of queue pairs. Change-ID: Ie941a96e806e4bcd016addb4e17affb46770ada5 Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 15:58:36 -07:00
Avinash Dayanand	01a7a9fef4	i40e: Removing unnecessary code which caused supported link mode bug Removing this code which wasn't allowing 100BaseT to show up in the supported link modes for 10GBaseT PHYs. Change-ID: Iada2eafa7ef6b4bac9a2a1380ff533ae5de51e1d Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 15:56:37 -07:00
Serey Kong	6536227d1d	i40e: fix missing DA cable check When a Direct Attach (DA) cable is used, if the i40e_set_settings function is called it would return an error. Add the DA type so the function won't fail. Change-ID: I2b802f27a5d91cfefa72fd1f852acb4d74647a8e Signed-off-by: Serey Kong <serey.kong@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 15:51:54 -07:00
Greg Rose	059ff69b5f	i40e: Save PCI state before suspend The i40e_suspend() function was failing to save PCI state and this would result in a kernel stack trace from a WARN_ONCE in the pci_legacy_suspend() function. Add a call to pci_save_state() to fix that problem. Change-ID: I4736e62bb660966bd208cc8af617a14cb07fc4bd Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 15:43:39 -07:00
Greg Rose	b33d3b7321	i40e: Clean up MSIX IRQs before suspend The i40e_suspend() function calls another function that preps the device for the power save and resume by freeing all the Tx/Rx resources and interrupts but that function does not free the "other" causes interrupt vector and IRQ. It also fails to call synchronize_irq() before freeing the IRQ vectors. This sometimes may result in some AER errors on those systems with that PCIe error reporting feature enabled. Call synchronize_irq() before freeing IRQ vectors and explicitly free the other causes interrupt resources and shut down that MSIX interrupt. Change-ID: Ib88e4536756518a352446da0232189716618ad81 Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 15:37:16 -07:00
Mitch Williams	0e8d95f896	i40evf: don't overflow buffer If the user adds an obscene amount of MAC addresses, the driver will run into the situation where it has too many address requests to fit into a single PF message. The driver checks for this case, and calculates the maximum number of messages that it can send. Then it completely ignores this count and overflows the buffer. Fix this by checking the address count and bailing out of the loop at the appropriate time. Change-ID: If8dcbb04602c75941dc0cd8309065e1de9ca791c Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 15:31:50 -07:00
Catherine Sullivan	f980d445e5	i40e: Add a call to set the client interface down We were failing to set the client interface down when we put the VSI down. Add this call so that the client doesn't get an open called with no close. Also remove an un-needed delay. The VF should not be affected at all by i40e_down. Change-ID: I1135dffef534bf84e6fed57cf51bcf590e6cfaf7 Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 15:25:36 -07:00
Mitch Williams	bb36071721	i40e: write HENA for VFs Now that VF RSS is configured by the PF driver, it needs to set the RSS Hash Enable registers by default. Without this, no packets will be hashed and they'll all end up on queue 0. Change-ID: I38e425f40ddb81e3b19a951cfbb939fa5b1123f1 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 15:19:40 -07:00
Mitch Williams	3e25a8f31a	i40e: add hw struct local variable This function uses the i40e_hw struct all over the place, so why doesn't it keep a pointer to the struct? Add this pointer as a local variable and use it consistently throughout the function. Change-ID: I10eb688fe40909433fcb8ac7ac891cef67445d72 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 15:15:47 -07:00
Mitch Williams	fb70fabad8	i40e: add functions to control default VSI Add functions to enable and disable default VSI on a VEB. This allows for configuration of limited promiscuous mode specifically for bridging purposes. Change-ID: I0cc5bd68b31c500fdff4d47e1f15d50d2739faf4 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-06-27 15:08:28 -07:00
Johannes Thumshirn	56d766d64c	ethernet/intel: Use pci_(request\|release)_mem_regions Now that we do have pci_request_mem_regions() and pci_release_mem_regions() at hand, use it in the Intel ethernet drivers. Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: David S. Miller <davem@davemloft.net>	2016-06-23 11:48:58 -05:00
Alexander Duyck	b3a49557d5	ixgbe: Replace ndo_add/del_vxlan_port with ndo_add/del_udp_enc_port This change replaces the network device operations for adding or removing a VXLAN port with operations that are more generically defined to be used for any UDP offload port but provide a type. As such by just adding a line to verify that the offload type is VXLAN we can maintain the same functionality. In addition I updated the socket address family check so that instead of excluding IPv6 we instead abort if the type is not IPv4. This makes much more sense as we should only be supporting IPv4 outer addresses on this hardware. The last change is that I pulled the rtnl_lock/unlock into the conditional statement for IXGBE_FLAG2_VXLAN_REREG_NEEDED. The motivation behind this is to avoid unneeded bouncing of the mutex which will just slow down the handling of this call anyway. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-17 20:23:30 -07:00
Alexander Duyck	06a5f7f167	i40e: Move all UDP port notifiers to single function This patch goes through and combines the notifiers for VXLAN and GENEVE into a single function for each action. So there is now one combined function for getting ports, one for adding the ports, and one for deleting the ports. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-17 20:23:30 -07:00
Alexander Duyck	f174cdbe5b	fm10k: Replace ndo_add/del_vxlan_port with ndo_add/del_udp_enc_port This change replaces the network device operations for adding or removing a VXLAN port with operations that are more generically defined to be used for any UDP offload port but provide a type. As such by just adding a line to verify that the offload type is VXLAN we can maintain the same functionality. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-17 20:23:30 -07:00
Alexander Duyck	bf2d1df395	intel: Add support for IPv6 IP-in-IP offload This patch adds support for offloading IPXIP6 type packets that represent either IPv4 or IPv6 encapsulated inside of an IPv6 outer IP header. In addition with this change we should also be able to support FOU encapsulated traffic with outer IPv6 headers. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-20 19:25:52 -04:00
Tom Herbert	7e13318daa	net: define gso types for IPx over IPv4 and IPv6 This patch defines two new GSO definitions SKB_GSO_IPXIP4 and SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and NETIF_F_GSO_IPXIP6. These are used to describe IP in IP tunnel and what the outer protocol is. The inner protocol can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT are removed (these are both instances of SKB_GSO_IPXIP4). SKB_GSO_IPXIP6 will be used when support for GSO with IP encapsulation over IPv6 is added. Signed-off-by: Tom Herbert <tom@herbertland.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-20 18:03:15 -04:00
Alexander Duyck	5eee87cd51	ixgbe: Fix VLAN features error It looks like at some point I somehow transposed the location of setting the VLAN features in netdev->features and the configuration of the vlan_features. As a result the driver is now generating a warning about vlan_features being setup incorrectly. This patch corrects that by placing the update of netdev->features to include the VLAN features so that it is after the point where we write netdev->features into netdev->vlan_features. Fixes: `b83e30104b` ("ixgbe/ixgbevf: Add support for GSO partial") Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-16 16:56:38 -07:00
Emil Tantilov	11f2b494bc	ixgbe: use correct mask when enabling sriov Swap the parameters in GENMASK in order to generate the correct mask. This change fixes Tx hangs when enabling SRIOV. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-16 16:49:36 -07:00
Dan Carpenter	1c306f7f62	i40e: fix an uninitialized variable bug We removed this initialization but it is required. Let's put it back. Fixes: `895106a577` ('i40e: trivial fixes') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-14 00:21:51 -07:00
Bimmy Pujari	c74dff1aaa	i40e: Bump version from 1.5.10 to 1.5.16 Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-14 00:17:19 -07:00
Mitch Williams	d96a83def2	i40e: don't add broadcast filter for VFs Now that all VSIs are configured to receive broadcasts as default, we don't need to add a filter. This eliminates an annoying but harmless error message each time VFs are created or reset. Change-ID: I4cd6339684df45b0d2722133eeb84c14fa93ea19 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-14 00:11:48 -07:00
Mitch Williams	a876c3ba59	i40e/i40evf: properly report Rx packet hash This logic is inverted. If the RXHASH flag is set, then we should go ahead and call skb_set_hash. Change-ID: Ib2e30356dced1d3e939c8061ab6ad5bd94197e7c Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-14 00:07:40 -07:00
Ashish Shah	4b28cdba48	i40e: set context to use VSI RSS LUT for SR-IOV For the SR-IOV VSIs, when the queue filtering section is valid, the RSS LUT needs to be set to use the VSI specific lookup table (otherwise it will use the PF RSS LUT table). Change-ID: Ia9377cc818078238a75c3bdeade1b593a91b3480 Signed-off-by: Ashish Shah <ashish.n.shah@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-14 00:00:24 -07:00
Akeem G Abodunrin	73df8c9e3e	i40e: Correct UDP packet header for non_tunnel-ipv6 This patch corrects Rx ptype payload layer for non_tunneled ipv6. It should be layer 4 for UDP, instead of layer 3. Change-ID: I9382e4458ab3c4e58f6d2e9f195d5d4ee513805e Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 23:56:13 -07:00
Jacob Keller	c420815d12	i40e: change Rx hang message into a WARN_ONCE Use WARN_ONCE in order to highlight the issue, but don't display a warning every time. The user should be able to see the ethtool counter we created if necessary to see how often it is occurring. Change-ID: I40c4ea159819b64a7d33b7f5716749089791533a Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 23:44:59 -07:00
Catherine Sullivan	06566e5dd4	i40e: Refactor ethtool get_settings Previously we were only looking at the FW supported PHY types if link was down, because we want to be more specific when link is up. This refactor changes this. When link is down, we still rely on the FW supported PHY types, but when link is up, we select the possible supported link modes from what we know about the current PHY type, and AND that with the FW supported PHY types. Change-ID: Ice5dad83f2a17932b0b8b59f07439696ad6aa013 Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 23:32:15 -07:00
Mitch Williams	eee4172abc	i40e: lie to the VF If an untrusted VF attempts to configure promiscuous mode, log a message pointing out its naughty behavior. But then, instead of returning an error to the offender, just lie to it and say everything's OK. It will continue on its way, thinking it's in promiscuous mode, but receiving no packets except its own. Change-ID: I63369215b1720f3c531eedfc06af86ff8c0e3dc8 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 23:23:19 -07:00
Anjali Singhai Jain	b556989230	i40e: Add vf-true-promisc-support priv flag This patch adds priv-flag knob to configure global true promisc support. With this patch the user can decide the flavor of promiscuous that the VFs will see when promiscuous mode is enabled on the interface. Since this is a global setting for the whole device, the priv-flag is exposed only on the first PF of the device. The default is true promisc support is off, which means the promisc mode for the VF will be limited/defport mode. For the PF, we still will be in limited promisc unless in MFP mode irrespective of the flavor picked through this knob. Usage: On PF0 ethtool --show-priv-flags p261p1 Private flags for p261p1: MFP : off LinkPolling : off flow-director-atr : on veb-stats : off hw-atr-eviction : off vf-true-promisc-support: off to enable setting true promisc ethtool --set-priv-flags p261p1 vf-true-promisc-support on At this point if the VF is set to trust and promisc is enabled on the VF through ip link set ... promisc on The VF/VFs will be able to see ALL ingress traffic Change-Id: I8fac4b6eb1af9ca77b5376b79c50bdce5055bd94 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 22:48:46 -07:00
Shannon Nelson	f3d5849756	i40e: Implement the API function for aq_set_switch_config Add the support code for calling the AdminQ API call aq_set_switch_config Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 22:37:02 -07:00
Anjali Singhai Jain	f42a5c74da	i40e: Add allmulti support for the VF This patch enables a feature to enable/disable all multicast for a trusted VF. Change-Id: I926eba7f8850c8d40f8ad7e08bbe4056bbd3985f Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 22:31:42 -07:00
Kevin Scott	06c0e39bbe	i40e: Add support for disabling all link and change bits needed for PHY interactions Add flag to tell firmware to disable link on all ports. This patch changes the bits set for telling firmware the PHY needs to be modified by driver. Without this patch, the setting will only set that mode for the current port on the device. Because the MDIO interface is common for the copper device, the command needs to set the mode for all ports. Change-ID: I8baa7da91d384291ac95b41ae1a516604f8eb67f Signed-off-by: Kevin Scott <kevin.c.scott@intel.com> Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 21:36:59 -07:00
Jacob Keller	aa524b66c5	e1000e: don't modify SYSTIM registers during SIOCSHWTSTAMP ioctl The e1000e_config_hwtstamp function was incorrectly resetting the SYSTIM registers every time the ioctl was being run. If you happened to be running ptp4l and lost the PTP connect (removing cable, or blocking the UDP traffic for example), then ptp4l will eventually perform a restart which involves re-requesting timestamp settings. In e1000e this has the unfortunate and incorrect result of resetting SYSTIME to the kernel time. Since kernel time is usually in UTC, and PTP time is in TAI, this results in the leap second being re-applied. Fix this by extracting the SYSTIME reset out into its own function, e1000e_ptp_reset, which we call during reset to restore the hardware registers. This function will (a) restart the timecounter based on the new system time, (b) restore the previous PPB setting, and (c) restore the previous hwtstamp settings. In order to perform (b), I had to modify the adjfreq ptp function pointer to store the old delta each time it is called. This also has the side effect of restoring the correct base timinca register correctly. The driver does not need to explicitly zero the ptp_delta variable since the entire adapter structure comes zero-initialized. Reported-by: Brian Walsh <brian@walsh.ws> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Brian Walsh <brian@walsh.ws> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 15:30:44 -07:00
Alexander Duyck	e10715d3e9	igb/igbvf: Add support for GSO partial This patch adds support for partial GSO segmentation in the case of tunnels. Specifically with this change the driver can perform segmentation as long as the frame either has IPv6 inner headers, or we are allowed to mangle the IP IDs on the inner header. This is needed because we will not be modifying any fields from the start of the outer transport header to the start of the inner transport header as we are treating them like they are just a block of IP options. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 15:26:37 -07:00
Jacob Keller	942c711206	e1000e: mark shifted values as unsigned The E1000_ICH_NVM_SIG_MASK value is shifted out to the 31st bit, which is the sign bit for signed constants. Mark these values as unsigned to prevent compiler warnings and issues on platforms which have a different sign bit implementation. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 15:19:05 -07:00
Jacob Keller	18dd239207	e1000e: use BIT() macro for bit defines This prevents signed bitshift issues when the shift would overwrite the signed bit, and prevents making this mistake in the future when copying and modifying code. Use GENMASK or the unsigned postfix for cases which aren't suitable for BIT() macro. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 15:15:36 -07:00
Jacob Keller	0ed2dbf4f4	igbvf: use BIT() macro instead of shifts To prevent signed bitshift issues, and improve code readability, use the BIT() macro. Also make use of GENMASK or the unsigned postfix where this is more appropriate than BIT() Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 15:12:03 -07:00
Jacob Keller	12b28b4108	igbvf: remove unused variable and dead code The variable rdlen is set but never used, and thus setting it is dead code. Remove it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 15:06:33 -07:00
Nathan Sullivan	3f544d2a4d	igb: adjust PTP timestamps for Tx/Rx latency Table 7-62 on page 338 of the i210 datasheet lists TX and RX latencies for the various speeds the chip supports. To give better PTP timestamp accuracy, adjust the timestamps by the amounts Intel gives based on current link speed. Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 15:02:08 -07:00
Denys Vlasenko	ab507c9a54	e1000e: e1000e_cyclecounter_read(): do overflow check only if needed SYSTIMH:SYSTIML registers are incremented by 24-bit value TIMINCA[23..0]. er32(SYSTIML) reads are probably moderately expensive (they are pci bus reads). Can we avoid one of them? Yes, we can. If the SYSTIML value we see is smaller than 0xff000000, the overflow into SYSTIMH would require at least two increments. We do two reads, er32(SYSTIML) and er32(SYSTIMH), in this order. Even if one increment happens between them, the overflow into SYSTIMH is impossible, and we can avoid doing another er32(SYSTIML) read and overflow check. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 14:56:35 -07:00
Denys Vlasenko	a07fd74d5e	e1000e: e1000e_cyclecounter_read(): fix er32(SYSTIML) overflow check If two consecutive reads of the counter are the same, it is also not an overflow. "systimel_1 < systimel_2" should be "systimel_1 <= systimel_2". Before the patch, we could perform an erroneous correction: Let's say that systimel_1 == systimel_2 == 0xffffffff. "systimel_1 < systimel_2" is false, we think it's an overflow, we read "systimeh = er32(SYSTIMH)" which meanwhile had incremented, and use "(systimeh << 32) + systimel_2" value which is 2^32 too large. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> CC: intel-wired-lan@lists.osuosl.org Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 14:52:31 -07:00
Denys Vlasenko	fb5277f2c2	e1000e: e1000e_cyclecounter_read(): incvalue is 32 bits, not 64 "incvalue" variable holds a result of "er32(TIMINCA) & E1000_TIMINCA_INCVALUE_MASK" and used in "do_div(temp, incvalue)" as a divisor. Thus, "u64 incvalue" declaration is probably a mistake. Even though it seems to be a harmless one, let's fix it. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 14:46:45 -07:00
Jacob Keller	8008f68cb8	igb: make igb_update_pf_vlvf static Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 14:39:59 -07:00
Jacob Keller	a51d8c217b	igb: use BIT() macro or unsigned prefix For bitshifts, we should make use of the BIT macro when possible, and ensure that other bitshifts are marked as unsigned. This helps prevent signed bitshift errors, and ensures similar style. Make use of GENMASK and the unsigned postfix where BIT() isn't appropriate. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 14:39:47 -07:00
Brian Walsh	847042a6a5	e1000e: Cleanup consistency in ret_val variable usage Fixed the file to use a consistent ret_val for return value checking. Signed-off-by: Brian Walsh <brian@walsh.ws> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 14:30:40 -07:00
Steve Shih	e11f303e3d	e1000e: fix ethtool autoneg off for non-copper This patch fixes the issues for disabling auto-negotiation and forcing speed and duplex settings for the non-copper media. For non-copper media, e1000_get_settings should return ETH_TP_MDI_INVALID for eth_tp_mdix_ctrl instead of ETH_TP_MDI_AUTO so subsequent e1000_set_settings call would not fail with -EOPNOTSUPP. e1000_set_spd_dplx should not automatically turn autoneg back on for forced 1000 Mbps full duplex settings for non-copper media. Cc: xe-kernel@external.cisco.com Cc: Daniel Walker <dwalker@fifo99.com> Signed-off-by: Steve Shih <sshih@cisco.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-13 14:23:37 -07:00
Julia Lawall	3949c4ac8c	i40e: constify i40e_client_ops structure The i40e_client_ops structure is never modified, so declare it as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 23:32:59 -07:00
Arnd Bergmann	ce927db487	i40e: fix misleading indentation Newly added code in i40e_vc_config_promiscuous_mode_msg() is indented in a way that gcc rightly complains about: drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c: In function 'i40e_vc_config_promiscuous_mode_msg': drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:1543:4: error: this 'if' clause does not guard... [-Werror=misleading-indentation] if (f->vlan >= 0 && f->vlan <= I40E_MAX_VLANID) ^~ drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:1550:5: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the 'if' aq_err = pf->hw.aq.asq_last_status; From the context, it looks like the aq_err assignment was meant to be inside of the conditional expression, so I'm adding the appropriate curly braces now. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: `5676a8b9cd` ("i40e: Add VF promiscuous mode driver support") Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 23:25:34 -07:00
Jesse Brandeburg	147e81ec75	i40e: Test memory before ethtool alloc succeeds When testing on systems with very limited amounts of RAM, a bug was found where, while changing the number of descriptors using ethtool, the driver didn't test the limits of system memory before permanently assuming it would be able to get receive buffer memory. Work around this issue by pre-allocation of the receive buffer memory, in the "ghost" ring, which is then used during reinit using the new ring length. Change-Id: I92d7a5fb59a6c884b2efdd1ec652845f101c3359 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 23:17:07 -07:00
Mitch Williams	b163098ea1	i40evf: Allocate Rx buffers properly Allocate the correct number of RX buffers, and don't fiddle with next_to_use. The common RX code handles all of this. This fixes a memory leak of one page each time the driver is opened. Change-Id: Id06eca353086e084921f047acad28c14745684ee Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 23:07:30 -07:00
Jesse Brandeburg	bec60fc42b	i40e/i40evf: Remove unused hardware receive descriptor code The hardware supports a 16 byte descriptor for receive, but the driver was never using it in production. There was no performance benefit to the real driver of 16 byte descriptors, so drop a whole lot of complexity while getting rid of the code. Also since the previous patch made us use no-split mode all the time, drop any support in the driver for any other value in dtype and assume it is always zero (aka no-split). Hooray for code removal! Change-ID: I2257e902e4dad84a07b94db6d2e6f4ce69b27bc0 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 22:59:54 -07:00
Jesse Brandeburg	ab9ad98eb5	i40evf: refactor receive routine This is part 2 of the Rx refactor series, just including changes to i40evf. This refactor aligns the receive routine with the one in ixgbe which was highly optimized. This reduces the code we have to maintain and allows for (hopefully) more readable and maintainable RX hot path. In order to do this: - consolidate the receive path into a single function that doesn't use packet split but does use pages for Rx buffers. - remove the old _1buf routine - consolidate several routines into helper functions - remove VF ethtool control over packet split - remove priv_flags interface since it is unused Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 22:42:58 -07:00
Jesse Brandeburg	19b85e677d	i40evf: Drop packet split receive routine As part of preparation for the rx-refactor, remove the packet split receive routine and ancillary code. Some of the split related context set up code stays in i40e_virtchnl_pf.c in case an older VF driver tries to load and still wants to use packet split. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 22:31:23 -07:00
Jesse Brandeburg	1a557afc4d	i40e: Refactor receive routine This is part 1 of the Rx refactor series, just including changes to i40e. This refactor aligns the receive routine with the one in ixgbe which was highly optimized. This reduces the code we have to maintain and allows for (hopefully) more readable and maintainable RX hot path. In order to do this: - consolidate the receive path into a single function that doesn't use packet split but does use pages for Rx buffers. - remove the old _1buf routine - consolidate several routines into helper functions - remove ethtool control over packet split Change-ID: I5ca100721de65992aa0114f8b4bac844b84758e0 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 21:53:16 -07:00
Jesse Brandeburg	04b3b77981	i40e/i40evf: Remove reference to ring->dtype As part of the rx-refactor, the dtype variable in the i40e_ring struct is no longer used, so remove it. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 18:59:23 -07:00
Jesse Brandeburg	b32bfa1724	i40e: Drop packet split receive routine As part of preparation for the rx-refactor, remove the packet split receive routine and ancillary code. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 18:52:06 -07:00
Jesse Brandeburg	f8a952cb40	i40e/i40evf: Refactor tunnel interpretation Refactor the interpretation of a tunnel. This removes some code and lets us start using the hardware's parsing. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-05-05 18:24:10 -07:00
David S. Miller	aa8a8b05ad	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2016-05-04 This series contains updates to ixgbe, ixgbevf and traffic class helpers. Sridhar adds helper functions to the tc_mirred header to access tcf_mirred information and then implements them for ixgbe to enable redirection to a SRIOV VF or an offloaded MACVLAN device queue via tc 'mirred' action. Amritha adds support to set filters with multiple header fields (L3,L4) to match on. KY Srinivasan from Microsoft adds Hyper-V support into ixgbevf. Emil adds 82599 sub-device IDs that were missing from the list of parts that support WoL. Then simplified the logic we use to determine WoL support by reading the EEPROM bits for MACs X540 and newer. Preethi cleaned up duplicate and unused device IDs. Fixed our ethtool stat reporting where we were ignoring higher 32 bits of stats registers, so fill out 64 bit stat values into two 32 bit words. Babu Moger from Oracle improves VF performance issues on SPARC. Alex Duyck cleans up some of the Hyper-V implementation from KY so that we can just use function pointers instead of having to identify if a given VF is running on a Linux or Windows PF. Usha makes sure that DCB and FCoE is disabled for X550EM_x/a MACs and cleans up the DCB initialization in the process. Tony cleans up the API for ixgbevf_update_xcast_mode() so we do not have to pass in the netdev parameter, since it was never used in the function. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-04 17:13:34 -04:00
Florian Westphal	9b36627ace	net: remove dev->trans_start previous patches removed all direct accesses to dev->trans_start, so change the netif_trans_update helper to update trans_start of netdev queue 0 instead and then remove trans_start from struct net_device. AFAICS a lot of the netif_trans_update() invocations are now useless because they occur in ndo_start_xmit and driver doesn't set LLTX (i.e. stack already took care of the update). As I can't test any of them it seems better to just leave them alone. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-04 14:16:50 -04:00
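
The two e1000e_cyclecounter_read() commits above (ab507c9a54 and a07fd74d5e) describe a read sequence that may be easier to follow as code. The following is only a minimal user-space sketch of that logic, not the driver source: `struct fake_systim`, `read_systiml()`, `read_systimh()` and `read_systim()` are hypothetical names invented for illustration, and the fake counter stands in for the SYSTIML/SYSTIMH registers that the driver reads with er32(). The 0xff000000 threshold is the one quoted in the commit message.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hardware counter (not a driver structure). */
struct fake_systim {
	uint64_t now;   /* current 64-bit SYSTIMH:SYSTIML value */
	uint32_t step;  /* how far the counter advances after each register read */
};

/* Simulated read of the low 32 bits; the counter keeps running afterwards. */
static uint32_t read_systiml(struct fake_systim *t)
{
	uint32_t v = (uint32_t)t->now;

	t->now += t->step;
	return v;
}

/* Simulated read of the high 32 bits; the counter keeps running afterwards. */
static uint32_t read_systimh(struct fake_systim *t)
{
	uint32_t v = (uint32_t)(t->now >> 32);

	t->now += t->step;
	return v;
}

/* Read the 64-bit counter from two 32-bit halves without a torn value. */
static uint64_t read_systim(struct fake_systim *t)
{
	uint32_t lo = read_systiml(t);
	uint32_t hi = read_systimh(t);

	/* Only when the low word is close to wrapping can a single
	 * increment have carried into the high word between the reads.
	 */
	if (lo >= 0xff000000u) {
		uint32_t lo2 = read_systiml(t);

		/* Equal reads are not an overflow, hence '>' rather than '>='. */
		if (lo > lo2) {
			/* The low word wrapped: re-read the high word and pair it with lo2. */
			hi = read_systimh(t);
			lo = lo2;
		}
	}
	return ((uint64_t)hi << 32) | lo;
}
```

In the common case the function performs two register reads; the third and fourth reads happen only near a wrap of the low word, which is the saving the first commit describes.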

3759 Commits