Currently adaptive interrupt moderation is set by calculating
and configuring an EQ-delay every second. This is done via
a FW-cmd. But, on Skyhawk-R a "re-arm to interrupt" delay
can be set while ringing the EQ-DB. This patch uses this
facility to calculate and set the interrupt delay every 1ms.
This helps moderate interrupts better when the traffic
is bursty.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com>
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for spoofchk configuration for VFs.
When it is enabled, "spoof checking" is done for both MAC-address and VLAN.
For each VF, the HW ensures that the source MAC address (or VLAN) of
every outgoing packet exists in the MAC-list (or VLAN-list) configured
for RX filtering for that VF. If not, the packet is dropped and an error
is reported to the driver in the TX completion; this is reflected in the
"tx_spoof_check_err" ethtool counter.
This feature is supported in Skyhawk FW version 10.6.31.0 and above.
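
The check described above can be sketched in plain C. This is only an
illustration of the rule, not the HW or driver implementation: struct
vf_filters, spoof_check() and the handling of untagged packets are
assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical per-VF filter state; the real lists live in HW/FW. */
struct vf_filters {
	const uint8_t (*macs)[MAC_LEN];   /* MAC-list configured for RX filtering */
	size_t num_macs;
	const uint16_t *vlans;            /* VLAN-list configured for RX filtering */
	size_t num_vlans;
	unsigned long tx_spoof_check_err; /* mirrors the ethtool counter name */
};

static bool mac_allowed(const struct vf_filters *f, const uint8_t *mac)
{
	for (size_t i = 0; i < f->num_macs; i++)
		if (memcmp(f->macs[i], mac, MAC_LEN) == 0)
			return true;
	return false;
}

static bool vlan_allowed(const struct vf_filters *f, uint16_t vid)
{
	for (size_t i = 0; i < f->num_vlans; i++)
		if (f->vlans[i] == vid)
			return true;
	return false;
}

/*
 * Returns true if the packet may be sent. Otherwise the packet is treated
 * as dropped and the error counter is bumped. Assumption: untagged packets
 * are only checked against the MAC-list.
 */
static bool spoof_check(struct vf_filters *f, const uint8_t *src_mac,
			bool tagged, uint16_t vid)
{
	if (!mac_allowed(f, src_mac) || (tagged && !vlan_allowed(f, vid))) {
		f->tx_spoof_check_err++;
		return false;
	}
	return true;
}
```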
Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the device ID of the VF to the PCI device ID table.
Added a boolean flag is_vf in efx_nic_type to differentiate
between a VF and a PF at probe time. This flag is useful in later
patches when setting the MAC address, especially in the
PCI-passthrough case.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow PFs to allocate shared RSS contexts if we exhaust our
exclusive RSS contexts. Make VFs use shared RSS contexts in
all cases.
Spruce up error handling so that the shadow copy of the RSS
table is updated after successful update, rather than in all
cases, so that we report the actual contents of the RSS table
after a failure to set it, rather than what we'd like it to be.
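
The "update the shadow copy only on success" pattern can be sketched as
follows. The names (struct nic, nic_set_rss_table(), the hw_set_rss_table
hook and the two stub hooks) are hypothetical and stand in for the real
MCDI plumbing.

```c
#include <stdint.h>
#include <string.h>

#define RSS_TABLE_SIZE 128

struct nic {
	uint32_t rss_shadow[RSS_TABLE_SIZE];
	/* Hypothetical hook that pushes the table to firmware; 0 on success. */
	int (*hw_set_rss_table)(const uint32_t *table);
};

/* Stub hooks used only to exercise both outcomes. */
static int hw_ok(const uint32_t *table)   { (void)table; return 0; }
static int hw_fail(const uint32_t *table) { (void)table; return -5; }

static int nic_set_rss_table(struct nic *nic, const uint32_t *table)
{
	int rc = nic->hw_set_rss_table(table);

	if (rc)
		return rc; /* shadow still describes what the hardware holds */

	/* Only now is it safe to report the new contents. */
	memcpy(nic->rss_shadow, table, sizeof(nic->rss_shadow));
	return 0;
}
```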
Populate context_size parameter when vacuously allocating RSS
context of size 1.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Accept EPERM in some simple cases. The following cases are handled:
1) efx_mcdi_read_assertion()
Unprivileged PCI functions aren't allowed to GET_ASSERTS.
We return success as it's up to the primary PF to deal with asserts.
2) efx_mcdi_mon_probe() in efx_ef10_probe()
Unprivileged PCI functions aren't allowed to read sensor info, and
worrying about sensor data is the primary PF's job.
3) phy_op->reconfigure() in efx_init_port() and efx_reset_up()
Unprivileged functions aren't allowed to MC_CMD_SET_LINK; they just have
to accept the settings (including flow-control, which is what
efx_init_port() is worried about) they've been given.
4) Fallback to GET_WORKAROUNDS in efx_ef10_probe()
Unprivileged PCI functions aren't allowed to set workarounds. So if
efx_mcdi_set_workaround() fails EPERM, use efx_mcdi_get_workarounds()
to find out if workaround_35388 is enabled.
5) If DRV_ATTACH gets EPERM, try without specifying fw-variant
Unprivileged PCI functions have to use a FIRMWARE_ID of 0xffffffff
(MC_CMD_FW_DONT_CARE).
6) Don't try to exit_assertion unless one had fired
Previously we called efx_mcdi_exit_assertion even if
efx_mcdi_read_assertion had received MC_CMD_GET_ASSERTS_FLAGS_NO_FAILS.
This is unnecessary, and the resulting MC_CMD_REBOOT, even if the
AFTER_ASSERTION flag made it a no-op, would fail EPERM for unprivileged
PCI functions.
So make efx_mcdi_read_assertion return whether an assert happened, and only
call efx_mcdi_exit_assertion if it has.
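
As an illustration of case 4 above, the "try to set, fall back to get on
EPERM" flow might look like the sketch below. struct mc_ops and the stub
callbacks are hypothetical stand-ins for efx_mcdi_set_workaround() and
efx_mcdi_get_workarounds(); their real signatures are not reproduced here.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two MCDI helpers named in case 4. */
struct mc_ops {
	int (*set_workaround)(bool enable);
	int (*get_workaround)(bool *enabled);
};

/* Stubs modelling an unprivileged function: set is refused, get works. */
static int set_denied(bool enable)    { (void)enable; return -EPERM; }
static int get_enabled(bool *enabled) { *enabled = true; return 0; }

/* Returns 0 and fills *enabled, or a negative errno on a real failure. */
static int probe_workaround(const struct mc_ops *ops, bool *enabled)
{
	int rc = ops->set_workaround(true);

	if (rc == 0) {
		*enabled = true;      /* we were allowed to enable it */
		return 0;
	}
	if (rc == -EPERM)             /* unprivileged: just ask what is set */
		return ops->get_workaround(enabled);
	return rc;                    /* any other error is still an error */
}
```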
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To be able to use MC_CMD_VADAPTOR_SET_MAC, vadaptors must be
manually allocated and freed as automatic vadaptors will disappear
when their reference_count reaches zero, which must happen before
the MAC address is changed.
Vadaptors are allocated and freed in the vswitching_probe/remove
functions for PFs and VFs, and this means that vadaptors are restored
correctly following an MC reboot or other reset when required.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The parent PF creates vports for all its child VFs and adds MAC
addresses to these. When the VF driver loads, it can make an MCDI
call to get the MAC address that the parent PF assigned it.
The parent PF also assigns a MAC address to its own vport because
implicit creation of a vAdaptor will only work on evb ports with
MAC addresses assigned.
The vport MAC address needs to be stored in the PF's nic_data
struct as it can later be changed on the vadaptor (and its net_dev
struct). When removing a vport the original MAC address must be
deleted.
A new flag is needed in the VF data structure to identify whether
a vport has been assigned to the VF. This is to determine whether
it needs to be un-assigned before freeing the vport. Also,
attempting to un-assign a vport which is not assigned will result
in an EALREADY error.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added efx_nic_type structure for VF.
Mapped a different BAR for VF as it uses BAR 0 for memory.
Added functions sriov_init and sriov_fini.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use MC_CMD_GET_FUNCTION_INFO to record the PF number in nic_data.
This will be needed when assigning vports to VFs.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds functions to allocate and free vswitches and vports; vadaptors
are automatically allocated and freed when TX/RX queues are
initialised and finalised. This vswitching structure is only created
if the firmware supports it, so a check that full-featured firmware
is running is performed first.
If the MC resets, the vswitching infrastructure will need to be
recreated, so mark the "must_probe_vswitching" flag when an MC reboot
is detected.
Don't try to create a vswitch if vf-count=0.
This allocation of vswitches and vports does not currently support
configuring VLAN tags, but that can be added in a future change.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The default port ID of EVB_PORT_ID_ASSIGNED is a "magic" number
for the MCFW to select the physical port of the PF. If other
vswitches and vports are created on top of the default firmware
configuration, the ID of the newly created vport is then required
when passed to MCDI commands. Currently, this doesn't happen so
the vport_id is never changed, but a subsequent patch will change
this behaviour so that other vswitches and vports are created.
The vport_id recorded in nic_data is only relevant for PFs.
VFs will have their vports created by their parent PF, and in
that case the parent PF will record the vport ID of each VF.
For a VF, nic_data->vport_id is expected to remain at the default
value.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The (future) code to add/remove vswitches and vports will be
dependent on the firmware variant.
To simplify the checking of the firmware variant, record
values for rx_dpcpu_fw_id and tx_dpcpu_fw_id in EF10 nic_data.
There was only one place where this was previously used:
efx_mcdi_print_fwver() in ethtool.c.
The MC_CMD_GET_CAPABILITIES can be replaced and the values from
nic_data used instead.
Note that the printing of "?" if the MC command fails or if the
outlength is incorrect no longer applies, because errors are returned
in efx_ef10_init_datapath_caps() in both of these cases.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TX_DOMAIN field is currently reserved but it's safer to set
it to 0 for future compatibility.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for the use of sriov_configure on EF10
to enable Virtual Functions while the driver is loaded.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The efx_vf struct contains Siena-specific fields for VFs,
so rename to siena_vf.
Also move it into the siena_nic_data struct, as EF10 will
track its VFs in its own ef10_nic_data, storing much less
information about them since VFDI is no longer used.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By putting all the efx_{siena,ef10}_sriov_* declarations in
{siena,ef10}_sriov.h, ensure they cannot be called from nic-generic code.
Also fixes up an instance of this, where mcdi.c was calling
efx_siena_sriov_flr.
The single instance of netdev_ops should call general high level
functions that can then call something adapter specific in efx_nic_type.
We should only do adapter specialisation via efx_nic_type.
Removal of sriov functionality from the Falcon code means that tests
are needed for the presence of some callbacks.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Synchronize names with other drivers.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of_property_* calls
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_* calls
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Synchronize names with other drivers
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a rule for comment blocks in network drivers
which is newly checked by the checkpatch.pl script.
Let's fix it.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removed checkpatch.pl errors and warnings.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds proper checks to handle the PHY-less case.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current implementation, jumbo frames are supported only
for frame sizes > 16K. This patch corrects this logic to
handle jumbo frames for smaller frame sizes (< 16K), ensuring that the
jumbo frame MTU is within the limit of the max frame size configured in
the h/w design.
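
A minimal sketch of such a limit check is shown below. The constant names
and the check_mtu() helper are assumptions for illustration; only the idea
(MTU plus framing overhead must fit in the configured max frame size) comes
from the description above.

```c
#include <errno.h>

#define HDR_VLAN_SIZE 18u /* Ethernet header (14) + VLAN tag (4) */
#define TRL_SIZE       4u /* frame check sequence */

/*
 * Returns 0 if a frame carrying new_mtu bytes of payload fits in the
 * max frame size configured in the hardware design, -EINVAL otherwise.
 */
static int check_mtu(unsigned int new_mtu, unsigned int max_frame_size)
{
	if (new_mtu + HDR_VLAN_SIZE + TRL_SIZE > max_frame_size)
		return -EINVAL;
	return 0;
}
```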
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The packet completion interrupts for TX and RX should be serviced before
the packets are consumed. This ensures against the degenerate case when a
new completion interrupt is raised after the handler has exited but before
the interrupts are cleared. In this case it's possible for the ISR to clear
an unhandled interrupt (leading to potential deadlock).
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Tested-by: Jason Wu <huanyu@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The AXI-DMA rx-delay interrupt can sometimes be triggered
when there are 0 outstanding packets received. This is due
to the fact that the receive function will greedily consume
as many packets as possible on interrupt. So if two packets
(with a very particular timing) arrive in succession they
will each cause the rx-delay interrupt, but the first interrupt
will consume both packets.
This means the second interrupt is a 0 packet receive.
This is mostly OK, except that the tail pointer register is
updated unconditionally on receive. Currently the tail pointer
is always set to the current bd-ring descriptor under
the assumption that the hardware has moved onto the next
descriptor. What this means for length 0 recv is the current
descriptor that the hardware is potentially yet to use will
be marked as the tail. This causes the hardware to think
it has run out of descriptors, deadlocking the whole rx path.
Fixed by updating the tail pointer to the most recent
successfully consumed descriptor.
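
The fix can be illustrated with a small software model of the ring. The
struct rx_bd_ring layout and rx_poll() are hypothetical; the point is only
that the tail is written when, and where, a descriptor was actually
consumed.

```c
#include <stdbool.h>

#define RING_SIZE 8

/* Toy model of the bd-ring; "completed" is what the hardware would set. */
struct rx_bd_ring {
	bool completed[RING_SIZE];
	unsigned int cur;         /* next descriptor software will inspect */
	unsigned int tail;        /* last value written to the tail register */
	unsigned int tail_writes; /* how often the tail register was written */
};

/* Greedily consume completed descriptors; returns how many were taken. */
static unsigned int rx_poll(struct rx_bd_ring *r)
{
	unsigned int consumed = 0;
	unsigned int last = 0;

	while (r->completed[r->cur]) {
		r->completed[r->cur] = false; /* hand it back to hardware */
		last = r->cur;
		r->cur = (r->cur + 1) % RING_SIZE;
		consumed++;
	}

	/* A 0-packet receive must leave the tail pointer alone. */
	if (consumed) {
		r->tail = last;
		r->tail_writes++;
	}
	return consumed;
}
```

With two packets pending, the first call consumes both and sets the tail to
the second descriptor; the second (spurious) call consumes nothing and does
not touch the tail.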
Reported-by: Wendy Liang <wendy.liang@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Tested-by: Jason Wu <huanyu@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for the RGMII. The h/w configuration
parameter C_PHY_TYPE, which represents the interface configured in
the design, is used to differentiate various interfaces supported
by AXI Ethernet.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pktgen sends raw UDP packets and bypasses most of the
Linux networking stack. A user can specify different packet sizes.
Hence we need to discard the packet if its length is greater than the MTU.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the device node to ULD info. Use the node info when calling
alloc_ring() for ctrl TX queues.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Passes a Congestion Channel Map to t4_sge_alloc_rxq()
for the Ethernet RX Queues based on the MPS Buffer Group Map
of the TX Channel rather than just the TX Channel Map.
Also, in t4_sge_alloc_rxq() for T5, setting up the
Congestion Manager values of the new RX Ethernet Queue is
done by firmware now.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also changed the name of t4_hw.c:get_mps_bg_map() to t4_get_mps_bg_map()
and made it an exported routine with a definition in cxgb4.h.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to make sure that the Free List Size, in pointers, is at
least 2 Egress Queue Units (8 pointers/each) larger than the SGE's Egress
Congestion Threshold (in pointers).
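
The sizing rule works out as in the sketch below. The helper names are
hypothetical, and rounding the result up to a whole number of Egress Queue
Units is an assumption of this example rather than something stated above.

```c
#define FL_PTRS_PER_EQ_UNIT 8u

/* Smallest Free List size (in pointers) allowed for a given threshold. */
static unsigned int fl_min_size(unsigned int cong_thres_ptrs)
{
	return cong_thres_ptrs + 2 * FL_PTRS_PER_EQ_UNIT;
}

/* Grow a requested size to the minimum, then round up to EQ units. */
static unsigned int fl_adjust_size(unsigned int requested,
				   unsigned int cong_thres_ptrs)
{
	unsigned int min = fl_min_size(cong_thres_ptrs);

	if (requested < min)
		requested = min;

	return (requested + FL_PTRS_PER_EQ_UNIT - 1) /
	       FL_PTRS_PER_EQ_UNIT * FL_PTRS_PER_EQ_UNIT;
}
```

For example, with a threshold of 100 pointers the minimum is 116 pointers,
which this sketch rounds up to 120.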
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because error codes are negative, it only makes sense to
consistently use signed types when handling them. Also remove
some explicit comparisons with 0 on these variables.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The global semaphore bits should be released in the reverse of the
order that they were taken, so correct that.
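
The ordering rule is the usual last-taken, first-released discipline. A toy
model, with hypothetical names and no real register access, could track the
held bits on a small stack:

```c
#include <stddef.h>

#define MAX_HELD 8

/* Records which semaphore bits are held, in acquisition order. */
struct sem_stack {
	int held[MAX_HELD];
	size_t depth;
};

/* Note that a bit has been taken; returns -1 if the stack is full. */
static int sem_take(struct sem_stack *s, int bit)
{
	if (s->depth == MAX_HELD)
		return -1;
	s->held[s->depth++] = bit;
	return 0;
}

/* Release the most recently taken bit; returns it, or -1 if none held. */
static int sem_release_last(struct sem_stack *s)
{
	if (s->depth == 0)
		return -1;
	return s->held[--s->depth];
}
```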
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
IOSF is the Intel On-chip System Fabric used in SOCs. IOSF SB is
the IOSF SideBand message interface. This patch serializes IOSF SB
access using both phy bits in the SWFW_SEMAPHORE register. It also
adds a helper function to wait for IOSF SB accesses to complete.
Use the new function to perform this wait before each access, as
specified in the datasheet, in addition to using it to wait for
IOSF SB read/write completion.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We were using s64 for lat_ns (latency nano-second value) since our
calculations could produce a negative value. For negative
values, we then assign lat_ns to be zero, so the value passed to
do_div() was never negative, but do_div() expects the argument type
to be u64, so do a cast to resolve a compile warning seen on
PowerPC.
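
The reasoning can be shown in userspace C, where plain 64-bit division
stands in for the kernel's do_div(). The function name and parameters are
hypothetical; the sketch only demonstrates that the cast is safe because
negative values have already been clamped to zero.

```c
#include <stdint.h>

/*
 * Clamp a possibly negative latency to zero, then divide it as an
 * unsigned 64-bit value, which is the operand type do_div() expects.
 */
static uint64_t latency_units(int64_t lat_ns, uint32_t ns_per_unit)
{
	uint64_t val;

	if (lat_ns < 0)
		lat_ns = 0;
	if (ns_per_unit == 0)
		return 0;             /* avoid dividing by zero in the sketch */

	val = (uint64_t)lat_ns;       /* safe: value is known non-negative */
	return val / ns_per_unit;
}
```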
CC: Yanjiang Jin <yanjiang.jin@windriver.com>
CC: Yanir Lubetkin <yanirx.lubetkin@intel.com>
Reported-by: Yanjiang Jin <yanjiang.jin@windriver.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
The driver wasn't allowing jumbo frames to be
enabled when CRC stripping was disabled, however it was allowing CRC
stripping to be disabled while jumbo frames were enabled. This fixes that by
making it so that the NETIF_F_RXFCS flag cannot be set when jumbo frames are
enabled on 82579 and newer parts.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the VLAN_HLEN was added to the calculation for the maximum frame size
there seem to have been a number of issues introduced into the driver.
The first issue is that in some cases the maximum frame size for a device
never really reached the actual maximum frame size as the VLAN header
length was not included in the calculation for that value. As a result some
parts only supported a maximum frame size of either 1496 in the case of
parts that didn't support jumbo frames, or 8996 in the case of the parts
that do.
The second issue is the fact that there were several checks that weren't
updated so as a result setting an MTU of 1500 was treated as enabling jumbo
frames as the calculated value was 1522 instead of 1518. I have addressed
those by replacing ETH_FRAME_LEN with VLAN_ETH_FRAME_LEN where appropriate.
The final issue was the fact that lowering the MTU below 1500 would cause
the driver to allocate 2K buffers for the rings. This is an old issue that
was fixed several years ago in igb/ixgbe and that I am addressing now by just
replacing == with a <= so that we always just round up to 1522 for anything
that isn't a jumbo frame.
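
The arithmetic behind these three issues can be checked with a short
userspace sketch. The constants carry their usual Ethernet values; the
helper names and the 2048/4096 buffer tiers for jumbo frames are
assumptions of the example, not a copy of the driver code.

```c
#define ETH_HLEN           14u   /* destination + source + ethertype */
#define ETH_FCS_LEN         4u   /* frame check sequence */
#define VLAN_HLEN           4u   /* 802.1Q tag */
#define VLAN_ETH_FRAME_LEN 1518u /* 1500 + ETH_HLEN + VLAN_HLEN */

/* Largest frame on the wire for a given MTU, VLAN tag included. */
static unsigned int max_frame_for_mtu(unsigned int mtu)
{
	return mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN;
}

/* An MTU of 1500 gives 1522 and must not count as a jumbo frame. */
static int is_jumbo(unsigned int mtu)
{
	return max_frame_for_mtu(mtu) > VLAN_ETH_FRAME_LEN + ETH_FCS_LEN;
}

/* "<=" rather than "==": every non-jumbo MTU rounds up to 1522. */
static unsigned int rx_buffer_len(unsigned int mtu)
{
	unsigned int max_frame = max_frame_for_mtu(mtu);

	if (max_frame <= VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)
		return VLAN_ETH_FRAME_LEN + ETH_FCS_LEN;
	if (max_frame <= 2048)
		return 2048;
	return 4096;
}
```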
Fixes: c751a3d58c ("e1000e: Correctly include VLAN_HLEN when changing interface MTU")
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
'err' will be overwritten so no need to initialize it to zero.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
igb_enable_mas() should only be called for the 82575 and has no meaningful
return value, so change it to void. Also simplify the odd conditional
expression.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7ab87ff4c7 ("via-rhine: move work from
irq handler to softirq and beyond") forgot to explicitly control the
lifespan of the tx_dirty and tx_cur pointers.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Follow the now usual transmit descriptor update path:
1. content change
2. dma_wmb
3. ownership change
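
The three steps can be modelled in userspace with C11 atomics, where a
release fence plays the role of dma_wmb(). The descriptor layout, DESC_OWN
value and tx_desc_publish() are hypothetical and chosen only to show the
ordering.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DESC_OWN 0x80000000u /* hypothetical "owned by NIC" bit */

struct tx_desc {
	uint32_t addr;
	uint32_t length;
	_Atomic uint32_t status;
};

static void tx_desc_publish(struct tx_desc *d, uint32_t addr, uint32_t len)
{
	/* 1. content change */
	d->addr = addr;
	d->length = len;

	/* 2. barrier: userspace stand-in for dma_wmb() */
	atomic_thread_fence(memory_order_release);

	/* 3. ownership change, visible only after the content above */
	atomic_store_explicit(&d->status, DESC_OWN, memory_order_relaxed);
}
```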
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NAPI receive path depends on desc->rx_status but it does not
enforce any explicit receive barrier.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver no longer produces holes in its receive ring so rx_head_desc
only duplicates cur_rx.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rationales:
- throttle work under memory pressure
- lower receive descriptor recycling latency for the network adapter
- lower the maintenance burden of uncommon paths
The patch is twofold:
- it fails early if the receive ring can't be completely initialized
at dev->open() time
- it drops packets on the floor in the napi receive handler so as to
  keep the receive ring full
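
Both halves can be sketched with a toy ring in userspace C. The names
(struct refill_ring, rx_ring_init(), rx_receive(), the alloc_fn hook and
the alloc_fail stub) are hypothetical; malloc() stands in for the driver's
buffer allocation.

```c
#include <stddef.h>
#include <stdlib.h>

#define RX_RING_SIZE 4
#define BUF_SIZE     2048

struct refill_ring {
	void *buf[RX_RING_SIZE];
	unsigned long dropped;
};

/* Allocator hook so that memory pressure can be simulated. */
typedef void *(*alloc_fn)(size_t);

static void *alloc_fail(size_t n) { (void)n; return NULL; }

/* Fail early: either the whole ring is populated or nothing is. */
static int rx_ring_init(struct refill_ring *r, alloc_fn alloc)
{
	r->dropped = 0;
	for (size_t i = 0; i < RX_RING_SIZE; i++) {
		r->buf[i] = alloc(BUF_SIZE);
		if (!r->buf[i]) {
			while (i-- > 0)
				free(r->buf[i]);
			return -1;
		}
	}
	return 0;
}

/*
 * Returns the filled buffer to pass up the stack, or NULL if no
 * replacement could be allocated. In that case the packet is dropped
 * and the old buffer stays in the ring, so the ring never has holes.
 */
static void *rx_receive(struct refill_ring *r, size_t idx, alloc_fn alloc)
{
	void *fresh = alloc(BUF_SIZE);
	void *full;

	if (!fresh) {
		r->dropped++;
		return NULL;
	}
	full = r->buf[idx];
	r->buf[idx] = fresh;
	return full;
}
```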
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's used to initialize the receive ring but it will actually shine when
the receive poll code is reworked.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>