Add the device ID of the VF to the PCI device ID table.
Added a boolean flag is_vf in efx_nic_type to differentiate
between a VF and PF at probe time. This flag is useful in later
patches while setting MAC address specially in the
PCI-passthrough case.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow PFs to allocate shared RSS contexts if we exhaust our
exclusive RSS contexts. Make VFs use shared RSS contexts in
all cases.
Spruce up error handling so that the shadow copy of the RSS
table is updated after successful update, rather than in all
cases, so that we report the actual contents of the RSS table
after a failure to set it, rather than what we'd like it to be.
Populate context_size parameter when vacuously allocating RSS
context of size 1.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* Accept EPERM in some simple cases, the following cases are handled:
1) efx_mcdi_read_assertion()
Unprivileged PCI functions aren't allowed to GET_ASSERTS.
We return success as it's up to the primary PF to deal with asserts.
2) efx_mcdi_mon_probe() in efx_ef10_probe()
Unprivileged PCI functions aren't allowed to read sensor info, and
worrying about sensor data is the primary PF's job.
3) phy_op->reconfigure() in efx_init_port() and efx_reset_up()
Unprivileged functions aren't allowed to MC_CMD_SET_LINK, they just have
to accept the settings (including flow-control, which is what
efx_init_port() is worried about) they've been given.
4) Fallback to GET_WORKAROUNDS in efx_ef10_probe()
Unprivileged PCI functions aren't allowed to set workarounds. So if
efx_mcdi_set_workaround() fails EPERM, use efx_mcdi_get_workarounds()
to find out if workaround_35388 is enabled.
5) If DRV_ATTACH gets EPERM, try without specifying fw-variant
Unprivileged PCI functions have to use a FIRMWARE_ID of 0xffffffff
(MC_CMD_FW_DONT_CARE).
6) Don't try to exit_assertion unless one had fired
Previously we called efx_mcdi_exit_assertion even if
efx_mcdi_read_assertion had received MC_CMD_GET_ASSERTS_FLAGS_NO_FAILS.
This is unnecessary, and the resulting MC_CMD_REBOOT, even if the
AFTER_ASSERTION flag made it a no-op, would fail EPERM for unprivileged
PCI functions.
So make efx_mcdi_read_assertion return whether an assert happened, and only
call efx_mcdi_exit_assertion if it has.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To be able to use MC_CMD_VADAPTOR_SET_MAC, vadaptors must be
manually allocated and freed as automatic vadaptors will disappear
when their reference_count reaches zero, which must happen before
the MAC address is changed.
Vadaptors are allocated and freed in the vswitching_probe/remove
functions for PFs and VFs, and this means that vadaptors are restored
correctly following an MC reboot or other reset when required.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The parent PF creates vports for all its child VFs and adds MAC
addresses to these. When the VF driver loads, it can make an MCDI
call to get the MAC address that the parent PF assigned it.
The parent PF also assigns a mac address to its own vport because
implicit creation of a vAdaptor will only work on evb ports with
MAC addresses assigned.
The vport MAC address needs to be stored in the PF's nic_data
struct as it can later be changed on the vadaptor (and its net_dev
struct). When removing a vport the original MAC address must be
deleted.
A new flag is needed in the VF data structure to identify whether
a vport has been assigned to the VF. This is to determine whether
it needs to be un-assigned before freeing the vport. Also,
attempting to un-assign a vport which is not assigned will result
in an EALREADY error.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added efx_nic_type structure for VF.
Mapped a different BAR for VF as it uses BAR 0 for memory.
Added functions sriov_init and sriov_fini.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use MC_CMD_GET_FUNCTION_INFO to record the PF number in nic_data.
This will be needed when assigned vports to VFs.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds functions to allocate and free vswitches and vports; vadaptors
are automatically allocated and freed when TX/RX queues are
initialised and finalised. This vswitching structure is only created
if the firmware supports it, so a check that full-featured firmware
is running is performed first.
If the MC resets, the vswitching infrastructure will need to be
recreated, so mark the "must_probe_vswitching" flag when an MC reboot
is detected.
Don't try to create a vswitch if vf-count=0
This allocation of vswitches and vports does not currently support
configuring VLAN tags, but that can be added in a future change.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The default port ID of EVB_PORT_ID_ASSIGNED is a "magic" number
for the MCFW to select the physical port of the PF. If other
vswitches and vports are created on top of the default firmware
configuration, the ID of the newly created vport is then required
when passed to MCDI commands. Currently, this doesn't happen so
the vport_id is never changed, but a subsequent patch will change
this behaviour so that other vswitches and vports are created.
The vport_id recorded in nic_data is only relevant for PFs.
VFs will have their vports created by their parent PF, and in
that case the parent PF will record the vport ID of each VF.
For a VF, nic_data->vport_id is expected to remain at the default
value.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The (future) code to add/remove vswitches and vports will be
dependent on the firmware variant.
To simplify the checking of the firmware variant, record
values for rx_dpcpu_fw_id and tx_dpcpu_fw_id in EF10 nic_data.
There was only one place where this was previously used:
efx_mcdi_print_fwver() in ethtool.c.
The MC_CMD_GET_CAPABILITIES can be replaced and the values from
nic_data used instead.
Note that the printing of "?" if the MC command fails or if the
outlength is incorrect no longer apply, because errors are returned
in efx_ef10_init_datapath_caps() in both of these cases.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TX_DOMAIN field is currently reserved but its safer to set
it to 0 for future compatibility.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for the use of sriov_configure on EF10
to enable Virtual Functions while the driver is loaded.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The efx_vf struct contains Siena-specific fields for VFs,
so rename to siena_vf.
Also move it into the siena_nic_data struct, as EF10 will
track its VFs in its own ef10_nic_data, storing much less
information about them since VFDI is no longer used.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By putting all the efx_{siena,ef10}_sriov_* declarations in
{siena,ef10}_sriov.h, ensure they cannot be called from nic-generic code.
Also fixes up an instance of this, where mcdi.c was calling
efx_siena_sriov_flr.
The single instance of netdev_ops should call general high level
functions that can then call something adapter specific in efx_nic_type.
We should only do adapter specialisation via efx_nic_type.
Removal of sriov functionality from the Falcon code means that tests
are needed for the presence of some callbacks.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DSA can have nested MDIO busses, where the Ethernet MDIO bus is used
to access an MDIO bus within the switch which has the PHYs connected
to it. This nesting causes lockdep to give false positives. Use
mutex_lock_nested() to avoid this.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SMI bus is the bottleneck in all switch operations, not the
granularity of locks. Replace the stats mutex by the SMI mutex to make
the locking concept simpler.
The REG_READ/REG_WRITE macros cannot be used while holding the SMI
mutex, since they try to acquire it. Replace with calls to the
appropriate function which does not try to get the mutex.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SMI bus is the bottleneck in all switch operations, not the
granularity of locks. Replace the PHY mutex by the SMI mutex to make
the locking concept simpler.
The REG_READ/REG_WRITE macros cannot be used while holding the SMI
mutex, since they try to acquire it. Replace with calls to the
appropriate function which does not try to get the mutex.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mv88e6185 is part of the family that the mv88e6131 driver
supports. Add it to the probe function, and set the number of ports.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 6171 is one member of the family 6171/6175/6350/6351. Add the
other family members to the driver.
Not tested on these new devices.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mv88e6172 is part of the mv88e6352 family of devices. Move support
for it out of the mv88e6171 driver into the mv88e6352, which results
in some simplifications to the code.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use defines for registers, shifts and bits in the remaining register
accesses in the individual drivers, in order to aid readability.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that setting up a port is identical for all switches, centralisers
the code looping over all the ports to set them up.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The port setup code in the individual drivers is identical for 6123,
6171, and 6352, and very similar in 6131. Move it all into mv88e6xxx,
using the chip families to differentiate on features.
Similarly, the global setup is also very similar. Move the majority
into mv8e6xxx.
The chips themselves fall into families. Add helpers which uses the
device IDs to determine if a device is a member of a family or not.
Add some additional device IDs to the existing list, to make these
helper functions more complete. However these IDs are not yet added to
the probe functions.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
VXLAN must provide the skb mark and specifiy IPPROTO_UDP when doing
the FIB lookup for the remote ip. Otherwise an invalid route might
be returned.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Synchronize names with other drivers.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of_property_* calls
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
use devm_* calls
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Synchronize names with other drivers
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is rule for network drivers with comments blocks
which is newly checked by checkpatch.pl script.
Let's fix it.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removed checkpatch.pl errors and warnings.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds proper checks to handle the PHY-less case.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current implementation, jumbo frames are supported only
for the frame sizes > 16K. This patch corrects this logic to
handle jumbo frames for lesser frame sizes (< 16K) ensuring jumbo frame
MTU is within the limit of max frame size configured in the h/w
design.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The packet completion interrupts for TX and RX should be serviced before
the packets are consumed. This ensures against the degenerate case when a
new completion interrupt is raised after the handler has exited but before
the interrupts are cleared. In this case its possible for the ISR to clear
an unhandled interrupt (leading to potential deadlock).
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Tested-by: Jason Wu <huanyu@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The AXI-DMA rx-delay interrupt can sometimes be triggered
when there are 0 outstanding packets received. This is due
to the fact that the receive function will greedily consume
as many packets as possible on interrupt. So if two packets
(with a very particular timing) arrive in succession they
will each cause the rx-delay interrupt, but the first interrupt
will consume both packets.
This means the second interrupt is a 0 packet receive.
This is mostly OK, except that the tail pointer register is
updated unconditionally on receive. Currently the tail pointer
is always set to the current bd-ring descriptor under
the assumption that the hardware has moved onto the next
descriptor. What this means for length 0 recv is the current
descriptor that the hardware is potentially yet to use will
be marked as the tail. This causes the hardware to think
its run out of descriptors deadlocking the whole rx path.
Fixed by updating the tail pointer to the most recent
successfully consumed descriptor.
Reported-by: Wendy Liang <wendy.liang@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Tested-by: Jason Wu <huanyu@xilinx.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for the RGMII. The h/w configuration
parameter C_PHY_TYPE, which represents the interface configured in
the design, is used to differentiate various interfaces supported
by AXI Ethernet.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pktgen sends raw udp packets and bypasses most of the
linux networking stack. User can specify different packet sizes.
Hence we need to discard the packet if the length is greater than mtu
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds device node to ULD info. Use the node info to alloc_ring() for ctrl
TX queues
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Passes a Congestion Channel Map to t4_sge_alloc_rxq()
for the Ethernet RX Queues based on the MPS Buffer Group Map
of the TX Channel rather than just the TX Channel Map.
Also, in t4_sge_alloc_rxq() for T5, setting up the
Congestion Manager values of the new RX Ethernet Queue is
done by firmware now.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also changed the name of t4_hw.c:get_mps_bg_map() to t4_get_mps_bg_map()
and make it an exported routine with a definition in cxgb4.h.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to make sure that the Free List Size, in pointers, is at
least 2 Egress Queue Units (8 pointers/each) larger than the SGE's Egress
Congestion Threshold (in pointers).
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Earlier tricks of setting broadcast bit only when IPv4 address is added
onto interface are not good enough especially when autoconf comes in play.
Setting them on always is performance drag but now that multicast /
broadcast is not processed in fast-path; enabling broadcast will let
autoconf work correctly without affecting performance characteristics of
the device.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Processing multicast / broadcast in fast path is performance draining
and having more links means more cloning and bringing performance
down further.
Broadcast; in particular, need to be given to all the virtual links.
Earlier tricks of enabling broadcast bit for IPv4 only interfaces are not
really working since it fails autoconf. Which means enabling broadcast
for all the links if protocol specific hacks do not have to be added into
the driver.
This patch defers all (incoming as well as outgoing) multicast traffic to
a work-queue leaving only the unicast traffic in the fast-path. Now if we
need to apply any additional tricks to further reduce the impact of this
(multicast / broadcast) type of traffic, it can be implemented while
processing this work without affecting the fast-path.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because error codes are negative, it only makes sense to
consistently use signed types when handling them. Also remove
some explicit comparisons with 0 on these variables.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The global semaphore bits should be released in the reverse of the
order that they were taken, so correct that.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
IOSF is the Intel On-chip System Fabric used in SOCs. IOSF SB is
the IOSF SideBand message interface. This patch serializes IOSF SB
access using both phy bits in the SWFW_SEMAPHORE register. It also
adds a helper function to wait for IOSF SB accesses to complete.
Use the new function to perform this wait before each access, as
specified in the datasheet, in addition to using it to wait for
IOSF SB read/write completion.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We were using s64 for lat_ns (latency nano-second value) since in
our calculations a negative value could be a resultant. For negative
values, we then assign lat_ns to be zero, so the value passed to
do_div() was never negative, but do_div() expects the argument type
to be u64, so do a cast to resolve a compile warning seen on
PowerPC.
CC: Yanjiang Jin <yanjiang.jin@windriver.com>
CC: Yanir Lubetkin <yanirx.lubetkin@intel.com>
Reported-by: Yanjiang Jin <yanjiang.jin@windriver.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
The driver wasn't allowing jumbo frames to be
enabled when CRC stripping was disabled, however it was allowing CRC
stripping to be disabled while jumbo frames were enabled. This fixes that by
making it so that the NETIF_F_RXFCS flag cannot be set when jumbo frames are
enabled on 82579 and newer parts.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the VLAN_HLEN was added to the calculation for the maximum frame size
there seems to have been a number of issues added to the driver.
The first issue is that in some cases the maximum frame size for a device
never really reached the actual maximum frame size as the VLAN header
length was not included the calculation for that value. As a result some
parts only supported a maximum frame size of either 1496 in the case of
parts that didn't support jumbo frames, and 8996 in the case of the parts
that do.
The second issue is the fact that there were several checks that weren't
updated so as a result setting an MTU of 1500 was treated as enabling jumbo
frames as the calculated value was 1522 instead of 1518. I have addressed
those by replacing ETH_FRAME_LEN with VLAN_ETH_FRAME_LEN where appropriate.
The final issue was the fact that lowering the MTU below 1500 would cause
the driver to allocate 2K buffers for the rings. This is an old issue that
was fixed several years ago in igb/ixgbe and I am addressing now by just
replacing == with a <= so that we always just round up to 1522 for anything
that isn't a jumbo frame.
Fixes: c751a3d58c ("e1000e: Correctly include VLAN_HLEN when changing interface MTU")
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>