Issue found by checkpatch.
Signed-off-by: Rohit Visavalia <rohit.visavalia@softnautics.com>
Acked-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Resolved Warning: networking block comments don't use an empty /* line,
use /* Comment...
Issue found by checkpatch.
Signed-off-by: Rohit Visavalia <rohit.visavalia@softnautics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
t4_wr_mbox_meat_timeout() can be called from both softirq
context and process context, hence protect the mbox with
spin_lock_bh() instead of simple spin_lock()
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support increased precision frequency adjustment format (FP44) used
by Medford2 adapters.
Signed-off-by: Laurence Evans <levans@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The time_format that we stash in the PTP data structure is never
referenced, so we can remove it. Instead, store the information needed
to interpret sync event timestamps.
Also rolls in a couple of other related minor PTP fixes.
Based on patches by Bert Kenward <bkenward@solarflare.com> and Laurence
Evans <levans@solarflare.com>.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support MC_CMD_PTP_OUT_GET_TIMESTAMP_CORRECTIONS_V2. Extract general
timestamp corrections in addition to PTP corrections. Apply receive
timestamp corrections for general datapath receive timestamping, and
correspondingly for transmit.
Signed-off-by: Laurence Evans <levans@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use timestamp conversion function with correction to avoid duplicate
correction handling.
Signed-off-by: Laurence Evans <levans@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We check the license for TX hardware timestamping capability.
The PTP probe will have enabled PTP sync events from the adapter. If
later, at TX queue init, it turns out we do not have the license, we
don't need the sync events either.
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For this we create and use one or more new TX queues on the PTP channel,
and enable sync events for it.
Based on a patch by Martin Habets <mhabets@solarflare.com>.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TX timestamps on 8000 series are supplied from the MAC. This timestamp is
only 48 bits long. The high order bits from the last time sync event are
used for the top 16 bits.
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we try to enable the feature and do not have the license for it, the
MCPU will refuse and fail our TX queue init.
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can now transmit SKBs in 2 ways:
1. Via the MC (for the 7XXX series and earlier), using
efx_ptp_xmit_skb_mc().
2. Via the TX queues on the dedicated PTP channel (8XXX series and later),
using efx_ptp_xmit_skb_queue().
The PTP worker thread uses the method set up at probe time. It never
checked the return code from the old efx_ptp_xmit_skb(), so it now
returns void.
We increment the TX dropped counter of the device if the transmit fails.
As a result of the probe per channel the remove gets called multiple times.
Clean up efx->ptp_data properly to avoid the 2nd call blowing up.
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use MC capability MC_CMD_GET_CAPABILITIES_V2_OUT_TX_MAC_TIMESTAMPING to
detect whether the NIC supports timestamping packets sent out the main
datapath.
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before this work, TX timestamping is done by sending each SKB to the MC.
On the 8000 series (Medford1) we have high speed timestamping via the
MAC, which means we can use normal TX queues for this without a
significant drop in bandwidth. On the X2000 series (Medford2) support
for transmitting via the MC is removed, so the new way must be used.
This patch enables timestamping on a TX queue, if requested.
It also enhances TX event handling to process the extra completion events,
and puts the time in the SKB.
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NAPI budget is only for RX processing work, not other work such as
TX or MCDI completion handling.
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
STA control register has areas of mode and opcodes for opeations. 18 bit is
using for mode selection, where 0 is old MIO/MDIO access method and 1 is
indirect access mode. 19-20 bits are using for setting up read/write
operation(STA opcodes). In current state 'read' is set into old MIO/MDIO mode
with 19 bit and write operation is set into 18 bit which is mode selection,
not a write operation. To correlate write with read we set it into 20 bit.
All those bit operations are MSB 0 based.
Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
emac4syn chips has availability to use 8192 rx/tx fifo buffer sizes,
in current state if we set it up in dts 8192 as example, we will get
only 2048 which may impact on network speed.
Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2018-01-24
This series contains updates to fm10k only.
Alex fixes MACVLAN offload for fm10k, where we were not seeing unicast
packets being received because we did not correctly configure the
default VLAN ID for the port and defaulting to 0.
Jake cleans up unnecessary parenthesis in a couple of "if" statements.
Fixed the driver to stop adding VLAN 0 into the VLAN table, since it
would cause the VLAN table to be inconsistent between the PF and VF.
Also fixed an issue where we were assuming that VLAN 1 is enabled when
the default VLAN ID is not set, so resolve by not requesting any filters
for the default_vid if it has not yet been assigned.
Ngai fixes an issue which was generating a dmesg regarding unbale to
kill a particular VLAN ID for the device. This is due to
ndo_vlan_rx_kill_vid() exits with an error and the handler for this ndo
is fm10k_update_vid() which exits prematurely under PF VLAN management.
So to resolve, we must check the VLAN update action type before exiting
fm10k_update_vid(), and act appropriately based on the action type.
Also corrected code comment typos.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Clarify the comment for when entering promiscuous mode that we update
the VLAN table. Add a comment distinguishing the case where we're
exiting promiscuous mode and need to clear the entire VLAN table.
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@gmail.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since commit 856dfd69e84f ("fm10k: Fix multicast mode synch issues",
2016-03-03) we've incorrectly assumed that VLAN 1 is enabled when the
default VID is not set.
This occurs because we check the default_vid and if it's zero, start
several loops over the active_vlans bitmask at 1, instead of checking to
ensure that that bit is active.
This happened because of commit d9ff3ee8efe9 ("fm10k: Add support for
VLAN 0 w/o default VLAN", 2014-08-07) which mistakenly assumed that we
should send requests for MAC and VLAN filters with VLAN 0 when the
default_vid isn't set.
However, the switch generally considers this an invalid configuration,
so the only time we'd have a default_vid of 0 is when the switch is
down.
Instead, lets just not request any filters for the default_vid if it's
not yet been assigned.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently, when the driver loads, it sends a request to add VLAN 0 to the
VLAN table. For the PF, this is honored, and VLAN 0 is indeed set. For
the VF, this request is silently converted into a request for the
default VLAN as defined by either the switch vid or the PF vid.
This results in the odd behavior that the VLAN table doesn't appear
consistent between the PF and the VF.
Furthermore, setting a MAC filter with VLAN 0 is generally considered an
invalid configuration by the switch, and since commit 856dfd69e84f
("fm10k: Fix multicast mode synch issues", 2016-03-03) we've had code
which prevents us from ever sending such a request.
Since there's not really a good reason to keep VLAN 0 in the VLAN table,
stop requesting it in fm10k_restore_rx_state().
This might seem to indicate that we would no longer properly configure
the MAC and VLAN tables for the default vid. However, due to the way
that fm10k_find_next_vlan() behaves, it will always return the
default_vid as enabled.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When a VF is under PF VLAN assignment:
ip link set <pf> vf <#> vlan <vid>
This will remove all previous entries in the VLAN table including those
generated by VLAN interfaces created on the VF. The issue arises when
the VF is under PF VLAN assignment and one or more of these VLAN
interfaces of the VF are deleted. When deleting these VLAN interfaces,
the following message will be generated in "dmesg":
failed to kill vid 0081/<vid> for device <vf>
This is due to the fact that "ndo_vlan_rx_kill_vid" exits with an error.
The handler for this ndo is "fm10k_update_vid". Any calls to this
function while under PF VLAN management will exit prematurely and, thus,
it will generate the failure message.
Additionally, since "fm10k_update_vid" exits prematurely, none of the
VLAN update is performed. So, even though the actual VLAN interfaces of
the VF will be deleted, the active_vlans bitmask is not cleared. When
the VF is no longer under PF VLAN assignment, the driver mistakenly
restores the previous entries of the VLAN table based on an
unsynchronized list of active VLANs.
The solution to this issue involves checking the VLAN update action type
before exiting "fm10k_update_vid". If the VLAN update action type is to
"add", this action will not be permitted while the VF is under PF VLAN
assignment and the VLAN update is abandoned like before.
However, if the VLAN update action type is to "kill", then we need to
also clear the active_vlans bitmask. However, we don't need to actually
queue any messages to the PF, because the MAC and VLAN tables have
already been cleared, and the PF would silently ignore these requests
anyways.
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This fixes a few warnings found by checkpatch.pl --strict
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fixes the following sparse warning:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c:46:27: warning:
symbol 'pedits' was not declared. Should it be static?
Fixes: 27ece1f357 ("cxgb4: add tc flower support for ETH-DMAC rewrite")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since TC block changes drivers are required to check if
the TC hw offload flag is set on the interface themselves.
Fixes: 2f4b411a3d ("i40e: Enable cloud filters via tc-flower")
Fixes: 44ae12a768 ("net: sched: move the can_offload check from binding phase to rule insertion phase")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Amritha Nambiar <amritha.nambiar@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A tid was allocated for reserved MR during initialization but
not freed. This lead to an annoying output message during
rdma unload flow.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Double reservation for kernel dedicated dpi was performed.
Once in the core module and once in qedr.
Remove the reservation from core.
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The fm10k driver didn't work correctly when macvlan offload was enabled.
Specifically what would occur is that we would see no unicast packets being
received. This was traced down to us not correctly configuring the default
VLAN ID for the port and defaulting to 0.
To correct this we either use the default ID provided by the switch or
simply use 1. With that we are able to pass and receive traffic without any
issues.
In addition we were not repopulating the filter table following a reset. To
correct that I have added a bit of code to fm10k_restore_rx_state that will
repopulate the Rx filter configuration for the macvlan interfaces.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bit TPD3[31] is used as a timestamp bit if PTP is enabled, but
it's used as an address bit if PTP is disabled. Since PTP isn't
supported by the driver, we can extend the DMA address to 46 bits.
Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
Acked-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doing a bitwise AND between a bool and an int is generally not a good idea.
The bool will be promoted to an int with value 0 or 1,
the int is generally regarded as true with a non-zero value,
thus ANDing them has the potential to yield an undesired result.
This commit fixes the following smatch warnings:
drivers/net/ethernet/stmicro/stmmac/enh_desc.c:344 enh_desc_prepare_tx_desc() warn: maybe use && instead of &
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c:337 dwmac4_rd_prepare_tx_desc() warn: maybe use && instead of &
drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c:380 dwmac4_rd_prepare_tso_tx_desc() warn: maybe use && instead of &
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Problem description:
After ethernet cable connect and disconnect for several iterations on a
device with i210, tx timestamp will stop being put into the socket.
Steps to reproduce:
1. Setup a device with i210 and wire it to a 802.1AS capable switch (
Extreme Networks Summit x440 is used in our case)
2. Have the gptp daemon running on the device and make sure it is synced
with the switch
3. Have the switch disable and enable the port, wait for the device gets
resynced with the switch
4. Iterates step 3 until the device is not albe to get resynced
5. Review the log in dmesg and you will see warning message "igb : clearing
Tx timestamp hang"
Root cause:
If ptp_tx_work() gets scheduled just before the port gets disabled, a LINK
DOWN event will be processed before ptp_tx_work(), which may cause timeout
in ptp_tx_work(). In the timeout logic, the TSYNCTXCTL's TXTT bit (Transmit
timestamp valid bit) is not cleared, causing no new timestamp loaded to
TXSTMP register. Consequently therefore, no new interrupt is triggerred by
TSICR.TXTS bit and no more Tx timestamp send to the socket.
Signed-off-by: Daniel Hua <daniel.hua@ni.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Omit an extra message for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
I personally spent a long time trying to decypher why my CPU would not
reach deeper C-states. Let's just tell the next user what's going on.
Signed-off-by: Matt Turner <matt.turner@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
By design, the idleslope increments are restricted to 16.384kbps steps.
Add a comment to igb_main.c making that explicit and add one example
that illustrates the impact of that.
Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
According to section 12.0.3.4.13 "Receive Descriptor Control - RXDCTL"
of the Intel® 82579 Gigabit Ethernet PHY Datasheet v2.1:
"HTHRESH should be given a non zero value when ever PTHRESH is
used."
In RXDCTL(0), PTHRESH lives at bits 5:0, and HTHREST lives at bits 13:8.
Set only bit 8 of HTHREST as is done in e1000_flush_rx_ring(). Found by
inspection.
Signed-off-by: Matt Turner <matt.turner@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds a new function igb_get_max_rss_queues() to get maximum
RSS queues, this will reduce duplicate code and facilitate future
maintenance.
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Before libvirt modifies the MAC address and vlan tag for an SRIOV VF
for use by a virtual machine (either using vfio device assignment or
macvtap passthru mode), it saves the current MAC address and vlan tag
so that it can reset them to their original value when the guest is
done. Libvirt can't leave the VF MAC set to the value used by the
now-defunct guest since it may be started again later using a
different VF, but it certainly shouldn't just pick any random value,
either. So it saves the state of everything prior to using the VF, and
resets it to that.
The igb driver initializes the MAC addresses of all VFs to
00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK
netlink message, also visible in the list of VFs in the output of "ip
link show"). But when libvirt attempts to restore the MAC address back
to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel
responds with "Invalid argument".
Forbidding a reset back to the original value leaves the VF MAC at the
value set for the now-defunct virtual machine. Especially on a system
with NetworkManager enabled, this has very bad consequences, since
NetworkManager forces all interfacess to be IFF_UP all the time - if
the same virtual machine is restarted using a different VF (or even on
a different host), there will be multiple interfaces watching for
traffic with the same MAC address.
To allow libvirt to revert to the original state, we need a way to
remove the administrative set MAC on a VF, to allow normal host
operation again, and to reset/overwrite the VF MAC via VF netdev.
This patch implements the outlined scenario by allowing to set the
VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF.
igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0,
so it's possible to reset the VF MAC back to the original value via
the VF netdev.
Note: Recent patches to libvirt allow for a workaround if the NIC
isn't capable of resetting the administrative MAC back to all 0, but
in theory the NIC should allow resetting the MAC in the first place.
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Tested-by: Aaron Brown <arron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Driver periodically samples all neighbors configured in device
in order to update the kernel regarding their state. When finding
an entry configured in HW that doesn't show in neigh_lookup()
driver logs an error message.
This introduces a race when removing multiple neighbors -
it's possible that a given entry would still be configured in HW
as its removal is still being processed but is already removed
from the kernel's neighbor tables.
Simply remove the error message and gracefully accept such events.
Fixes: c723c735fa ("mlxsw: spectrum_router: Periodically update the kernel's neigh table")
Fixes: 60f040ca11 ("mlxsw: spectrum_router: Periodically dump active IPv6 neighbours")
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
memset variables to 0 to fix sparse warnings:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c:409:42: sparse: Using
plain integer as NULL pointer
drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:43:47: sparse: Using
plain integer as NULL pointer
Fixes: ad75b7d32f ("cxgb4: implement ethtool dump data operations")
Fixes: 91c1953de3 ("cxgb4: use zlib deflate to compress firmware dump")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes:
drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:39:5: error:
redefinition of 'cudbg_compress_buff'
int cudbg_compress_buff(struct cudbg_init *pdbg_init,
^~~~~~~~~~~~~~~~~~~
In file included from
drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.c:23:0:
drivers/net/ethernet/chelsio/cxgb4/cudbg_zlib.h:45:19: note: previous
definition of 'cudbg_compress_buff' was here
static inline int cudbg_compress_buff(struct cudbg_init *pdbg_init,
^~~~~~~~~~~~~~~~~~~
Fixes: 91c1953de3 ("cxgb4: use zlib deflate to compress firmware dump")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
implement ndo_set_vf_vlan for mgmt netdevice to configure
the PCIe VF.
Original work by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2018-01-23
This series contains updates to i40e and i40evf only.
Pawel enables FlatNVM support on x722 devices by allowing nvmupdate tool
to configure the preservation flags in the AdminQ command.
Mitch fixes a potential divide by zero error when DCB is enabled and
the firmware fails to configure the VSI, so check for this state.
Fixed a bug where the driver could fail to adhere to ETS bandwidth
allocations if 8 traffic classes were configured on the switch.
Sudheer fixes a potential deadlock by avoiding to call
flush_schedule_work() in i40evf_remove(), since cancel_work_sync()
and cancel_delayed_work_sync() already cleans up necessary work items.
Fixed an issue with the problematic detection and recovery from
hung queues in the PF which was causing lost interrupts. This is done
by triggering a software interrupt so that interrupts are forced on
and if we are already in napi_poll and an interrupt fires, napi_poll
will not be rescheduled and the interrupt is lost.
Avinash fixes an issue in the VF where is was possible to issue a
reset_task while the device is currently being removed.
Michal fixes an issue occurring while calling i40e_led_set() with
the blink parameter set to true, which was causing the activity LED
instead of the link LED to blink for port identification.
Shiraz changes the client interface to not call client close/open on
netdev down/up events, since this causes a lot of thrash that is
not needed. Instead, disable the PE TCP-ENA flag during a netdev
down event and re-enable on a netdev up event, since this blocks all
TCP traffic to the RDMA protocol engine.
Alan fixes an issue which was causing a potential transmit hang by
ignoring the PF link up message if the VF state is not yet in the
RUNNING state.
Amritha fixes the channel VSI recreation during the reset flow to
reconfigure the transmit rings and the queue context associated with
the channel VSI.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2018-01-23
This series contains updates to ixgbe only.
Shannon Nelson provides an implementation of the ipsec hardware offload
feature for the ixgbe driver for these devices: x540, x550, 82599.
The ixgbe NICs support ipsec offload for 1024 Rx and 1024 Tx Security
Associations (SAs), using up to 128 inbound IP addresses, and using the
rfc4106(gcm(aes)) encryption. This code does not yet support checksum
offload, or TSO in conjunction with the ipsec offload - those will be
added in the future.
This code shows improvements in both packet throughput and CPU utilization.
For example, here are some quicky numbers that show the magnitude of the
performance gain on a single run of "iperf -c <dest>" with the ipsec
offload on both ends of a point-to-point connection:
9.4 Gbps - normal case
7.6 Gbps - ipsec with offload
343 Mbps - ipsec no offload
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix recreating the channel VSIs during the reset flow to reconfigure
the Tx rings and the queue context associated with the channel VSI.
Also update the next_base_queue for the VSI while rebuilding the
channel VSIs after a reset.
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If we receive the link status message from PF with link up before queues
are actually enabled, it will trigger a TX hang. This fixes the issue
by ignoring a link up message if the VF state is not yet in RUNNING
state.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Omit an extra message for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Client close is overloaded to handle both un-registration and
netdev down event. On netdev down, i40iw client close is called
which unregisters the RDMA dev and this is too destructive
since the netdev is still registered.
Do not call client close/open on netdev down/up events. Instead
disable the PE TCP_ENA flag during a netdev down event. This
blocks all TCP traffic to the RDMA Protocol Engine. On netdev up,
re-enable the flag.
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Now that i40e_vsi_config_tc() has the pf and hw variable defined, use
them, instead of dereferencing vsi->back. Much easier to read.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver (and the entire netdev layer for that matter) assumes
that TC0 will always be present in our DCB configuration.
Unfortunately, this isn't always the case. Rather than fail to
configure the VSI, let's go ahead and try to make it work, even
though DCB will end up being disabled by the kernel.
If the driver fails to configure DCB, the driver queries what's
valid, then writes that back to the hardware, always forcing TC0.
This fixes a bug where the driver could fail to adhere to ETS BW
allocations if 8 TCs were configured on the switch.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In VFs, there is a known issue which can cause writebacks
to not occur when interrupts are disabled and there are
less than 4 descriptors resulting in TX timeout. Timeout
can also occur due to lost interrupt.
The current implementation for detecting and recovering
from hung queues in the PF is problematic because it actually
actively encourages lost interrupts. By triggering a SW
interrupt, interrupts are forced on. If we are already in
napi_poll and an interrupt fires, napi_poll will not be
rescheduled and the interrupt is effectively lost; thereby
potentially *causing* hung queues.
This patch checks whether packets are being processed between
every watchdog cycle and determine potential hung queue and
fires triggers SW interrupt only for that particular queue.
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This fix solves an issue occurring while calling i40e_led_set function
from the driver with "blink" parameter set as TRUE. This call resulted
in Activity LED blinking instead of Link LED, which may lead to errors
in physically identifying the port, since Activity LED may be blinking
for different reasons as well.
Signed-off-by: Michal Kuchta <michal.kuchta@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When a host disables and enables a PF device, all the associated
VFs are removed and added back in. It also generates a PFR which in turn
resets all the connected VFs. This behaviour is different from that of
Linux guest on Linux host. Hence we end up in a situation where there's
a PFR and device removal at the same time. And watchdog doesn't have a
clue about this and schedules a reset_task. This patch adds code to send
signal to reset_task that the device is currently being removed.
Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
flush_schedule_work blocks until completion of all scheduled
work items in global work-queue. This can cause deadlock in some
cases. i40evf_remove() cleans up necessary work items with
cancel_delayed_work_sync and cancel_work_sync. This fix removes
flush_schedule_work call inside i40evf_remove().
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In some weird circumstances with DCB enabled, the firmware can fail to
configure the VSI, leaving us with zero traffic classes. Check for this
state when we configure RSS to avoid a panic.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds new I40E_NVMUPD_GET_AQ_EVENT state to allow
retrieval of AdminQ events as a result of AdminQ commands sent
to firmware.
Add preservation flags support on X722 devices for NVM update
AdminQ function wrapper. Add new parameter and handling to
nvmupdate admin queue function intended to allow nvmupdate tool
to configure the preservation flags in the AdminQ command.
This is required to implement FlatNVM on X722 devices.
Signed-off-by: Pawel Jablonski <pawel.jablonski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
en_rx_am.c was deleted in 'net-next' but had a bug fixed in it in
'net'.
The esp{4,6}_offload.c conflicts were overlapping changes.
The 'out' label is removed so we just return ERR_PTR(-EINVAL)
directly.
Signed-off-by: David S. Miller <davem@davemloft.net>
With all the support code in place we can now link in the ipsec
offload operations and set the ESP feature flag for the XFRM
subsystem to see.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add a simple statistic to count the ipsec offloads.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If the skb has a security association referenced in the skb, then
set up the Tx descriptor with the ipsec offload bits. While we're
here, we fix an oddly named field in the context descriptor struct.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If the chip sees and decrypts an ipsec offload, set up the skb
sp pointer with the ralated SA info. Since the chip is rude
enough to keep to itself the table index it used for the
decryption, we have to do our own table lookup, using the
hash for speed.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On a chip reset most of the table contents are lost, so must be
restored. This scans the driver's ipsec tables and restores both
the filled and empty table slots to their pre-reset values.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add the functions for setting up and removing offloaded SAs (Security
Associations) with the x540 hardware. We set up the callback structure
but we don't yet set the hardware feature bit to be sure the XFRM service
won't actually try to use us for an offload yet.
The software tables are made up to mimic the hardware tables to make it
easier to track what's in the hardware, and the SA table index is used
for the XFRM offload handle. However, there is a hashing field in the
Rx SA tracking that will be used to facilitate faster table searches in
the Rx fast path.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Set up the data structures to be used by the ipsec offload.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add in the code for running and stopping the hardware ipsec
encryption/decryption engine. It is good to keep the engine
off when not in use in order to save on the power draw.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add a few routines to make access to the ipsec registers just a little
easier, and throw in the beginnings of an initialization.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clean up the ipsec/macsec descriptor bit definitions to match the rest
of the defines and file organization. Also recognise the bit-definition
overlap in the error mask macro.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Pointers txr and rxr are being initialized and a few statements later
are being assigned new values without the original values ever being
read. The initialized values are therefore redundant and can be
removed.
Cleans up clang warnings:
drivers/net/ethernet/broadcom/bnx2.c:5821:28: warning: Value stored to
'txr' during its initialization is never read
drivers/net/ethernet/broadcom/bnx2.c:5822:28: warning: Value stored to
'rxr' during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since both first_rx_ctx and rx_skb are the head of rx ctx, it not
necessary to use two structure members to statically indicate
the head of rx ctx. So first_rx_ctx is removed.
CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Joe Jin <joe.jin@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warning:
drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c:50:34: warning:
symbol 'hw_atl_boards' was not declared. Should it be static?
Fixes: 4948293ff9 ("net: aquantia: Introduce new AQC devices and capabilities")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix to return error code -ENOMEM from the aq_ndev_alloc() error
handling case instead of 0, as done elsewhere in this function.
Fixes: 23ee07ad3c ("net: aquantia: Cleanup pci functions module")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix to return error code -EINVAL instead of 0 when num_vfs above
limit_vfs, as done elsewhere in this function.
Fixes: 0dc7862191 ("nfp: handle SR-IOV already enabled when driver is probing")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix bug that causes _absolute_ rtsym sizes of > 8 bytes (as per symbol
table) to result in incorrect space used during a TLV-based debug dump.
Detail: The size calculation stage calculates the correct size (size of
the rtsym address field == 8), while the dump uses the size in the table
to calculate the TLV size to reserve. Symbols with size <= 8 are handled
OK due to aligning sizes to 8, but including any absolute symbol with
listed size > 8 leads to an ENOSPC error during the dump.
Fixes: da762863ed ("nfp: fix absolute rtsym handling in debug dump")
Signed-off-by: Carl Heymann <carl.heymann@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the recently added extack support for eBPF offload in the driver.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pass a pointer to an extack object to nfp_app_xdp_offload() in order to
prepare for extack usage in the nfp driver. Next step will be to forward
this extack pointer to nfp_net_bpf_offload(), once this function is able
to use it for printing error messages.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit 6221906694 ("be2net: Request RSS capability of Rx interface
depending on number of Rx rings") modified be_update_queues() so the
IFACE (HW representation of the netdevice) is destroyed and then
re-created. This causes a regression because potential promiscuous mode
is not restored properly during be_open() because the driver thinks
that the HW has promiscuous mode already enabled.
Note that Lancer is not affected by this bug because RX-filter flags are
disabled during be_close() for this chipset.
Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Fixes: 6221906694 ("be2net: Request RSS capability of Rx interface depending on number of Rx rings")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Spaces were mistakenly used instead of tabs in some of the code related
to reset functionality, which caused checkpatch.pl errors. These were
missed earlier so fixing them now.
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
T3 devices have different ports on same PCI function,
so using dev_port to identify ports.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Broadcom tags inserted by Broadcom switches put a 4 byte header after
the MAC SA and before the EtherType, which may look like some sort of 0
length LLC/SNAP packet (tcpdump and wireshark do think that way). With
ACS enabled in stmmac the packets were truncated to 8 bytes on
reception, whereas clearing this bit allowed normal reception to occur.
In order to make that possible, we need to pass a net_device argument to
the different core_init() functions and we are dependent on the Broadcom
tagger padding packets correctly (which it now does). To be as little
invasive as possible, this is only done for gmac1000 when the network
device is DSA-enabled (netdev_uses_dsa() returns true).
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check the net status per second, include port speed, total rx/tx packets
and link status. Updating the led status for fiber port.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add led location support for fiber port. The led will keep blinking
when locating.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The manager table is empty by default. If it is not initialized, the
management pkgs like LLDP will be dropped by hardware. Default entries
need to be added to manager table.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds get_regs support for ethtool cmd.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In reset events in which our memory allocations need to be reallocated,
VPD data is being freed, but never reallocated. This can cause issues if
we later attempt to access that memory or reset and attempt to free the
memory. This patch moves the allocation of the VPD data to init_resources
so that it will be symmetrically freed during release resources.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we request an unsupported mtu value, the vnic server will suggest a
different value. Currently we take the suggested value without question
and login with that value. However, the behavior doesn't seem completely
sane as attempting to change the mtu to some specific value will change
the mtu to some completely different value most of the time. This patch
fixes the issue by logging in with the previously used mtu value and
printing an error message saying that the given mtu is unsupported.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using newer backing devices can cause the required padding at the end of
buffer as well as the number of queues to change after a failover.
Since we currently assume that these values never change, after a
failover to a backing device with different capabilities, we can get
errors from the vnic server, attempt to free long term buffers that are
no longer there, or not free long term buffers that should be freed.
This patch resolves the issue by checking whether any of these values
change, and if so perform the necessary re-allocations.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When !CONFIG_REGMAP hns throws compiler warnings since
dsaf_read_syscon ignores the return result from regmap_read,
which allows val to be uninitialized.
Fixes: 86897c960b ("net: hns: add syscon operation for dsaf")
Reported-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The i.MX8 is a ARMv8 based SoC, that uses the same FEC IP as the
earlier, ARMv7 based, i.MX SoCs. Allow the driver to work on ARM64.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't change endianness when assigning vlan value in cxgb4_tc_flower
code when processing flow match parameters. The value gets converted
to network order as part of filtering code in set_filter_wr.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For ethtype_key = ETH_P_IPV6, set filter type as 1 in cxgb4_tc_flower
code when processing flow match parameters.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces an alternative way of obtaining resources - via
ACPI tables provided by firmware. Enabling coexistence with the DT
support, in addition to the OF_*->device_*/fwnode_* API replacement,
required following steps to be taken:
* Add mvpp2_acpi_match table
* Omit clock configuration and obtain tclk from the property - in ACPI
world, the firmware is responsible for clock maintenance.
* Disable comphy and syscon handling as they are not available for ACPI.
* Modify way of obtaining interrupts - use newly introduced
fwnode_irq_get() routine
* Until proper MDIO bus and PHY handling with ACPI is established in the
kernel, use only link interrupts feature in the driver. For the RGMII
port it results in depending on GMAC settings done during firmware
stage.
* When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by
default, as there is no need to keep any kind of the backward
compatibility.
Moreover, a memory region used by mvmdio driver is usually placed in
the middle of the address space of the PP2 network controller.
The MDIO base address is obtained without requesting memory region
(by devm_ioremap() call) in mvmdio.c, later overlapping resources are
requested by the network driver, which is responsible for avoiding
a concurrent access.
In case the MDIO memory region is declared in the ACPI, it can
already appear as 'in-use' in the OS. Because it is overlapped by second
region of the network controller, make sure it is released, before
requesting it again. The care is taken by mvpp2 driver to avoid
concurrent access to this memory region.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
OF functions can be used only for the driver using DT.
As a preparation for introducing ACPI support in mvpp2
driver, use struct fwnode_handle in order to obtain
properties from the hardware description.
This patch replaces of_* function with device_*/fwnode_*
where possible in the mvpp2.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
'port_count' field of the mvpp2 structure holds an overall amount
of available ports, based on DT nodes status. In order to be prepared
to support other HW description, obtain the value by incrementing it
upon each successful port initialization. This allowed for simplifying
port indexing in the controller's private array, whose size is now not
dynamically allocated, but fixed to MVPP2_MAX_PORTS.
This patch simplifies creating and filling list of enabled ports and
is a part of the preparation for adding ACPI support in the mvpp2 driver.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add geneve segmentation offload support of T6 cards.
Original work by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit fc922bb0dd ("mlxsw: spectrum_router: Use one LPM tree for
all virtual routers") I tried to make sure only used prefix lengths are
present in the LPM tree shared between all virtual routers.
However, this optimization had to be removed in commit a69518cf0b
("mlxsw: spectrum_router: Avoid expensive lookup during route removal"),
since determining the used prefix lengths required us to traverse all
the active virtual routers, which could result in a hung task depending
on the number of VRFs and whether routes were removed due to abort or
not.
Re-introduce the optimization by moving the prefix usage accounting from
the virtual routers to the LPM tree, as this accounting is only used in
order to determine the tree's structure.
To make the sharing of the trees more explicit, the two trees (for IPv4
and IPv6) are stored in the shared router struct and upon the creation
of a virtual router it is immediately bound to both.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Next patch will try to optimize the LPM tree and make sure only used
prefix lengths are present, to avoid unnecessary look-ups.
Pass the currently removed FIB node to the unlinking function as its
associated prefix length is a potential candidate for removal from the
tree.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, each FIB (IPv4 / IPv6) in a virtual router holds a prefix
usage that is used to choose a matching LPM tree, but also to check if
the FIB is empty, so that the LPM tree could be unbound.
Next patches will remove the reliance on the per-FIB prefix usage for
LPM tree matching. Keeping it only to check if the FIB is empty is a
waste, since we can use the nodes ({Prefix, Length}) list instead.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for mirror action. Only one mirror action can be set per rule.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce extension of mlxsw_afa_ops in order to add/del mirroring and
implement the ops for Spectrum.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend SPAN API for ACL case. In case of ACL triggering the MPAR register
shouldn't be configured. This patch also export those helpers for
ACL usage.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch extends the trap action for mirroring.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So far, the caller of mlxsw_afa_block_append_counter needed to allocate
counter index by hand. Benefit from the previously introduced resource
infra and counter_index_get/put callbacks, and allocate the counter
index in place where it is needed, inside the action append function.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the resource list needs to be used also for other entries different
to fwd_entry_ref, make the list generic. For that purpose, introduce a
resource structure with couple of helpers that the code which need to
store a per-block resource should use.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce extension of mlxsw_afa_ops in order to get/put counter indexes
and implement the ops for Spectrum.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For devices with known capabilities of Fibre media type we now report that
to ethtool.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The detailed reset sequence ensures all HW components are in aligned
state before NIC startup. It also supports cards with signed firmware (RBL)
and checks if their FW is valid.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This defines fw2x operations table and corresponding methods.
Some of the functions are being shared with 1.x firmware
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New AQC cards will have an updated firmware with new binary interface.
This patch extracts firmware specific operations into a separate table
and prepares for the introduction of new fw 2.x and 3.x
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The address to check if HW is not dead/hang could be stored in
capabilities, since it is a constant. Change its name to better reflect
the idea.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These ops are not related to HW and are now implemented in pci module.
Thus, remove these ops pointers and implementation.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Driver contained a dead code of maintaining multiple pci port instances.
That will never be used since for each pci function a separate NIC
instance is created.
Simplify this, making pci module only responsible for pci resource
management.
NIC initialization is also simplified accordingly.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes unnecessary structure copying, and prepares the driver for
separate firmware ops table introduction.
We also remove extra copy of capabilities structure (which is const actually)
and also replace it with a const pointer in aq_nic_cfg.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A number of new AQC devices is going to be released. To support more
flexible capabilities management a number of static caps instances is now
declared. Devices now are mainly differs by supported speeds, but in future
more parameters will be customized. A set of AQC100 devices have
fibre media, not twisted pair - this is also reflected in
new capabilities definitions.
HW level also now directly exports hw_ops for each of A0/B0 hardware.
PCI configuration now uses a device configuration table where each
device ID is explicitly mapped with hardware OPs and capabilities
structures.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New set of aquantia devices has an upgraded hardware (B1).
The hardware interface is identical to B0. The difference will
be in firmware which is incompatible with old one.
Reorganized and removed duplicate speed and devid definitions
Introduced explicit flow control configuration defines
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Or Gerlitz <ogerlitz@mellanox.com>
=======
First six patches of this series further enhances the mlx5 hairpin support.
The first two patches deal with using different hairpin instances
for flows whose packets have different priorities to align with the port
TX QoS model. The next four patches allow us to do HW spreading
of flows over a set of hairpin pairs using RSS. The last two patches
change the driver to also set the size of the HW hairpin queues.
========
Next four patches from Eran Ben Elisha <eranbe@mellanox.com>:
Add more debug data for TX timeout handling, and further enhance and optimize
TX timeout handling upon lost interrupts, which adds a mechanism for explicitly
polling EQ in case of a TX timeout in order to recover from a lost interrupt.
If this is not the case (no pending EQEs), perform a channels full recovery as
usual.
From Kamal Heib <kamalh@mellanox.com>, Two patches to extend the stats group API
to have an update_stats() callback which will be used to fetch the hardware or
software counters data, this will improve the current API and reduce code
duplication.
From Gal Pressman <galp@mellanox.com>, Last patch, Add likely to the common RX checksum
flow.
Thanks,
Saeed.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJaYmJ7AAoJEEg/ir3gV/o+SBIH/2VUS1RPNQfgv1pU2WI78Me3
2Iy8zA2fyx5Ko28Kzu/QljlIUs5/4K0rIjDoT5NpmkLf22lWB3QSqyCkqducdl6J
6kAwZ5kPfA8r1jnhlQfhy5VLhaoNhcHeMafXwP9jSy3BvCYRWTQOsp8fN6fBcSX5
s75DcmqE5ljm5b2Y9pIVkYdzz3usQ9/kq+5MPcZodw3XISuXgv1UPlkon4DdVsXQ
7zf3frQnctfTAepkflWMm8BGUO15a4uYfhrMUyvOFHihtIWA9ILEpUC7qdn3uQ6X
phNA6BRoo9bdo7ZSl8+ZsCi+K4LtyevQTHs7Hn/OMSMvxvDLWZswldu2mbg9xVQ=
=YP/D
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2018-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2018-01-19
From: Or Gerlitz <ogerlitz@mellanox.com>
=======
First six patches of this series further enhances the mlx5 hairpin support.
The first two patches deal with using different hairpin instances
for flows whose packets have different priorities to align with the port
TX QoS model. The next four patches allow us to do HW spreading
of flows over a set of hairpin pairs using RSS. The last two patches
change the driver to also set the size of the HW hairpin queues.
========
Next four patches from Eran Ben Elisha <eranbe@mellanox.com>:
Add more debug data for TX timeout handling, and further enhance and optimize
TX timeout handling upon lost interrupts, which adds a mechanism for explicitly
polling EQ in case of a TX timeout in order to recover from a lost interrupt.
If this is not the case (no pending EQEs), perform a channels full recovery as
usual.
From Kamal Heib <kamalh@mellanox.com>, Two patches to extend the stats group API
to have an update_stats() callback which will be used to fetch the hardware or
software counters data, this will improve the current API and reduce code
duplication.
From Gal Pressman <galp@mellanox.com>, Last patch, Add likely to the common RX checksum
flow.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously it was possible to interrupt processing stats updates because
they were handled in a work queue. Interrupting the stats updates could
lead to a situation where we backup the control message queue. This patch
moves the stats update processing out of the work queue to be processed as
soon as hardware sends a request.
Reported-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Helmut reported a bug about division by zero while
running traffic and doing physical cable pull test.
When the cable unplugged the ppms become zero, so when
dividing the current ppms by the previous ppms in the
next dim iteration there is division by zero.
This patch prevent this division for both ppms and epms.
Fixes: c3164d2fc4 ("net/mlx5e: Added BW check for DIM decision mechanism")
Reported-by: Helmut Grauer <helmut.grauer@de.ibm.com>
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The zeroday builder notices that since Usermode Linux does not
have IO memory, the build fails for them when selecting everything
it can enable.
As the driver is clearly using memory-mapped registers to access
the network adapter, we add depends on HAS_IOMEM to solve this
problem.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2018-01-19
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) bpf array map HW offload, from Jakub.
2) support for bpf_get_next_key() for LPM map, from Yonghong.
3) test_verifier now runs loaded programs, from Alexei.
4) xdp cpumap monitoring, from Jesper.
5) variety of tests, cleanups and small x64 JIT optimization, from Daniel.
6) user space can now retrieve HW JITed program, from Jiong.
Note there is a minor conflict between Russell's arm32 JIT fixes
and removal of bpf_jit_enable variable by Daniel which should
be resolved by keeping Russell's comment and removing that variable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The BPF verifier conflict was some minor contextual issue.
The TUN conflict was less trivial. Cong Wang fixed a memory leak of
tfile->tx_array in 'net'. This is an skb_array. But meanwhile in
net-next tun changed tfile->tx_arry into tfile->tx_ring which is a
ptr_ring.
Signed-off-by: David S. Miller <davem@davemloft.net>
During initialization the driver checks whether the flashed FW image
suits its requirements by checking that it's sufficiently new.
However, there's only a weak backward compatibility scheme that is
actually guaranteed by the FW, so driver must also upper bound the
version to prevent compatibility issues between current driver and some
possible future fw.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
BPF firmware currently exposes IRQ moderation capability.
The driver will make use of it by default, inserting 50 usec
delay to every control message exchange. This cuts the number
of messages per second we can exchange by almost half.
None of the other capabilities make much sense for BPF control
vNIC, either. Disable them all.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most vNIC capabilities are netdev related. It makes no sense
to initialize them and waste FW resources. Some are even
counter-productive, like IRQ moderation, which will slow
down exchange of control messages.
Add to nfp_app a mask of enabled control vNIC capabilities
for apps to use. Make flower and BPF enable all capabilities
for now. No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nfp_net_init() is a little long and we are about to add more
code to reading capabilties. Move the capability reading,
parsing and validating out. Only actual initialization
will stay in nfp_net_init().
No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow specifying alternative vNIC mailbox location in TLV caps.
This way we can size the mailbox to the needs and not necessarily
waste 512B of ctrl memory space.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PCIe island clock frequency is used when converting coalescing
parameters from usecs to NFP timestamps. Most chips don't run
at 1200MHz, allow FW to provide us with the real frequency.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NFP is entirely programmable, including the PCI data interface.
Using a fixed control BAR layout certainly makes implementations
easier, but require careful considerations when space is allocated.
Once BAR area is allocated to one feature nothing else can use it.
Allocating space statically also requires it to be sized upfront,
which leads to either unnecessary limitation or wastage.
We currently have a 32bit capability word defined which tells drivers
which application FW features are supported. Most of the bits
are exhausted. The same bits are also reused for enabling specific
features. Bulk of capabilities don't have a need for an enable bit,
however, leading to confusion and wastage.
TLVs seems like a better fit for expressing capabilities of applications
running on programmable hardware.
This patch leaves the front of the BAR as is, and declares a TLV
capability start at offset 0x58. Most of the space up to 0x0d90
is already allocated, but the used space can be wrapped with RESERVED
TLVs. E.g.:
Address Type Length
0x0058 RESERVED 0xe00 /* Wrap basic structures */
0x0e5c FEATURE_A 0x004
0x0e64 FEATURE_B 0x004
0x0e6c RESERVED 0x990 /* Wrap qeueue stats */
0x1800 FEATURE_C 0x100
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When driver app matching loaded FW is not found users are faced with:
nfp: failed to find app with ID 0x%02x
This message does not properly explain that matching driver code is
either not built into the driver or the driver is too old.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Representors are grouped in sets by type. Currently the whole
sets are under RCU protection, but individual representor pointers
are not. This causes some inconveniences when representors have
to be destroyed, because we have to allocate new sets to remove
any representors. Protect the individual pointers with RCU.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The write side of repr tables is always done under pf->lock.
Add a helper to dereference repr table pointers under protection
of that lock.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Devlink used to have two global locks: devlink lock and port lock,
our lock ordering looked like this:
devlink lock -> driver's pf->lock -> devlink port lock
After recent changes port lock was replaced with per-instance
lock. Unfortunately, new per-instance lock is taken on most
operations now. This means we can only grab the pf->lock from
the port split/unsplit ops. Lock ordering looks like this:
devlink lock -> driver's pf->lock -> devlink instance lock
Since we can't take pf->lock from most devlink ops, make sure
nfp_apps are prepared to service them as soon as devlink is
registered. Locking the pf must be pushed down after
nfp_app_init() callback.
The init order looks like this:
nfp_app_init
devlink_register
nfp_app_start
netdev/port_register
As soon as app_init is done nfp_apps must be ready to service
devlink-related callbacks. apps can only register their own
devlink objects from nfp_app_start.
Fixes: 2406e7e546 ("devlink: Add per devlink instance lock")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NFP app is currently shut down as soon as all the vNICs are gone.
This means we can't depend on the app existing throughout the
lifetime of the device. Free the app only from PCI remove path.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the helpers for accessing 4 or 8 byte values over
the CPP bus return the length of IO on success. If the IO
was short caller has to deal with error handling. The short
IO for 4/8B values is completely impractical. Make the
helpers return an error if full access was not possible.
Fix the few places which are actually dealing with errors
correctly, most call sites already only deal with negative
return codes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most of the packets return true for is_last_ethertype_ip, surround it
with likely compiler hint.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Extend the stats group API to have an update_stats() callback which
will be used to fetch the hardware or software counters data.
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Merge the per priority traffic and pfc groups into one group, because
both groups share the same update_stats() callback which will be
introduced in the upcoming patch.
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add per-channel counter ch#_eq_rearm to monitor how many lost interrupt
recovery actions happened upon TX timeouts.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Up until this patch, on every TX timeout we would try to do channels
recovery. However, in case of a lost interrupt for an EQ, the channel
associated to it cannot be recovered if reopened as it would never get
another interrupt on sent/received traffic, and eventually ends up with
another TX timeout (Restarting the EQ is not part of channel recovery).
This patch adds a mechanism for explicitly polling EQ in case of a TX
timeout in order to recover from a lost interrupt. If this is not the
case (no pending EQEs), perform a channels full recovery as usual.
Once a lost EQE is recovered, it triggers the NAPI to run and handle all
pending completions. This will free some budget in the bql (via calling
netdev_tx_completed_queue) or by clearing pending TXWQEs and waking up
the queue. One of the above actions will move the queue to be ready for
transmit again.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When TX timeout occurs, EQ consumer index and irqn can help in debug for
understanding the SW state of EQ. Add them to the logger prints for the
relevant EQ only.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When driver callback for TX timeout is being called, it handles all
stopped xmit queues (not only the ones which their timeout expired).
Add usecs since last transmit to TX timeout logs per send queue in order
to monitor if the queue timeout expired.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
For a given hairpin packet buffer size, different queue sizes
(values of log_hairpin_num_packets) determine how the data is broken
to strides on the RQ. Currently the chosen value is set to 64B strides.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Allow to specify the size of the hairpin queues along with the
packet buffer data size from the core setup code.
If the driver doesn't provide this, the FW applies proper value that
matches the provided data size and a FW chosen RQ stride size.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Support RSS for hairpin traffic. We create multiple hairpin RQ/SQ pairs
and RSS TTC table per hairpin instance and steer the related flows
through that table so they are spread between the pairs.
We open one pair per 50Gbs link speed, for all speeds <= 50Gbs, there
is one pair and no RSS while for 100Gbs ports two RSSed pairs.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Enhance the hairpin setup code at the core to support a set of N
(RQ,SQ) pairs. This will be later used by the caller to set RSS
spreading among the different RQs.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This will allow to be able and set TC rule whose steering dest is
RSS TTC steering table.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In order to use RSS for hairpin, we refactor the code that deals with
setup of the TTC steering tables. This is done using an interim ttc
params object that has the flow table attributes, TIR numbers, etc.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
As part of the QoS model, on xmit, the HW mandates that all packets going
through a given SQ have the same priority. To align hairpin SQs with that,
we use the priority given as part of the matching for the hairpin hash key.
This ensures that flows/packets mapped to different HW priorities will
go through different hairpin instances. If no priority is given for
matching, we treat that as an 8th priority, this is in order not to
harm cases where priority is specified.
Only the PCP priority trust model is supported.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The peer vhca id spans less bits vs the ifindex and can
well serve for the hairpin hash key, move to use that.
This is a pre-step to put more info into the hairpin hash
key in downstream patch while keeping it at 32 bits.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
on T6, IPv6 filter would occupy 2 tids instead of 4.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use zlib deflate to compress firmware dump. Collect and compress
as much firmware dump as possible into a 32 MB buffer.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update firmware dump collection logic to use compression when available.
Let collection logic attempt to do compression, instead of returning out
of memory early.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warning:
drivers/net/ethernet/mellanox/mlxsw/spectrum_kvdl.c:289:5: warning:
symbol 'mlxsw_sp_kvdl_part_occ' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The variable miistat is not used. So it is removed.
CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Joe Jin <joe.jin@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packet descriptor generation for IPv6 is broken.
Properly set L3 and L4 protocol flags for IPv6 descriptors.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set some missing fields in the IP control offload buffer. This buffer is
used to enable checksum and TCP segmentation offload in the VNIC server.
The buffer length field and the checksum offloading bits were not set
properly, so fix that here.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a new LPM tree is created, we try to replace the trees in the
existing virtual routers with it. If we fail, the tree needs to be
freed.
Currently, this does not happen in the unlikely case where we fail to
bind the tree to the first virtual router, since its reference count
never transitions from 1 to 0.
Fix that by taking a reference before binding the tree.
Fixes: fc922bb0dd ("mlxsw: spectrum_router: Use one LPM tree for all virtual routers")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Scheduling out and in for every FW message can slow us down
unnecessarily. Our experiments show that even under heavy load
the FW responds to 99.9% messages within 200 us. Add a short
busy wait before entering the wait queue.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The special handling of different map types is left to the driver.
Allow offload of array maps by simply adding it to accepted types.
For nfp we have to make sure array elements are not deleted.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A cleanup of the PM code left an incorrect #ifdef in place, leading
to a harmless build warning:
drivers/net/ethernet/intel/fm10k/fm10k_pci.c:2502:12: error: 'fm10k_suspend' defined but not used [-Werror=unused-function]
drivers/net/ethernet/intel/fm10k/fm10k_pci.c:2475:12: error: 'fm10k_resume' defined but not used [-Werror=unused-function]
It's easier to use __maybe_unused attributes here, since you
can't pick the wrong one.
Fixes: 8249c47c6b ("fm10k: use generic PM hooks instead of legacy PCIe power hooks")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds a new page to mlx5 core containing clock info data that allows
user level applications to translate between cqe timestamp to
nanoseconds. The information stored into this page is represented
through mlx5_ib_clock_info.
In order to synchronize between kernel and user space a sequence
number is incremented at the beginning and end of each update.
An odd number means the data is being updated while an even means
the access was already done. To guarantee that the data structure
was accessed atomically user will:
repeat:
seq1 = <read sequence>
goto <repeate> while odd
<read data structure>
seq2 = <read sequence>
if seq1 != seq2 goto repeat
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Eitan Rabin <rabin@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch set those new jit info fields introduced in this patch set.
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
We need to pass a struct device to the DMA API, even if some architectures
still support that for legacy reasons, and should not mix it with the old
PCI DMA API.
Note that the driver also seems to never actually unmap its DMA mappings,
but to fix that we'll need someone more familar with the driver.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In the receive queue for 4096 bytes fragments, the page address
set in the SW data0 field of the descriptor is not the one we got
when doing the reassembly in receive. The page structure was retrieved
from the wrong descriptor into SW data0 which is then causing a
page fault when UDP checksum is accessing data above 1500.
Signed-off-by: Rex Chang <rchang@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
restructure the code which adds support for configuring
PCIe VF via mgmt netdevice. which was added by
commit 7829451c69 ("cxgb4: Add control net_device for
configuring PCIe VF")
Original work by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to convert from mlxsw_sp_port to net_device and back again.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benefit from the prepared TC and in-driver ACL infrastructure and
introduce block sharing offload. For that, a new struct "block" is
introduced in spectrum_acl in order to hold a list of specific
block-port bindings.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead, pass netdev and ingress flag to ruleset unbind op.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to prepare for follow-up changes, make the bind/unbind helpers
very simple. That required move of ht insertion/removal and bind/unbind
calls into mlxsw_sp_acl_ruleset_create/destroy.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the driver exports different switchdev PARENT_IDs for
representors belonging to different SR-IOV PF-pools of an adapter.
This is not correct as the adapter can switch across all vports
of an adapter. This patch fixes this by exporting a common switchdev
PARENT_ID for all reps of an adapter. The PCIE DSN is used as the id.
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The chip supports 64-byte and 128-byte cache line size for more optimal
DMA performance when matched to the CPU cache line size. The default is 64.
If the system is using 128-byte cache line size, set it to 128.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Forward hwrm_func_vf_cfg command from VF to PF driver, to store
VF MAC address in PF's context. This will allow "ip link show"
to display all VF MAC addresses.
Maintain 2 locations of MAC address in VF info structure, one for
a PF assigned MAC and one for VF assigned MAC.
Display VF assigned MAC in "ip link show", only if PF assigned MAC is
not valid.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_check_rings() is called by ethtool, XDP setup, and ndo_setup_tc()
to see if there are enough resources to support the new configuration.
Expand the call to test all resources if the firmware supports the new
API. With the more flexible resource allocation scheme, this call must
be made to check that all resources are available before committing to
allocate the resources.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of the old method of evenly dividing the resources to the VFs,
use the new firmware API to specify min and max resources for each VF.
This way, there is more flexibility for each VF to allocate more or less
resources.
The min is the absolute minimum for each VF to function. The max is the
global resources minus the resources used by the PF. Each VF is
guaranteed the min. Up to max resources may be available for some VFs.
The PF driver can use one of 2 strategies specified in NVRAM to assign
the resources. The old legacy strategy of evenly dividing the resources
or the new flexible strategy.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In bnxt_rfs_capable(), add call to reserve vnic resources to support
NTUPLE. Return true if we can successfully reserve enough vnics.
Otherwise, reserve the minimum 1 VNIC for normal operations not
supporting NTUPLE and return false.
Also, suppress warning message about not enough resources for NTUPLE when
only 1 RX ring is in use. NTUPLE filters by definition require multiple
RX rings.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new method will call firmware to reserve the desired tx, rx, cmpl
rings, ring groups, stats context, and vnic resources. A second query
call will check the actual resources that firmware is able to reserve.
The driver will then trim and adjust based on the actual resources
provided by firmware. The driver will then reserve the final resources
in use.
This method is a more flexible way of using hardware resources. The
resources are not fixed and can by adjusted by firmware. The driver
adapts to the available resources that the firmware can reserve for
the driver.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In combined mode, the driver is currently not setting RX and TX ring
numbers the same when firmware can allocate more RX than TX or vice versa.
This will confuse the user as the ethtool convention assumes they are the
same in combined mode. Fix it by adding bnxt_trim_dflt_sh_rings() to trim
RX and TX ring numbers to be the same as the completion ring number in
combined mode.
Note that if TCs are enabled and/or XDP is enabled, the number of TX rings
will not be the same as RX rings in combined mode.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new API HWRM_FUNC_RESOURCE_QCAPS provides min and max hardware
resources. Use the new API when it is supported by firmware.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for new firmware APIs to allocate hardware resources,
add a new struct bnxt_hw_resc to hold various min, max and reserved
resources. This new structure is common for PFs and VFs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After SRIOV has been enabled and disabled, the MSIX vectors assigned to
the VFs have to be re-initialized. Otherwise they cannot be re-used by
the PF. For example, increasing the number of PF rings after disabling
SRIOV may fail if the PF uses MSIX vectors previously assigned to the VFs.
To fix this, we add logic in bnxt_restore_pf_fw_resources() to close the
NIC, clear and re-init MSIX, and re-open the NIC.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new __bnxt_close_nic() function to do all the work previously done
in bnxt_close_nic() except waiting for SRIOV configuration. The new
function will be used in the next patch as part of SRIOV cleanup.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The version has new firmware APIs to allocate PF/VF resources more
flexibly.
New toolchains were used to generate this file, resulting in a one-time
large diffstat.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On Meson8b the only valid input clock is MPLL2. The bootloader
configures that to run at 500002394Hz which cannot be divided evenly
down to 125MHz using the m250_div clock. Currently the common clock
framework chooses a m250_div of 2 - with the internal fixed
"divide by 10" this results in a RGMII TX clock of 125001197Hz (120Hz
above the requested 125MHz).
Letting the common clock framework propagate the rate changes up to the
parent of m250_mux allows us to get the best possible clock rate. With
this patch the common clock framework calculates a rate of
very-close-to-250MHz (249999701Hz to be exact) for the MPLL2 clock
(which is the mux input). Dividing that by 2 (which is an internal,
fixed divider for the RGMII TX clock) gives us an RGMII TX clock of
124999850Hz (which is only 150Hz off the requested 125MHz, compared to
1197Hz based on the MPLL2 rate set by u-boot and the Amlogic GPL kernel
sources).
SoCs from the Meson GX series are not affected by this change because
the input clock is FCLK_DIV2 whose rate cannot be changed (which is fine
since it's running at 1GHz, so it's already a multiple of 250MHz and
125MHz).
Fixes: 566e825162 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
Suggested-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Jerome Brunet <jbrunet@baylibre.com>
Tested-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Meson8b only supports MPLL2 as clock input. The rate of the MPLL2 clock
set by Odroid-C1's u-boot is close to (but not exactly) 500MHz. The
exact rate is 500002394Hz, which is calculated in
drivers/clk/meson/clk-mpll.c using the following formula:
DIV_ROUND_UP_ULL((u64)parent_rate * SDM_DEN, (SDM_DEN * n2) + sdm)
Odroid-C1's u-boot configures MPLL2 with the following values:
- SDM_DEN = 16384
- SDM = 1638
- N2 = 5
The 250MHz clock (m250_div) inside dwmac-meson8b driver is derived from
the MPLL2 clock. Due to MPLL2 running slightly faster than 500MHz the
common clock framework chooses a divider which is too big to generate
the 250MHz clock (a divider of 2 would be needed, but this is rounded up
to a divider of 3). This breaks the RTL8211F RGMII PHY on Odroid-C1
because it requires a (close to) 125MHz RGMII TX clock (on Gbit speeds,
the IP block internally divides that down to 25MHz on 100Mbit/s
connections and 2.5MHz on 10Mbit/s connections - we don't need any
special configuration for that).
Round the divider to the closest value to prevent this issue on Meson8b.
This means we'll now end up with a clock rate for the RGMII TX clock of
125001197Hz (= 125MHz plus 1197Hz), which is close-enough to 125MHz.
This has no effect on the Meson GX SoCs since there fclk_div2 is used as
input clock, which has a rate of 1000MHz (and thus is divisible cleanly
to 250MHz and 125MHz).
Fixes: 566e825162 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
Reported-by: Emiliano Ingrassia <ingrassia@epigenesys.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Jerome Brunet <jbrunet@baylibre.com>
Tested-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tests (using an oscilloscope and an Odroid-C1 board with a RTL8211F
RGMII PHY) have shown that the PRG_ETH0 register behaves as follows:
- bit 4 is a mux to choose between two parent clocks. according to the
public S805 datasheet the only supported parent clock is MPLL2 (this
was not verified using the oscilloscope).
The public S805/S905 datasheet claims that this bit is reserved.
- bits 9:7 control a one-based divider (register value 1 means "divide
by 1", etc.) for the input clock. we call this clock the "m250_div"
clock because it's value is always supposed to be (close to) 250MHz
(see below for an explanation).
The description in the public S805/S905 datasheet is a bit cryptic,
but it comes down to "input clock = 250MHz * value" (which could also
be expressed as "250MHz = input clock / value")
- there seems to be an internal fixed divide-by-2 clock which takes the
output from the m250_div and divides it by 2. This is not unusual on
Amlogic SoCs, since the SDIO (MMC) driver also uses an internal fixed
divide-by-2 clock.
This is not documented in the public S805/S905 datasheet
- bit 10 controls a gate clock which enables or disables the RGMII TX
clock (which is an output on the MAC/SoC and an input in the PHY). we
call this the "rgmii_tx_en" clock. if this bit is set to "0" the RGMII
TX clock output is close to 0
The description for this bit in the public S805/S905 datasheet is
"Generate 25MHz clock for PHY". Based on these tests it's believed
that this is wrong, and should probably read "Generate the 125MHz
RGMII TX clock for the PHY"
- the RGMII TX clock has to be set to 125MHz - the IP block adjusts the
output (automatically) depending on the line speed (RGMII specifies
that Gbit connections use a 125MHz clock, 100Mbit/s connections use a
25MHz clock and 10Mbit/s connections use a 2.5MHz clock. only Gbit and
100Mbit/s were tested with an oscilloscope). Due to the requirement
that this clock always has to be set to 125MHz and due to the fixed
divide-by-2 parent clock this means that m250_div will always end up
with a rate of (close to) 250MHz.
- bits 6:5 are the TX delay, which is also named "clock phase" in some
of Amlogic's older GPL kernel sources.
The PHY also has an XTAL_IN pin where a 25MHz clock has to be provided.
Tests with the oscilloscope have shown that this is routed to a crystal
right next to the RTL8211F PHY. The same seems to be true on the Khadas
VIM2 (which uses a GXM SoC) board - however the 25MHz crystal is on the
other side of the PCB there.
This updates the clocks in the dwmac-meson8b driver by replacing the
"m25_div" with the "rgmii_tx_en" clock and additionally introducing a
fixed divide-by-2 clock between "m250_div" and "rgmii_tx_en".
Now we also need to set a frequency of 125MHz on the RGMII clock
(opposed to the 25MHz we set before, with that non-existing
divide-by-5-or-10 divider).
Special thanks go to Linus Lüssing for testing the various bits and
checking the results with an oscilloscope on his Odroid-C1!
Fixes: 566e825162 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
Reported-by: Emiliano Ingrassia <ingrassia@epigenesys.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Jerome Brunet <jbrunet@baylibre.com>
Tested-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neither the m25_div_clk nor the m250_div_clk or m250_mux_clk are used in
RMII mode. The m25_div_clk output is routed to the RGMII PHY's "RGMII
clock".
This means that we don't need to configure the clocks in RMII mode. The
driver however did this - with no effect since the clocks are not routed
to the PHY in RMII mode.
While here also rename meson8b_init_clk to meson8b_init_rgmii_tx_clk to
make it easier to understand the code.
Fixes: 566e825162 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Tested-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 0dfb33a0d7 ("sch_red: report backlog information") copied
child's backlog into RED's backlog. Back then RED did not maintain
its own backlog counts. This has changed after commit 2ccccf5fb4
("net_sched: update hierarchical backlog too") and commit d7f4f332f0
("sch_red: update backlog as well"). Copying is no longer necessary.
Tested:
$ tc -s qdisc show dev veth0
qdisc red 1: root refcnt 2 limit 400000b min 30000b max 30000b ecn
Sent 20942 bytes 221 pkt (dropped 0, overlimits 0 requeues 0)
backlog 1260b 14p requeues 14
marked 0 early 0 pdrop 0 other 0
qdisc tbf 2: parent 1: rate 1Kbit burst 15000b lat 3585.0s
Sent 20942 bytes 221 pkt (dropped 0, overlimits 138 requeues 0)
backlog 1260b 14p requeues 14
Recently RED offload was added. We need to make sure drivers don't
depend on resetting the stats. This means backlog should be treated
like any other statistic:
total_stat = new_hw_stat - prev_hw_stat;
Adjust mlxsw.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Nogah Frankel <nogahf@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as
where a PCI device is present. This restricts the device drivers to be
reused for other domain numbers.
Getting ready to remove pci_get_bus_and_slot() function in favor of
pci_get_domain_bus_and_slot().
Use the domain information from pdev while calling into
pci_get_domain_bus_and_slot() function.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as
where a PCI device is present. This restricts the device drivers to be
reused for other domain numbers.
Getting ready to remove pci_get_bus_and_slot() function in favor of
pci_get_domain_bus_and_slot().
Introduce bnx2x_vf_domain() function to extract the domain information
and save it to VF specific data structure.
Use the saved domain value while calling pci_get_domain_bus_and_slot().
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-01-17
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add initial BPF map offloading for nfp driver. Currently only
programs were supported so far w/o being able to access maps.
Offloaded programs are right now only allowed to perform map
lookups, and control path is responsible for populating the
maps. BPF core infrastructure along with nfp implementation is
provided, from Jakub.
2) Various follow-ups to Josef's BPF error injections. More
specifically that includes: properly check whether the error
injectable event is on function entry or not, remove the percpu
bpf_kprobe_override and rather compare instruction pointer
with original one, separate error-injection from kprobes since
it's not limited to it, add injectable error types in order to
specify what is the expected type of failure, and last but not
least also support the kernel's fault injection framework, all
from Masami.
3) Various misc improvements and cleanups to the libbpf Makefile.
That is, fix permissions when installing BPF header files, remove
unused variables and functions, and also install the libbpf.h
header, from Jesper.
4) When offloading to nfp JIT and the BPF insn is unsupported in the
JIT, then reject right at verification time. Also fix libbpf with
regards to ELF section name matching by properly treating the
program type as prefix. Both from Quentin.
5) Add -DPACKAGE to bpftool when including bfd.h for the disassembler.
This is needed, for example, when building libfd from source as
bpftool doesn't supply a config.h for bfd.h. Fix from Jiong.
6) xdp_convert_ctx_access() is simplified since it doesn't need to
set target size during verification, from Jesper.
7) Let bpftool properly recognize BPF_PROG_TYPE_CGROUP_DEVICE
program types, from Roman.
8) Various functions in BPF cpumap were not declared static, from Wei.
9) Fix a double semicolon in BPF samples, from Luis.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If an eBPF instruction is unknown to the driver JIT compiler, we can
reject the program at verification time.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Use the verifier log to output error messages if map lookup
can't be offloaded.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Currently in the cases where cmp_type == CMP_TYPE_RX_L2_TPA_START_CMP or
CMP_TYPE_RX_L2_TPA_END_CMP the exit path updates cpr->rx_bytes with an
uninitialized length len. Fix this by adding a new exit path that does
not update the cpr stats with the bogus length len and remove the unused
label next_rx_no_prod.
Detected by CoverityScan, CID#1463807 ("Uninitialized scalar variable")
Fixes: 6a8788f256 ("bnxt_en: add support for software dynamic interrupt moderation")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to check if p_ent->comp_mode is QED_SPQ_MODE_EBLOCK before
calling qed_spq_add_entry(). The test is fine is the mode is EBLOCK,
but if it isn't then qed_spq_add_entry() might kfree(p_ent).
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>