ethtool expects ETHTOOL_GRXCLSRLALL to set ethtool_rxnfc->data with the
total number of entries in the rx classifier table. Surprisingly, mlx4
is missing this part (in principle ethtool could still move forward and
try the insert).
Tested: compiled and run command:
phh13:~# ethtool -N eth1 flow-type udp4 queue 4
Added rule with ID 255
Signed-off-by: Luigi Rizzo <lrizzo@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Load Realtek-provided firmware for RTL8168fp/RTL8117. Unlike the
firmware for other chip versions which is for the PHY, firmware for
RTL8168fp/RTL8117 is for the MAC.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using constant MII_EXPANSION is misleading here because register 0x06
has a different meaning on page 0x0005. Here a proprietary PHY
parameter is read by writing the parameter id to register 0x05 on page
0x0005, followed by reading the parameter value from register 0x06.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch cleans-up the stray left over code. It has no
functionality impact.
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit 0c65b2b90d ("net: of_get_phy_mode: Change API to solve
int/unit warnings") updated the function of_get_phy_mode declaration.
Now it returns an error code and in case the node doesn't contain the
property 'phy-mode' or 'phy-connection-type' it returns -EINVAL and would
set the phy_interface_t to PHY_INTERFACE_MODE_NA.
Ocelot VSC7514 has 4 internal phys which have the phy interface
PHY_INTERFACE_MODE_NA. So because of_get_phy_mode would assign
PHY_INTERFACE_MODE_NA to phy_mode when there is an error, there is no need
to add the error check.
Updates for v2:
- drop error check because of_get_phy_mode already assigns phy_interface
to PHY_INTERFACE_MODE in case of error.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver forgets to free allocated netdev in remove like
what is done in probe failure.
Add the free to fix it.
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
All .rw_reset callbacks except bnx2x_84833_hw_reset_phy() use a
void return type. No callers of .hw_reset check a return value and
bnx2x_84833_hw_reset_phy() unconditionally returns 0. Remove all
hw_reset_t casts and fix the return type to void.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The return values for format_fw_ver_t callbacks are supposed to be
"int", not "u8". Ultimately, the top-level caller doesn't actually check
the return value at all, but just clean this all up anyway and fix the
prototypes so that casts are no longer needed.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
No callers of .config_init check return values. Remove the casting and
change all callbacks to have the correct function prototype.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function casts for .read_status callbacks end up casting some int
return values to u8. This seems to be bug-prone (-EINVAL being returned
into something that appears to be true/false), but fixing the function
prototypes doesn't change the existing behavior. Fix the return values
to remove the casts.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
NULL is already "void *" so it will auto-cast in assignments and
initializers. Additionally, all the callbacks for .link_reset,
.config_loopback, .set_link_led, and .phy_specific_func are already
correct. No casting is needed for these, so remove them.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ENETC has a register PSPEED to indicate the link speed of hardware.
It is need to update accordingly. PSPEED field needs to be updated
with the port speed for QBV scheduling purposes. Or else there is
chance for gate slot not free by frame taking the MAC if PSPEED and
phy speed not match. So update PSPEED when link adjust. This is
implement by the adjust_link.
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ENETC supports in hardware for time-based egress shaping according
to IEEE 802.1Qbv. This patch implement the Qbv enablement by the
hardware offload method qdisc tc-taprio method.
Also update cbdr writeback to up level since control bd ring may
writeback data to control bd ring.
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The page pool keeps track of the number of pages in flight, and
it isn't safe to remove the pool until all pages are returned.
Disallow removing the pool until all pages are back, so the pool
is always available for page producers.
Make the page pool responsible for its own delayed destruction
instead of relying on XDP, so the page pool can be used without
the xdp memory model.
When all pages are returned, free the pool and notify xdp if the
pool is registered with the xdp memory system. Have the callback
perform a table walk since some drivers (cpsw) may share the pool
among multiple xdp_rxq_info.
Note that the increment of pages_state_release_cnt may result in
inflight == 0, resulting in the pool being released.
Fixes: d956a048cd ("xdp: force mem allocator removal and periodic warning")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch corrects the SPDX License Identifier style in
header files related to STMicroelectronics based Multi-Gigabit
Ethernet driver. For C header files Documentation/process/license-rules.rst
mandates C-like comments (opposed to C source files where
C++ style should be used).
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch corrects the SPDX License Identifier style in
header files related to Marvell OcteonTX2 network devices.
It uses an expilict block comment for the SPDX License
Identifier.
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver enables rising edge or falling edge, but not both, and so
this patch validates that the request contains only one of the two
edges.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This hardware always time stamps rising and falling edges, and so this
patch validates that the request does contains both edges.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
User space may request time stamps on rising edges, falling edges, or
both. However, the particular mode may or may not be supported in the
hardware or in the driver. This patch adds a "strict" flag that tells
drivers to ensure that the requested mode will be honored.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the renesas PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.
In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the mlx5 core PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.
In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.
[ RC: I'm not 100% sure what this driver does, but if I'm not wrong it
follows the dp83640:
flags Meaning
---------------------------------------------------- --------------------------
PTP_ENABLE_FEATURE Time stamp rising edge
PTP_ENABLE_FEATURE|PTP_RISING_EDGE Time stamp rising edge
PTP_ENABLE_FEATURE|PTP_FALLING_EDGE Time stamp falling edge
PTP_ENABLE_FEATURE|PTP_RISING_EDGE|PTP_FALLING_EDGE Time stamp falling edge
]
Cc: Feras Daoud <ferasda@mellanox.com>
Cc: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the igb PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.
In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.
This HW always time stamps both edges:
flags Meaning
---------------------------------------------------- --------------------------
PTP_ENABLE_FEATURE Time stamp both edges
PTP_ENABLE_FEATURE|PTP_RISING_EDGE Time stamp both edges
PTP_ENABLE_FEATURE|PTP_FALLING_EDGE Time stamp both edges
PTP_ENABLE_FEATURE|PTP_RISING_EDGE|PTP_FALLING_EDGE Time stamp both edges
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 823eb2a3c4 ("PTP: add support for one-shot output") introduced
a new flag for the PTP periodic output request ioctl. This flag is not
currently supported by any driver.
Fix all drivers which implement the periodic output request ioctl to
explicitly reject any request with flags they do not understand. This
ensures that the driver does not accidentally misinterpret the
PTP_PEROUT_ONE_SHOT flag, or any new flag introduced in the future.
This is important for forward compatibility: if a new flag is
introduced, the driver should reject requests to enable the flag until
the driver has actually been modified to support the flag in question.
Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Christopher Hall <christopher.s.hall@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver calls release_resource in remove to match request_mem_region
in probe, which is incorrect.
Fix it by using the right one, release_mem_region.
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Felix DSA driver needs to write to SYS_RAM_INIT_RAM_INIT for its own
chip initialization process.
Also update the MAINTAINERS file such that the headers exported by the
ocelot driver are under the same maintainers' umbrella as the driver
itself.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We will be registering another switch driver based on ocelot, which
lives under drivers/net/dsa.
Make sure the Felix DSA front-end has the necessary abstractions to
implement a new Ocelot driver instantiation. This includes the function
prototypes for implementing DSA callbacks.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Felix switch has a different reset procedure, so a function pointer
needs to be created and added to the ocelot_ops structure.
The reset procedure has been moved into ocelot_init.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using the NPI port, the DSA tag is passed through Ethernet, so the
switch's MAC needs to accept it as it comes from the DSA master. Increase
the MTU on the external CPU port to account for the length of the
injection header.
Without this patch, MTU-sized frames are dropped by the switch's CPU
port on xmit, which is especially obvious in TCP sessions.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This constant will be used in a future patch to increase the MTU on NPI
ports, and will also be used in the tagger driver for Felix.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since in an NPI/DSA setup, not all ports will have the same MTU, we need
to make sure the watermarks for pause frames and/or tail dropping logic
that existed in the driver is still coherent for the new MTU values.
We need to do this because the NPI (aka external CPU) port needs an
increased MTU for the DSA tag. This will be done in a future patch.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It doesn't make sense to rewrite all these registers every time the PHY
library notifies us about a link state change.
In a future patch we will customize the MTU for the CPU port, and since
the MTU was previously configured from adjust_link, if we don't make
this change, its value would have got overridden.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The adjust_link routine should be generic enough to be (re)used by
any SoC that integrates a switch core compatible with the Ocelot
core switch driver. Currently all configurations are generic except
for the PCS settings that are SoC specific. Move these out to the
Ocelot SoC/board instance.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let's make this ioremap and regmap init code common. It should not
be platform dependent as it should be usable by PCI devices too.
Use better names where necessary to avoid clashes.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that TX Coalesce has been rewritten we no longer need this
additional interrupt enabled. This reduces CPU usage.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Coalesce logic currently increments the number of packets and sets the
IC bit when the coalesced packets have passed a given limit. This does
not reflect very well what coalesce was meant for as we can have a large
number of packets that are coalesced and then a single one, sent later
on that has the IC bit.
Rework the logic so that it coalesces only upon a limit of packets and
sets the IC bit for large number of packets.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tune-up the defalt coalesce settings for optimal values. This gives the
best performance in most of the use-cases.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFA and RFD should not be dependent on FIFO size. In fact, the more FIFO
space we have, the later we can activate Flow Control. Let's use
hard-coded values for RFA and RFD for all FIFO sizes with the exception
of 4k, which is a special case.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFA and RFD should not be dependent on FIFO size. In fact, the more FIFO
space we have, the later we can activate Flow Control. Let's use
hard-coded values for RFA and RFD for all FIFO sizes with the exception
of 4k, which is a special case.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For performance reasons, sometimes using the minimum RX Coalesce value
is not optimal. Lets setup a default value that is optimal in most of
the use cases.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We may only want to use the RX Watchdog so lets check if RX Coalesce
settings are non-zero and only set the RX Interrupt on Completion bit if
its not.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 0c3cbbf96d ("mlxsw: Add specific trap for packets routed via
invalid nexthops") allocated an adjacency entry during driver
initialization whose purpose is to discard packets hitting the route
pointing to it.
These adjacency entries are allocated from a resource called KVD linear
(KVDL). There are situations in which the user can decide to set the
size of this resource (via devlink-resource) to 0, in which case the
driver will not be able to load.
Therefore, instead of pre-allocating this adjacency entry, simply
allocate it only when needed. A variable indicating the validity of the
entry is added and is used to ensure it is only allocated and written
once and that it is freed after all the routes were flushed.
Fixes: 0c3cbbf96d ("mlxsw: Add specific trap for packets routed via invalid nexthops")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Traffic for a CGX mapped NIXLF can be stopped by disabling entries
in NPC MCAM or by configuring CGX and mailbox messages exist for the
two options. If traffic is stopped at CGX then VFs of that PF are
also effected hence CGX traffic should be started/stopped by
tracking all the users of it. This patch implements that CGX users
tracking. CGX is also configured along with NPC if required.
Also removed a check which mandates even number of LBK VFs.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A config option is added to disable caching of dynamic entries
like SQEs and stack pages. Also locks down all HW contexts in NDC,
preventing them from being evicted.
This option is useful when the queue count is large and there are
huge NDC cache misses. It's trade off between SQ context misses and
dynamically changing entries like SQE and stack page pointers.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each of the NIX/NPA LFs can choose which ways of their respective
NDC caches should be used to cache their contexts. This enables
flexible configurations like disabling caching for a LF, limiting
it's context to a certain set of ways etc etc. Separate way_mask
for NIX-TX and NIX-RX is not supported.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ingress packet replication support has been added to 96xx B0
silicon. This patch enables using that feature to replicate
ingress broadcast packets to PF and it's VFs.
Also fixed below issues
- VFs can also install NPC MCAM entry to forward broadcast pkts.
Otherwise, unless PF's interface is UP, VFs will not receive
bcast packets.
- NPC MCAM entry is disabled when PF and all it's VFs are down.
- Few corner cases in installing multicast entry list.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CN96xx initial silicon doesn't support all features pertaining to
NIX transmit scheduling and shaping.
- It supports a fixed topology of 1:1 mapped transmit
limiters at all levels.
- Supports DWRR only at SMQ/MDQ and TL1.
- Doesn't support shaping and coloring.
This patch adds HW capability structure by which each variant
and skew of silicon can be differentiated by their supported
features. And adds support for A0 silicon's transmit scheduler
capabilities or rather limitations.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for few more RSS key types for flow key
algorithm to compute rss hash index.
Following flow key types have been added.
- Tunnel types like NVGRE, VXLAN, GENEVE.
- L2 offload type ETH_DMAC, Here we will consider only DMAC 6 bytes.
- And extension header IPV6_EXT (1 byte followed by IPV6 header
- Hashing inner protocol fields for inner DMAC, IPv4/v6, TCP, UDP, SCTP.
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Writing into NPC MCAM1 and MCAM0 registers are suppressed if
they happened to form a reserved combination. Hence
clear and disable MCAM entries before update.
For HRM:
[CAM(1)]<n>=1, [CAM(0)]<n>=1: Reserved.
The reserved combination is not allowed. Hardware suppresses any
write to CAM(0) or CAM(1) that would result in the reserved combination for
any CAM bit.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Updated NPC KPU packet parsing profile with support for following
- Fragmentation support for IPv4 IPv6 outer header
- NIX instruction header support
- QinQ with TPID of 0x8100 as non inner most vlan tag, as legacy
network equipments still generate QinQ packets with this configuration.
- To better support RSS for tunnelled packets, udp based tunnel
protocols such as vxlan, vxlan-gpe, geneve and gtpu are now
captured into a separate layer E. Consequently, the inner
packet headers are pushed one layer down to LF, LG, and LH
accordingly.
- Support for rfc7510 mpls in udp. Up to 4 MPLS labels can be parsed
and captured in one layer LE.
- Parser support for DSA, extended DSA and eDSA tags right after
ethernet header by Marvell SOHO and Falcon switches. For extended
DSA and eDSA tags, a special PKIND of 62 is used, as these tags don't
contain a tpid field.
- Higig2 protocol header parsing support, added a NPC_LT_LA_HIGIG2_ETHER
for a combined header of HIGIG2 and Ethernet. Add a
NPC_LT_LA_IH_NIX_HIGIG2_ETHER for a combined header of nix_ih,
HIGIG2 and Ethernet on egress side. Also added 2 upper flags in LA to
indicate the presence of nix_ih and HIGIG2.
Other changes include
- IPv4.TTL==0 IPv6.HLIM==0 check
- Per RFC 1858, mark fragment offset == 1 as error
- TCP invalid flags check
- Separate error codes for outer and inner IPv4 checksum errors.
- Fix a parser error when KPU parses incoming IPSec ESP and AH packets
- NPC vtag capture/strip hardware expect tag pointer to point to
tpid/ethertype instead of tci. So move lb_ptr to point to tpid/ethertype.
- Fix npc parser error when parsing udp packets that don't have any payload.
- For a single MCAM entry to match on packets with one or stacked vlan tags
combine NPC_LT_LB_STAG and NPC_LT_LB_QINQ to NPC_LT_LB_STAG_QINQ.
- NVGRE to have a separate ltype LD_NVGRE instead of combined with LD_GRE.
- Reserve top LD/LTYPEs to support custom KPU profile fields.
Signed-off-by: Hao Zheng <haoz@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For every mailbox handler added to rvu, we are adding a function
declaration in rvu header file. Cleaned this up by adding a macro
to generate these declarations automatically.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If mailbox client has a bounce buffer or a intermediate buffer where
mbox messages are framed then copy them from there to HW buffer.
If 'mbase' and 'hw_mbase' are not same then assume 'mbase' points to
bounce buffer.
This patch also adds msg_size field to mbox header to copy only valid
data instead of whole buffer.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added a new mailbox API which goes through all responses
to check their IDs and response codes.
Also added logic to prevent queuing multiple works to
process the same mailbox message. This scenario happens
when AF is processing a PF's request and menawhile PF
sends ACK to AF sent UP message, then mbox_hdr->num_msgs
in the PF->AF DOWN mbox region will be nonzero and AF
will end up processing PF's request again. This is fixed
by taking a backup of num_msgs counter and clearing the
same in the mbox region before scheduling work.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added support to display current NPC MCAM entries and counter's allocation
status ín debugfs.
cat /sys/kernel/debug/octeontx2/npc/mcam_info' will dump following info
- MCAM Rx and Tx keysize
- Total MCAM entries and counters
- Current available count
- Count of number of MCAM entries and counters allocated
by a RVU PF/VF device.
Also, one NPC MCAM counter (last one) is reserved and mapped to
NPC RX_INTF's MISS_ACTION to count dropped packets due to no MCAM
entry match. This pkt drop counter can be checked via debugfs.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A CGX port is shared by a RVU PF and it's VFs. These per
CGX port level NIX Rx/Tx counters are cumilative stats of
all NIXLFs sharing this port. These stats when compared
to CGX Rx/Tx stats helps in identifying pkts dropped within
the system, if any.
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds CGX LMAC physical interface or serdes Rx/Tx
packet stats to debugfs.
'cat cgx<idx>/lmac<idx>/stats' dumps the current interface link
status and Rx/Tx stats. Stats include pkt received/transmitted,
dropped, pause frames etc etc.
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NDC is a data cache unit which caches NPA and NIX block's
aura/pool/RQ/SQ/CQ/etc contexts to reduce number of costly
DRAM accesses.
This patch adds support to dump cache's performance stats
like cache line hit/miss counters, average cycles taken for
accessing cached and non-cached data. This will help in
checking if NPA/NIX context reads/writes are having NDC cache
misses which inturn might effect performance.
Also changed NDC enums to reflect correct NDC hardware instance.
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To aid in debugging NIX block related issues, added support to dump
NIX block LF's RQ, SQ and CQ hardware contexts in debugfs. User can
check which contexts are enabled currently and dump it's current HW
context.
Four new files 'qsize', 'rq_ctx', 'sq_ctx' and 'cq_ctx' are added to the
debugfs at 'sys/kernel/debug/octeontx2/nix/'
'echo <nixlf index> > qsize' will display current enabled CQ/SQ/RQs.
'echo <nixlf> [rq number/all] > rq_ctx',
'echo <nixlf> [sq number/all] > sq_ctx' &
'echo <nixlf> [cq number/all] > cq_ctx' will dump RQ/SQ/CQ's current
hardware context.
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To aid in debugging NPA related issues, added support to dump
NPA (pool allocator) block LF's aura and pool hardware contexts in
debugfs. User can check which contexts are enabled currently and dump
it's current HW context.
Three new files 'qsize', 'aura_ctx', 'pool_ctx' are added to the
debugfs at 'sys/kernel/debug/octeontx2/npa/'
'echo <npalf index> > qsize' will display current enabled Aura/Pools.
'echo <npalf> [aura number/all] > aura_ctx' &
'echo <npalf> [aura number/all] > pool_ctx' will dump Aura/Pool
context info.
Signed-off-by: Christina Jacob <cjacob@marvell.com>
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added support to dump current resource provisioning status
of all resource virtualization unit (RVU) block's
(i.e NPA, NIX, SSO, SSOW, CPT, TIM) local functions attached
to a PF_FUNC into a debugfs file.
'cat /sys/kernel/debug/octeontx2/rsrc_alloc'
will show the current block LF's allocation status.
Signed-off-by: Christina Jacob <cjacob@marvell.com>
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some device only support 4 TCs, but the driver check the total
bandwidth of 8 TCs, so may cause wrong configurations write to
the hw.
This patch uses hdev->tc_max to instead HNAE3_MAX_TC to fix it.
Fixes: e432abfb99 ("net: hns3: add common validation in hclge_dcb")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a TC's PFC is disabled or enabled, the RX private buffer for
this TC need to be changed too, otherwise this may cause packet
dropped problem.
This patch fixes it by calling hclge_buffer_alloc to reallocate
buffer when pfc_en changes.
Fixes: cacde272dd ("net: hns3: Add hclge_dcb module for the support of DCB feature")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, hns3 driver just directly send specific setting bit
and mask bits of MAC VLAN switch parameter to the firmware, it
can not be compatible with the old firmware, because the old one
ignores mask bits and covers all bits with new setting bits.
So when running with old firmware, the communication between PF
and VF will fail after resetting or configuring spoof check, since
they will do the MAC VLAN switch parameter configuration.
This patch fixes this problem by reading switch parameter firstly,
then just modifies the corresponding bit and sends it to firmware.
Fixes: dd2956eab1 ("net: hns3: not allow SSU loopback while execute ethtool -t dev")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pre-allocates buffers sufficient for the maximum supported MTU (2026) in
order to eliminate the possibility of resource exhaustion when changing the
MTU while the device is up.
Signed-off-by: Ulrich Hecht <uli+renesas@fpond.eu>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix build_skb for bm capable devices when they fall-back using swbm path
(e.g. when bm properties are configured in device tree but
CONFIG_MVNETA_BM_ENABLE is not set). In this case rx_offset_correction is
overwritten so we need to use it building skb instead of
MVNETA_SKB_HEADROOM directly
Fixes: 8dc9a0888f ("net: mvneta: rely on build_skb in mvneta_rx_swbm poll routine")
Fixes: 0db51da7a8 ("net: mvneta: add basic XDP support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reported-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use r8168d_modify_extpage() also in rtl8168f_config_eee_phy() to
simplify the code.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, when ibmveth receive a loopback packet, it reports an
ambiguous error message "tx: h_send_logical_lan failed with rc=-4"
because the hypervisor rejects those types of packets. This fix
detects loopback packet and assures the source packet's MAC address
matches the driver's MAC address before transmitting to the
hypervisor.
Signed-off-by: Cris Forno <cforno12@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable GDM GDMA_DROP_ALL mode to drop all packet during the
stop operation. This is recommended by the mt762x HW design
to drop all packet from GMAC before stopping PDMA.
Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refine the timing of GDM/PSE setup, move it from mtk_hw_init
to mtk_open. This is recommended by the mt762x HW design to
do GDM/PSE setup only after PDMA has been started.
We exclude mt7628 in mtk_gdm_config function since it is a old IP
and there is no GDM/PSE block on it.
Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Integrate GDM/PSE setup operations into single function "mtk_gdm_config"
Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"ret" is zero or possibly uninitialized on this error path. It
should be a negative error code instead.
Fixes: 2d0cb84dd9 ("cxgb4: add ETHOFLD hardware queue support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
irqreturn_t type is an enum and in this context it's unsigned, so "err"
can't be irqreturn_t or it breaks the error handling. In fact the "err"
variable is only used to store integers (never irqreturn_t) so it should
be declared as int.
I removed the initialization because it's not required. Using a bogus
initializer turns off GCC's uninitialized variable warnings. Secondly,
there is a GCC warning about unused assignments and we would like to
enable that feature eventually so we have been trying to remove these
unnecessary initializers.
Fixes: 7b0c342f1f ("net: atlantic: code style cleanup")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the array overrun while keeping the eth_addr and eth_addr_mask
pointers as u16 to avoid unaligned u16 access. These were overlooked
when modifying the code to use u16 pointer for proper alignment.
Fixes: 90f906243b ("bnxt_en: Add support for L2 rewrite")
Reported-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since both tc rules and flow table rules are of the same format,
we can re-use tc parsing for that, and move the flow table rules
to their steering domain - In this case, the next chain after
max tc chain.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Use devlink instance name space to set the netdev net namespace.
Preparation patch for devlink reload implementation.
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Be sure to release the neighbour in case of failures after successful
route lookup.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Neighbour initializations to NULL are not necessary as the pointers are
not used if an error is returned, and if success returned, pointers are
initialized.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
mlx5_device_disable_sriov() currently reads num_vfs from the PCI core.
However when mlx5_device_disable_sriov() is executed, SR-IOV is
already disabled at the PCI level.
Due to this disable_hca() cleanup is not done during SR-IOV disable
flow.
mlx5_sriov_disable()
pci_enable_sriov()
mlx5_device_disable_sriov() <- num_vfs is zero here.
When SR-IOV enablement fails during mlx5_sriov_enable(), HCA's are left
in enabled stage because mlx5_device_disable_sriov() relies on num_vfs
from PCI core.
mlx5_sriov_enable()
mlx5_device_enable_sriov()
pci_enable_sriov() <- Fails
Hence, to overcome above issues,
(a) Read num_vfs before disabling SR-IOV and use it.
(b) Use num_vfs given when enabling sriov in error unwinding path.
Fixes: d886aba677 ("net/mlx5: Reduce dependency on enabled_vfs counter and num_vfs")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When selecting a matcher ste_builder_arr will always be evaluated
as true, instead check if num_of_builders is set for validity.
Fixes: 667f264676 ("net/mlx5: DR, Support IPv4 and IPv6 mixed matcher")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
1) New generic devlink param "enable_roce", for downstream devlink
reload support
2) Do vport ACL configuration on per vport basis when
enabling/disabling a vport. This enables to have vports enabled/disabled
outside of eswitch config for future
3) Split the code for legacy vs offloads mode and make it clear
4) Tide up vport locking and workqueue usage
5) Fix metadata enablement for ECPF
6) Make explicit use of VF property to publish IB_DEVICE_VIRTUAL_FUNCTION
7) E-Switch and flow steering core low level support and refactoring for
netfilter flowtables offload
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Netfilter tables (nftables) implements a software datapath that
comes after tc ingress datapath. The datapath supports offloading
such rules via the flow table offload API.
This API is currently only used by NFT and it doesn't provide the
global priority in regards to tc offload, so we assume offloading such
rules must come after tc. It does provide a flow table priority
parameter, so we need to provide some supported priority range.
For that, split fastpath prio to two, flow table offload and tc offload,
with one dedicated priority chain for flow table offload.
Next patch will re-use the multi chain API to access this chain by
allowing access to this chain by the fdb_sub_namespace.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Next patch will re-use this to add a new chain but in a
different prio.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Tc chains are implemented by creating a chained prio steering type, and
inside it there is a namespace for each chain (FDB_TC_MAX_CHAINS). Each
of those has a list of priorities.
Currently, all namespaces in a prio start at the parent prio level.
But since we can jump from chain (namespace) to another chain in the
same prio, we need the levels for higher chains to be higher as well.
So we created unused prios to account for levels in previous namespaces.
Fix that by accumulating the namespaces levels if we are inside a chained
type prio, and removing the unused prios.
Fixes: 328edb499f ('net/mlx5: Split FDB fast path prio to multiple namespaces')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Define FDB_TC_LEVELS_PER_PRIO instead of magic number 2.
This is the number of levels used by each tc prio table in the fdb.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Rename it to prepare for next patch that will add a
different type of offload to the FDB.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
FDB_MAX_CHAIN and FDB_MAX_PRIO were defined differently depending
on if CONFIG_MLX5_ESWITCH is enabled to save space on allocations.
This is a minor space saving, and there is no real need for it.
Simplify things instead, and define them the same in both cases.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
There is a return statement that is indented too deeply, remove
the extraneous tab.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch converts the MACB Ethernet driver to the Phylink framework.
The MAC configuration is moved to the Phylink ops and Phylink helpers
are now used in the ethtools functions. This helps to access the flow
control and pauseparam logic and this will be helpful in the future for
boards using this controller with SFP cages.
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moves the Tx and Rx buffer initialization into its own
function. This does not modify the behaviour of the driver and will be
helpful to convert the driver to phylink.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To enable xilinx axi_emac driver support on zynqmp ultrascale platform
(ARCH64) there are two choices, mention ARCH64 as a dependency list
and other is to check if this ARCH dependency list is really needed.
Later approach seems more reasonable, so remove the obsolete ARCH
dependency list for the axi_emac driver.
Sanity test done for microblaze, zynq and zynqmp ultrascale platform.
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds the missing support for the PHY mode RGMII_RXID.
It's necessary for the Raspberry Pi 4.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The register access in bcmgenet_mii_config() is a little bit opaque and
not easy to extend. In preparation for the missing RGMII PHY modes
move all the phy name assignments into the switch statement and the
register access to the end of the function. This make the code easier
to read and extend.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The BCM2711 needs a different maximum DMA burst length. If not set
accordingly a timeout in the transmit queue happens and no package
can be sent. So use the new compatible to derive this value.
Until now the GENET HW version was used as the platform identifier.
This doesn't work with SoC-specific modifications, so introduce a proper
platform data structure.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Matthias Brugger <mbrugger@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the error handling for the mandatory IRQs. There is no need
for the error message anymore, this is now handled by platform_get_irq.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As platform_get_irq() now prints an error when the interrupt does not
exist, we are getting a confusing error message in case the optional
WOL IRQ is not defined:
bcmgenet fd58000.ethernet: IRQ index 2 not found
Fix this by using the platform_get_irq_optional().
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The setup_dpio() function tries to allocate a number of channels equal
to the number of CPUs online. When there are not enough DPCON objects
already probed, the function will return EPROBE_DEFER. When this
happens, the already allocated channels are not freed. This results in
the incapacity of properly probing the next time around.
Fix this by freeing the channels on the error path.
Fixes: d7f5a9d89a ("dpaa2-eth: defer probe on object allocate")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sfc driver can drop packets processed with XDP, notably when running
out of buffer space on XDP_TX, or returning an unknown XDP action.
This increments the rx_xdp_bad_drops ethtool counter.
Call trace_xdp_exception everywhere rx_xdp_bad_drops is incremented,
except for fragmented RX packets as the XDP program hasn't run yet.
This allows it to easily be monitored from userspace.
This mirrors the behavior of other drivers.
Signed-off-by: Arthur Fabre <afabre@cloudflare.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove redundant code from fw_fatal reporter's dump callback. Use
updated devlink interface of binary fmsg pair which breaks the output
into chunks internally.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix sparse warnings:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_mqprio.c:242:6: warning: symbol 'cxgb4_mqprio_free_hw_resources' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 2d0cb84dd9 ("cxgb4: add ETHOFLD hardware queue support")
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix sparse warnings:
drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c:706:5: warning: symbol 'aq_ethtool_get_priv_flags' was not declared. Should it be static?
drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c:713:5: warning: symbol 'aq_ethtool_set_priv_flags' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: ea4b4d7fc1 ("net: atlantic: loopback tests via private flags")
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix sparse warnings:
drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c:426:25: warning: symbol 'aq_pm_ops' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 8aaa112a57 ("net: atlantic: refactoring pm logic")
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure to enable EMAD string TLV only after using the required firmware
version.
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case the firmware had an error while processing EMADs, it can send back
an ASCII string with the reason using EMAD string TLV.
This patch adds the support for using EMAD string TLV. In case of an error,
reports the reason using devlink hwerr tracepoint.
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend EMAD information reported to devlink hwerr tracepoint with
transaction id and reg id (both, hex and string).
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During parsing of incoming EMADs, fill the string TLV's offset when it is
used.
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add EMAD string TLV, an ASCII string the driver can receive from the
firmware in case of an error.
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now the code assumes a fixed structure which makes it difficult to
support EMADs with and without new TLVs.
Make it more generic by parsing the TLVs when the EMADs are received and
store the offset to the different TLVs in the control block. Using these
offsets to extract information from the EMADs without relying on a specific
structure.
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If TI_DAVINCI_EMAC=y and GENERIC_ALLOCATOR is not set,
below erros can be seen:
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_desc_pool_destroy.isra.14':
davinci_cpdma.c:(.text+0x359): undefined reference to `gen_pool_size'
davinci_cpdma.c:(.text+0x365): undefined reference to `gen_pool_avail'
davinci_cpdma.c:(.text+0x373): undefined reference to `gen_pool_avail'
davinci_cpdma.c:(.text+0x37f): undefined reference to `gen_pool_size'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `__cpdma_chan_free':
davinci_cpdma.c:(.text+0x4a2): undefined reference to `gen_pool_free_owner'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_chan_submit_si':
davinci_cpdma.c:(.text+0x66c): undefined reference to `gen_pool_alloc_algo_owner'
davinci_cpdma.c:(.text+0x805): undefined reference to `gen_pool_free_owner'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_ctlr_create':
davinci_cpdma.c:(.text+0xabd): undefined reference to `devm_gen_pool_create'
davinci_cpdma.c:(.text+0xb79): undefined reference to `gen_pool_add_owner'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_check_free_tx_desc':
davinci_cpdma.c:(.text+0x16c6): undefined reference to `gen_pool_avail'
This patch mades TI_DAVINCI_EMAC select GENERIC_ALLOCATOR.
Fixes: 99f6297182 ("net: ethernet: ti: cpsw: drop TI_DAVINCI_CPDMA config option")
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the UDP Segmentation Offload feature in stmmac. This is only
available in GMAC4+ cores.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This looks over-engineered. Let's use some helpers to get the buffer
length and hereby simplify the stmmac_rx() function. No performance drop
was seen with the new implementation.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
XGMAC3 supports full CBS features with speeds that can go up to 10G so
we can now remove the maximum speed check of CBS.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the support for C45 PHYs in the MDIO callbacks for XGMAC. This was
tested using Synopsys DesignWare XPCS.
v2:
- Pull out the readl_poll_timeout() calls into common code (Andrew)
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
GMAC4+ cores also support the Split Header feature.
Add the support for Split Header feature in the RX path following the
same implementation logic that XGMAC followed.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VID is converted to le16 so the variable must be __le16 type.
Reported-by: kbuild test robot <lkp@intel.com>
Fixes: c7ab0b8088 ("net: stmmac: Fallback to VLAN Perfect filtering if HASH is not available")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Variable hdr_len is being assigned a value that is never read.
The assignment is redundant and hence can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call devlink enable only during probe time and avoid deadlock
during reload.
Reported-by: Shalom Toledo <shalomt@mellanox.com>
Fixes: a0c76345e3 ("devlink: disallow reload operation during device cleanup")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call devlink enable only during probe time and avoid deadlock
during reload.
Reported-by: Shalom Toledo <shalomt@mellanox.com>
Fixes: 5a508a254b ("devlink: disallow reload operation during device cleanup")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Shalom Toledo <shalomt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for chip version RTL8117. Settings have been copied from
Realtek's r8168 driver, there however chip ID 54a belongs to a chip
version called RTL8168FP. It was confirmed that RTL8117 works with
Realtek's driver, so both chip versions seem to be the same or at
least compatible.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, if network is re-started, we advertise all supported EEE
modes, thus potentially overriding a manual adjustment the user made
e.g. via ethtool. Be friendly to the user and preserve a manual
setting on network re-start.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When PHY is not powered, the probe function fail and some resource are
still unallocated.
Furthermore some BUG happens:
dwmac-sun8i 5020000.ethernet: EMAC reset timeout
------------[ cut here ]------------
kernel BUG at /linux-next/net/core/dev.c:9844!
So let's use the right function (stmmac_pltfr_remove) in the error path.
Fixes: 9f93ac8d40 ("net-next: stmmac: Add dwmac-sun8i")
Cc: <stable@vger.kernel.org> # v4.15+
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VSC7514 is a 10-port switch with 2 extra "CPU ports" (targets in the
queuing subsystem for terminating traffic locally).
There are 2 issues with hardcoding the CPU port as #10:
- It is not clear which snippets of the code are configuring something
for one of the CPU ports, and which snippets are just doing something
related to the number of physical ports.
- Actually any physical port can act as a CPU port connected to an
external CPU (in addition to the local CPU). This is called NPI mode
(Node Processor Interface) and is the way that the 6-port VSC9959
(Felix) switch is integrated inside NXP LS1028A (the "local management
CPU" functionality is not used there).
This patch makes it clear that the ocelot_bridge_stp_state_set function
operates on the CPU port (by making it an implicit member of the
bridging domain), and at the same time adds logic for the NPI port (aka
a physical port) to play the role of a CPU port (it shouldn't be part of
bridge_fwd_mask, as it's not explicitly enslaved to a bridge).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the places that configure routing destinations for the CPU port
have been marked as such, allow callers to specify their own CPU port
that is different than ocelot->num_phys_ports. A user will be the Felix
DSA driver, where the CPU port is one of the physical ports (NPI mode).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will be called from the Felix DSA frontend, which will work in
PHYLIB compatibility mode initially.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is just common path code that belongs to ocelot_init,
it has nothing to do with a specific SoC/board instance.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow these functions to be called from the .port_enable and
.port_disable callbacks of DSA.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need a function for the DSA front-end that does none of the
net_device registration, but initializes the hardware ports.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VSC7514 switch (Ocelot) is a 10-port device, while VSC9959 (Felix)
is 6-port. Therefore the VLAN filtering mask would be out of bounds when
calling for this new switch. Fix that.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert them into an implementation that can be called from DSA as well.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ocelot and ocelot_port structures will be used by a new DSA driver,
so the ocelot_board.c file will have to allocate and work with a private
structure (ocelot_port_private), which embeds the generic struct
ocelot_port. This is because in DSA, at least one interface does not
have a net_device, and the DSA driver API does not interact with that
anyway.
The ocelot_port structure is equivalent to dsa_port, and ocelot to
dsa_switch. The members of ocelot_port which have an equivalent in
dsa_port (such as dp->vlan_filtering) have been moved to
ocelot_port_private.
We want to enforce the coding convention that "ocelot_port" refers to
the structure, and "port" refers to the integer index. One can retrieve
the structure at any time from ocelot->ports[port].
The patch is large but only contains variable renaming and mechanical
movement of fields from one structure to another.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ocelot_port structure has a net_device embedded in it, which makes
it unsuitable for leaving it in the driver implementation functions.
Leave ocelot_flower.c untouched. In that file, ocelot_port is used as an
interface to the tc shared blocks. That will be addressed in the next
patch.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is needed so that the Felix DSA front-end can call the Ocelot
implementations.
The implementation of the "mc_disabled" switchdev attribute has also
been simplified by using the read-modify-write macro instead of
open-coding that operation.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is needed in order to present a simpler prototype to the DSA
front-end of ocelot.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To be able to implement a DSA front-end over ocelot_fdb_add,
ocelot_fdb_del, ocelot_fdb_dump, these need to have a simple function
prototype that is independent of struct net_device, netlink skb, etc.
So rename the ndo ops of the ocelot driver into
ocelot_port_fdb_{add,del,dump}, and have them all call the abstract
implementations. At the same time, refactor ocelot_port_fdb_do_dump into
a function whose prototype is compatible with dsa_fdb_dump_cb_t, so that
the do_dump implementations can live together and be called by the
ocelot_fdb_dump through a function pointer.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need an implementation of these functions that is agnostic to the
higher layer (switchdev or dsa).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch transforms the ocelot_vlan_port_apply function ("apply
what?") into 3 standalone functions:
- ocelot_port_vlan_filtering
- ocelot_port_set_native_vlan
- ocelot_port_set_pvid
These functions have a prototype that is better aligned to the DSA API.
The function also had some static initialization (TPID, drop frames with
multicast source MAC) which was not being changed from any place, so
that was just moved to ocelot_probe_port (one of the 6 callers of
ocelot_vlan_port_apply).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The LAN743x Ethernet controller provides two independent PTP event
channels. Each one can be used to generate a periodic output from
the PTP clock. The output can be routed to any one of the available
GPIO pins on the device.
The PTP clock API can now be used to:
- select any LAN743x GPIO pin to function as a periodic output
- select either LAN743x PTP event channel to generate the output
The LAN7430 has 4 GPIO pins that are multiplexed with its internal
PHY LED control signals. A pin assigned to the LED control function
will be assigned to the GPIO function if selected for PTP periodic
output.
Signed-off-by: John Efstathiades <john.efstathiades@pebblebay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Register "enable_roce" param, default value is RoCE enabled.
Current configuration is stored on mlx5_core_dev and exposed to user
through the cmode runtime devlink param.
Changing configuration requires changing the cmode driverinit devlink
param and calling devlink reload.
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
rtl8168c_4_hw_phy_config() duplicates rtl8168c_3_hw_phy_config(),
so we can remove the function.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Certain integrated PHY's from RTL8168d support extended pages. On page
0x0007 the number of the extended page is written to register 0x1e,
then the registers on the extended page can be accessed. Add a helper
for this to improve readability and simplify the code.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the phylib MDIO access functions in more places to simplify
the code.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Integrated PHY's from RTL8168d support an indirect access method for
PHY parameters. On page 0x0005 parameter number is written to register
0x05, then the parameter can be accessed via register 0x06.
Add a helper for this to improve readability and simplify the code.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Integrated PHY's from RTL8168g support an indirect access method for
PHY parameters. On page 0x0a43 parameter number is written to register
0x13, then the parameter can be accessed via register 0x14.
Add a helper for this to improve readability and simplify the code.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a race between driver code that does setup/cleanup of device
and devlink reload operation that in some drivers works with the same
code. Use after free could we easily obtained by running:
while true; do
echo "0000:00:10.0" >/sys/bus/pci/drivers/mlxsw_spectrum2/bind
devlink dev reload pci/0000:00:10.0 &
echo "0000:00:10.0" >/sys/bus/pci/drivers/mlxsw_spectrum2/unbind
done
Fix this by enabling reload only after setup of device is complete and
disabling it at the beginning of the cleanup process.
Reported-by: Ido Schimmel <idosch@mellanox.com>
Fixes: 2d8dc5bbf4 ("devlink: Add support for reload")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One conflict in the BPF samples Makefile, some fixes in 'net' whilst
we were converting over to Makefile.target rules in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Fixes 2019-11-08
This series contains fixes to igb, igc, ixgbe, i40e, iavf and ice
drivers.
Colin Ian King fixes a potentially wrap-around counter in a for-loop.
Nick fixes the default ITR values for the iavf driver to 50 usecs
interval.
Arkadiusz fixes 'ethtool -m' for X722 devices where the correct value
cannot be obtained from the firmware, so add X722 to the check to ensure
the wrong value is not returned.
Jake fixes igb and igc drivers in their implementation of launch time
support by declaring skb->tstamp value as ktime_t instead of s64.
Magnus fixes ixgbe and i40e where the need_wakeup flag for transmit may
not be set for AF_XDP sockets that are only used to send packets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The need_wakeup flag for Tx might not be set for AF_XDP sockets that
are only used to send packets. This happens if there is at least one
outstanding packet that has not been completed by the hardware and we
get that corresponding completion (which will not generate an
interrupt since interrupts are disabled in the napi poll loop) between
the time we stopped processing the Tx completions and interrupts are
enabled again. In this case, the need_wakeup flag will have been
cleared at the end of the Tx completion processing as we believe we
will get an interrupt from the outstanding completion at a later point
in time. But if this completion interrupt occurs before interrupts
are enable, we lose it and should at that point really have set the
need_wakeup flag since there are no more outstanding completions that
can generate an interrupt to continue the processing. When this
happens, user space will see a Tx queue need_wakeup of 0 and skip
issuing a syscall, which means will never get into the Tx processing
again and we have a deadlock.
This patch introduces a quick fix for this issue by just setting the
need_wakeup flag for Tx to 1 all the time. I am working on a proper
fix for this that will toggle the flag appropriately, but it is more
challenging than I anticipated and I am afraid that this patch will
not be completed before the merge window closes, therefore this easier
fix for now. This fix has a negative performance impact in the range
of 0% to 4%. Towards the higher end of the scale if you have driver
and application on the same core and issue a lot of packets, and
towards no negative impact if you use two cores, lower transmission
speeds and/or a workload that also receives packets.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The need_wakeup flag for Tx might not be set for AF_XDP sockets that
are only used to send packets. This happens if there is at least one
outstanding packet that has not been completed by the hardware and we
get that corresponding completion (which will not generate an
interrupt since interrupts are disabled in the napi poll loop) between
the time we stopped processing the Tx completions and interrupts are
enabled again. In this case, the need_wakeup flag will have been
cleared at the end of the Tx completion processing as we believe we
will get an interrupt from the outstanding completion at a later point
in time. But if this completion interrupt occurs before interrupts
are enable, we lose it and should at that point really have set the
need_wakeup flag since there are no more outstanding completions that
can generate an interrupt to continue the processing. When this
happens, user space will see a Tx queue need_wakeup of 0 and skip
issuing a syscall, which means will never get into the Tx processing
again and we have a deadlock.
This patch introduces a quick fix for this issue by just setting the
need_wakeup flag for Tx to 1 all the time. I am working on a proper
fix for this that will toggle the flag appropriately, but it is more
challenging than I anticipated and I am afraid that this patch will
not be completed before the merge window closes, therefore this easier
fix for now. This fix has a negative performance impact in the range
of 0% to 4%. Towards the higher end of the scale if you have driver
and application on the same core and issue a lot of packets, and
towards no negative impact if you use two cores, lower transmission
speeds and/or a workload that also receives packets.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When implementing launch time support in the igb and igc drivers, the
skb->tstamp value is assumed to be a s64, but it's declared as a ktime_t
value.
Although ktime_t is typedef'd to s64 it wasn't always, and the kernel
provides accessors for ktime_t values.
Use the ktime_to_timespec64 and ktime_set accessors instead of directly
assuming that the variable is always an s64.
This improves portability if the code is ever moved to another kernel
version, or if the definition of ktime_t ever changes again in the
future.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch contains fix for a problem with command:
'ethtool -m <dev>'
which breaks functionality of:
'ethtool <dev>'
when called on X722 NIC
Disallowed update of link phy_types on X722 NIC
Currently correct value cannot be obtained from FW
Previously wrong value returned by FW was used and was
a root cause for incorrect output of 'ethtool <dev>' command
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since commit 92418fb147 ("i40e/i40evf: Use usec value instead of reg
value for ITR defines") the driver tracks the interrupt throttling
intervals in single usec units, although the actual ITRN registers are
programmed in 2 usec units. Most register programming flows in the driver
correctly handle the conversion, although it is currently not applied when
the registers are initialized to their default values. Most of the time
this doesn't present a problem since the default values are usually
immediately overwritten through the standard adaptive throttling mechanism,
or updated manually by the user, but if adaptive throttling is disabled and
the interval values are left alone then the incorrect value will persist.
Since the intended default interval of 50 usecs (vs. 100 usecs as
programmed) performs better for most traffic workloads, this can lead to
performance regressions.
This patch adds the correct conversion when writing the initial values to
the ITRN registers.
Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the for-loop counter i is a u8 however it is being checked
against a maximum value hw->num_tx_sched_layers which is a u16. Hence
there is a potential wrap-around of counter i back to zero if
hw->num_tx_sched_layers is greater than 255. Fix this by making i
a u16.
Addresses-Coverity: ("Infinite loop")
Fixes: b36c598c99 ("ice: Updates to Tx scheduler code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is a race between driver code that does setup/cleanup of device
and devlink reload operation that in some drivers works with the same
code. Use after free could we easily obtained by running:
while true; do
echo 10 > /sys/bus/netdevsim/new_device
devlink dev reload netdevsim/netdevsim10 &
echo 10 > /sys/bus/netdevsim/del_device
done
Fix this by enabling reload only after setup of device is complete and
disabling it at the beginning of the cleanup process.
Reported-by: Ido Schimmel <idosch@mellanox.com>
Fixes: 2d8dc5bbf4 ("devlink: Add support for reload")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2019-11-08
Another series that contains updates to the ice driver only.
Anirudh cleans up the code of kernel config of ifdef wrappers by moving
code that is needed by DCB to disable and enable the PF VSI for
configuration. Implements ice_vsi_type_str() to convert an VSI type
enum value to its string equivalent to help identify VSI types from
module print statements.
Usha and Tarun add support for setting the maximum per-queue bit rate
for transmit queues.
Dave implements dcb_nl set functions and supporting software DCB
functions to support the callbacks defined in the dcbnl_rtnl_ops
structure.
Henry adds a check to ensure we are not resetting the device when trying
to configure it, and to return -EBUSY during a reset.
Usha fixes a call trace caused by the receive/transmit descriptor size
change request via ethtool when DCB is configured by using the number of
enabled queues and not the total number of allocated queues.
Paul cleans up and refactors the software LLDP configuration to handle
when firmware DCBX is disabled.
Akeem adds checks to ensure the VF or PF is not disabled before honoring
mailbox messages to configure the VF.
Brett corrects the check to make sure the vector_id passed down from
iavf is less than the max allowed interrupts per VF. Updates a flag bit
to align with the current specification.
Bruce updates a switch statement to use the correct status of the
Download Package AQ command. Does some housekeeping by cleaning up a
conditional check that is not needed.
Mitch shortens up the delay for SQ responses to resolve issues with VF
resets failing.
Jake cleans up the code reducing namespace pollution and to simplify
ice_debug_cq() since it always uses the same mask, not need to pass it
in. Improve debugging by adding the command opcode in the debug
messages that print an error code.
v2: fixed reverse christmas tree issue in patch 3 of the series.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
To help aid in debugging, display the command opcode in debug messages
that print an error code. This makes it easier to see what command
failed if only ICE_DBG_AQ_MSG is enabled.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ice_debug_cq is passed a mask which is always ICE_DBG_AQ_CMD. Modify this
function, removing the mask parameter entirely, and directly use the more
appropriate ICE_DBG_AQ_DESC and ICE_DBG_AQ_DESC_BUF.
The function is only called from ice_controlq.c, and has no
other callers outside of that file. Move it and mark it static to avoid
namespace pollution.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ice_vsi_type_str converts an ice_vsi_type enum value to its string
equivalent. This is expected to help easily identify VSI types from
module print statements.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is no reason to do this conditional check before the assignment so
simply remove it.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the VLAN ice_flg64_bits are off by 1. Fix this by
setting the ICE_FLG_EVLAN_x8100 flag to 14, which also updates
ICE_FLG_EVLAN_x9100 to 15 and ICE_FLG_VLAN_x8100 to 16.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shorten the delay for SQ responses, but increase the number of loops.
Max delay time is unchanged, but some operations complete much more
quickly.
In the process, add a new define to make the delay count and delay time
more explicit. Add comments to make things more explicit.
This fixes a problem with VF resets failing on with many VFs.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since the return value from the Download Package AQ command is stored in
hw->pkg_dwnld_status, use that instead of sq_last_status since that may
have the return value from some other AQ command leading to unexpected
results.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently we check to make sure the vector_id passed down from iavf
is less than or equal to pf->hw.func_caps.common_caps.num_msix_vectors.
This is incorrect because the vector_id is always 0-based and never
greater than or equal to the ICE_MAX_INTR_PER_VF. Fix this by checking
to make sure the vector_id is less than the max allowed interrupts per
VF (ICE_MAX_INTR_PER_VF).
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds code to check if PF or VF is disabled before honoring
mailbox message to configure VF - If it is disabled, and opcode is for
resetting VF, the PF driver simply tell VF that all is set. In addition,
if reset is ongoing, and Admin intend to configure VF on the host, we can
poll the VF enabling bit to make sure it is ready before continue - If
after ~250 milliseconds, VF is not in active state, we can bail out with
invalid error.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move software LLDP configuration when FW DCBX is disabled to
ice_init_pf_dcb, since that is where the FW DCBX state is determined.
Remove this software LLDP configuration from ice_vsi_setup and
ice_set_priv_flags. Software configuration includes redirecting Rx LLDP
packets up the stack, when FW DCBX is not running.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes the call trace caused by the kernel when the Rx/Tx
descriptor size change request is initiated via ethtool when DCB is
configured. ice_set_ringparam() should use vsi->num_txq instead of
vsi->alloc_txq as it represents the queues that are enabled in the
driver when DCB is enabled/disabled. Otherwise, queue index being
used can go out of range.
For example, when vsi->alloc_txq has 104 queues and with 3 TCS enabled
via DCB, each TC gets 34 queues, vsi->num_txq will be 102 and only 102
queues will be enabled.
Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Certain subsystems behave very badly when called during reset (core
dump). This patch returns -EBUSY when reconfiguring some subsystems
during reset. With this patch some ethtool functions will not core
dump during reset.
Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Implement interface layer for the DCBNL subsystem. These are the functions
to support the callbacks defined in the dcbnl_rtnl_ops struct. These
callbacks are going to be used to interface with the DCB settings of the
device. Implementation of dcb_nl set functions and supporting SW DCB
functions.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Allow for rate limiting Tx queues. Bitrate is set in
Mbps(megabits per second).
Mbps max-rate is set for the queue via sysfs:
/sys/class/net/<iface>/queues/tx-<queue>/tx_maxrate
ex: echo 100 >/sys/class/net/ens7/queues/tx-0/tx_maxrate
echo 200 >/sys/class/net/ens7/queues/tx-1/tx_maxrate
Note: A value of zero for tx_maxrate means disabled,
default is disabled.
Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Co-developed-by: Tarun Singh <tarun.k.singh@intel.com>
Signed-off-by: Tarun Singh <tarun.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
DCB configuration flow needs to disable and enable only the PF (main)
VSI, so use ice_ena_vsi and ice_dis_vsi. To avoid the use of ifdef to
control the staticness of these functions, move them to ice_lib.c.
Also replace the allocate and copy of old_cfg to kmemdup() in
ice_pf_dcb_cfg().
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix following compile error on i386 architecture.
ERROR: "__udivdi3" [drivers/net/ethernet/chelsio/cxgb4/cxgb4.ko] undefined!
Fixes: 0e395b3cb1 ("cxgb4: add FLOWC based QoS offload")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
match_string() returns the array index of a matching string.
Use it instead of the open-coded implementation.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add optional support for syscfg clock in dwmac-stm32.c
Now Syscfg clock is activated automatically when syscfg
registers are used
Signed-off-by: Christophe Roullier <christophe.roullier@st.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Octeon's input ring-buffer entry has 14 bits-wide size field, so to account
for second possible VLAN header max_mtu must be further reduced.
Fixes: 109cc16526 ("ethernet/cavium: use core min/max MTU checking")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changing the MTU can be a frequent operation and it is already clear
when (or not) a MTU change is successful, demote prints to debug prints.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Timur Tabi <timur@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changing a network device MTU can be a fairly frequent operation, and
failure to change the MTU is reflected to user-space properly, both by
an appropriate message as well as by looking at whether the device's MTU
matches the configuration.
Demote the prints to debug prints by using netdev_dbg(), making all
Intel wired LAN drivers consistent, since they used a mixture of PCI
device and network device prints before.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update on more short variant for getting real clock in ns.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Aquantia is now part of Marvell, eventually we'll cease standalone
aquantia.com domain. Thus, change the maintainers file and some other
references to @marvell.com domain
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
atlantic hardware does support UDP hardware segmentation offload.
This allows user to specify one large contiguous buffer with data
which then will be split automagically into multiple UDP packets
of specified size.
Bulk sending of large UDP streams lowers CPU usage and increases
bandwidth.
We did estimations both with udpgso_bench_tx test tool and with modified
iperf3 measurement tool (4 streams, multithread, 200b packet size)
over AQC<->AQC 10G link. Flow control is disabled to prevent RX side
impact on measurements.
No UDP GSO:
iperf3 -c 10.0.1.2 -u -b0 -l 200 -P4 --multithread
UDP GSO:
iperf3 -c 10.0.1.2 -u -b0 -l 12600 --udp-lso 200 -P4 --multithread
Mode CPU iperf speed Line speed Packets per second
-------------------------------------------------------------
NO UDP GSO 350% 3.07 Gbps 3.8 Gbps 1,919,419
SW UDP GSO 200% 5.55 Gbps 6.4 Gbps 3,286,144
HW UDP GSO 90% 6.80 Gbps 8.4 Gbps 4,273,117
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We now differentiate requested and negotiated flow control
modes. Therefore `ethtool -A` now operates on local requested
FC values, and regular link settings shows the negotiated FC
settings.
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are trying to follow the naming of the chip (atlantic), not
company. So replace some old namings.
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thats a pure checkpatck walkthrough the code with no functional
changes. Reverse christmas tree, spacing, etc.
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here we add a number of ethtool private flags
to allow enabling various loopbacks on HW.
Thats useful for verification and bringup works.
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Device FW has a separate memory area where various
config fields are stored and could be used by the
driver.
Here we modify download/upload infrastructure to
allow accessing this area.
Lateron this will be used to configure various behaviours
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
`ethtool -p eth0` will blink leds helping identify
physical port.
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We add ethtool msglevel configuration and change some
printouts to use netdev_info set of functions.
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We now implement .driver.pm callbacks, these
allows driver to work correctly in hibernate
usecases, especially when used in conjunction with
WOL feature.
Before that driver only reacted to legacy .suspend/.resume
callbacks, that was a limitation in some cases.
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wake on PHY allows to configure device to wakeup host
as soon as PHY link status is changed to active.
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here we improve FW interface structures layout
and prepare these for the wake phy feature implementation.
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the trap IDs used to report layer 3 exceptions.
Trapped packets are first reported to devlink and then injected to the
kernel's receive path. All the packets have 'offload_fwd_mark' set in
order to prevent them from potentially being forwarded by the bridge
again.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, mlxsw does not differentiate between these two cases of
routes with invalid nexthops:
1. Nexthops whose nexthop device is a mlxsw upper (has a RIF), but whose
neighbour could not be resolved
2. Nexthops whose nexthop device is not a mlxsw upper (e.g., management
interface)
Up until now this did not matter and mlxsw trapped packets for both
cases using the same trap ID. However, packets that should have been
routed in hardware (case 1), but incurred a problem are considered
exceptions and should be reported to the user. The two cases should
therefore be split between two different trap IDs.
Allocate a new adjacency entry during initialization and upon the
insertion of the first route with an invalid mlxsw nexthop, program this
entry to discard packets. Packets hitting this entry will be reported
using new trap ID - "DISCARD_ROUTER3".
In the future, the entry could be written during initialization, but
currently firmware requires a valid RIF, which is not available at this
stage.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, packets that cannot be routed in hardware (e.g., nexthop
device is not upper of mlxsw), are trapped to the kernel for forwarding.
Such packets are trapped using "RTR_INGRESS0" trap. This trap also traps
packets that hit reject routes (e.g., "unreachable") so that the kernel
will generate the appropriate ICMP error message for them.
Subsequent patch will need to only report to devlink packets that hit a
reject route, which is impossible as long as "RTR_INGRESS0" is
overloaded like that.
Solve this by using "RTR_INGRESS1" trap for packets that hit reject
routes.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the trap IDs and trap group used to report layer 3 drops. Register
layer 3 packet traps and associated layer 3 trap group with devlink
during driver initialization.
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return -EOPNOTSUPP instead of -EINVAL if the requested ioctl is not
implemented.
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the problem of the spin locks, originally
meant for the netpoll path of hns driver, causing deadlock in
the normal NAPI poll path. The issue happened due to the presence
of the stray leftover spin lock code related to the netpoll,
whose support was earlier removed from the HNS[1], got activated
due to enabling of NET_POLL_CONTROLLER switch.
Earlier background:
The netpoll handling code originally had this bug(as identified
by Marc Zyngier[2]) of wrong spin lock API being used which did
not disable the interrupts and hence could cause locking issues.
i.e. if the lock were first acquired in context to thread like
'ip' util and this lock if ever got later acquired again in
context to the interrupt context like TX/RX (Interrupts could
always pre-empt the lock holding task and acquire the lock again)
and hence could cause deadlock.
Proposed Solution:
1. If the netpoll was enabled in the HNS driver, which is not
right now, we could have simply used spin_[un]lock_irqsave()
2. But as netpoll is disabled, therefore, it is best to get rid
of the existing locks and stray code for now. This should
solve the problem reported by Marc.
[1] https://git.kernel.org/torvalds/c/4bd2c03be7
[2] https://patchwork.ozlabs.org/patch/1189139/
Fixes: 4bd2c03be7 ("net: hns: remove ndo_poll_controller")
Cc: lipeng <lipeng321@huawei.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Reported-by: Marc Zyngier <maz@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a DPNI is connected to a MAC, export its associated counters.
Ethtool related functions are added in dpaa2_mac for returning the
number of counters, their strings and also their values.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there is an external PHY, pass the wake-on-lan request to the PHY.
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there is an attached PHY try to handle the requested ioctl with its
handler, which allows the userspace to access PHY registers, for
example. This will make mii-diag and similar tools work.
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 4a7f970f12 ("mlxsw: spectrum: Replace port_to_module array with array of structs")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rework SCHED API to allow offloading TC-MQPRIO QoS configuration.
The existing QUEUE based rate limiting throttles all queues sharing
a traffic class, to the specified max rate limit value. So, if
multiple queues share a traffic class, then all the queues get
the aggregate specified max rate limit.
So, introduce the new FLOWC based rate limiting, where multiple
queues can share a traffic class with each queue getting its own
individual specified max rate limit.
For example, if 2 queues are bound to class 0, which is rate limited
to 1 Gbps, then 2 queues using QUEUE based rate limiting, get the
aggregate output of 1 Gbps only. In FLOWC based rate limiting, each
queue gets its own output of max 1 Gbps each; i.e. 2 queues * 1 Gbps
rate limit = 2 Gbps.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement Tx path for traffic flowing through software EOSW_TXQ
and EOHW_TXQ. Since multiple EOSW_TXQ can post packets to a single
EOHW_TXQ, protect the hardware queue with necessary spinlock. Also,
move common code used to generate TSO work request to a common
function.
Implement Rx path to handle Tx completions for successfully
transmitted packets.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for configuring and managing ETHOFLD hardware queues.
Keep the queue count and MSI-X allocation scheme same as NIC queues.
ETHOFLD hardware queues are dynamically allocated/destroyed as
TC-MQPRIO Qdisc offload is enabled/disabled on the corresponding
interface, respectively.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add logic for validation and configuration of TC-MQPRIO Qdisc
offload. Also, add support to manage EOSW_TXQ, which have 1-to-1
mapping with EOTIDs, and expose them to network stack.
Move common skb validation in Tx path to a separate function and
add minimal Tx path for ETHOFLD. Update Tx queue selection to return
normal NIC Txq to send traffic pattern that can't go through ETHOFLD
Tx path.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simplify queue configuration and MSI-X allocation logic. Use a single
MSI-X information table for both NIC and ULDs. Remove hard-coded
MSI-X indices for firmware event queue and non data interrupts.
Instead, use the MSI-X bitmap to obtain a free MSI-X index
dynamically. Save each Rxq's index into the MSI-X information table,
within the Rxq structures themselves, for easier cleanup.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
QoS offload needs Ethernet Offload (ETHOFLD) resources present in the
NIC. These resources are shared with other ULDs. So, query firmware
for the available number of traffic classes, as well as, start and
end indices (EOTID) of the ETHOFLD region.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
phylink_set_port_modes will be compiled if CONFIG_PHYLINK enabled,
dpaa2_mac_validate will be compiled if CONFIG_FSL_DPAA2_ETH enabled,
it should select CONFIG_PHYLINK when dpaa2_mac_validate call
phylink_set_port_modes
drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.o: In function `dpaa2_mac_validate':
dpaa2-mac.c:(.text+0x3a1): undefined reference to `phylink_set_port_modes'
drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.o: In function `dpaa2_mac_connect':
dpaa2-mac.c:(.text+0x91a): undefined reference to `phylink_create'
dpaa2-mac.c:(.text+0x94e): undefined reference to `phylink_of_phy_connect'
dpaa2-mac.c:(.text+0x97f): undefined reference to `phylink_destroy'
drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.o: In function `dpaa2_mac_disconnect':
dpaa2-mac.c:(.text+0xa9f): undefined reference to `phylink_disconnect_phy'
dpaa2-mac.c:(.text+0xab0): undefined reference to `phylink_destroy'
make: *** [vmlinux] Error 1
Fixes: 7194792308 ("dpaa2-eth: add MAC/PHY support through phylink")
Signed-off-by: Chenwandun <chenwandun@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since old firmware does not support HCLGE_OPC_PF_RST_DONE, it will
return -EOPNOTSUPP to the driver when received this command. So
for this case, it should just print a warning and return success
to the caller.
Fixes: 72e2fb0799 ("net: hns3: clear reset interrupt status in hclge_irq_handle()")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2019-11-06
This series contains updates to ice driver only.
Scott adds ethtool -m support so that we can read eeprom data on SFP/OSFP
modules.
Anirudh updates the return value to properly reflect when SRIOV is not
supported.
Md Fahad updates the driver to handle a change in the NVM, where the
boot configuration section was moved to the Preserved Field Area (PFA)
of the NVM.
Paul resolves an issue when DCBx requests non-contiguous TCs, transmit
hangs could occur, so configure a default traffic class (TC0) in these
cases to prevent traffic hangs. Adds a print statement to notify the
user when unsupported modules are inserted.
Bruce fixes up the driver unload code flow to ensure we do not clear the
interrupt scheme until the reset is complete, otherwise a hardware error
may occur.
Dave updates the DCB initialization to set is_sw_lldp boolean when the
firmware has been detected to be in an untenable state. This will
ensure that the firmware is in a known state.
Michal saves off the PCI state and I/O BARs address after PCI bus reset
so that after the reset, device registers can be read. Also adds a NULL
pointer check to prevent a potential kernel panic.
Mitch resolves an issue where VF's on PF's other than 0 were not seeing
resets by using the per-PF VF ID instead of the absolute VF ID.
Krzysztof does some code cleanup to remove a unneeded wrapper and
reduces the code complexity.
Brett reduces confusion by changing the name of ice_vc_dis_vf() to
ice_vc_reset_vf() to better describe what the function is actually
doing.
v2: dropped patch 3 "ice: Add support for FW recovery mode detection"
from the origin al series, while Ani makes changes based on
community feedback to implement devlink into the changes.
v3: dropped patch 1 "ice: implement set_eeprom functionality" due to a
bug found and additional changes will be needed when Ani implements
devlink in the driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl3DQ0sACgkQSD+KveBX
+j4nKgf/XnUZGR4PHGcEfalc/p67J7jmi7SN5Xuts8uyUTgqMGni/SkWRl+S1mQy
8dl5kZK/JxBhAzzlVIcXpcmkfeXydcrm1NHMPCOMsnUrAWAEP0lE1lfLBkX1zWPi
OHb3+nziJHc0s4AI/Haca7JOj8HjBSOPpv3Udg6f53leizo31pK2FSiBASc5bRoX
EUNMD3eFAap7fjBMVLzO/JeQXxgjaqb9qU3AKopy34cg+AInZ+ZH2S++aoT7RG3J
MCPOnrSxLmhA4BUu5I81vfgFNxxPP65vI7NPIhSMY7+pOQJNPIc1p08XvIpQ9X2o
5o/rQ9tVxq2NZqcMB+b6tiHVFYu7Kw==
=OmLZ
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2019-11-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahamees says:
====================
Mellanox, mlx5 fixes 2019-11-06
This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
No -stable this time.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Functions like phy_modify_paged() read the current page, on Realtek
PHY's this means reading the value of register 0x1f. Add special
handling for reading this register, similar to what we do already
in r8168g_mdio_write(). Currently we read a random value that by
chance seems to be 0 always.
Fixes: a2928d2864 ("r8169: use paged versions of phylib MDIO access functions")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the phy_mode() function to return the if_mode through an
argument, similar to the new form of of_get_phy_mode().
This will help with handling errors in a common manner and also will fix
an always true condition.
Fixes: 0c65b2b90d ("net: of_get_phy_mode: Change API to solve int/unit warnings")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the DMA memory resource get failed case, the error is not
set and 0 will be returned. Fix it by removing redundant check
since devm_ioremap_resource() will handle it.
Fixes: 28ef9ebdb6 ("net: axienet: make use of axistream-connected attribute optional")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IOC bit must be only set in the last descriptor. Move the logic up a
little bit to make sure it's set in the correct descriptor.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using addressing > 32 bits the TSO first descriptor only has the
header so we can't set the payload field for this descriptor. Let's
reset the variable so that buffer 2 value is zero.
Fixes: a993db88d1 ("net: stmmac: Enable support for > 32 Bits addressing in XGMAC")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, stmmac_rx() is counting the number of descriptors but it
should count the number of packets as specified by the NAPI limit.
Fix this.
Fixes: ec222003bd ("net: stmmac: Prepare to add Split Header support")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
MMC interrupts were being enabled, which is not what we want because it
will lead to a storm of interrupts that are not handled at all. Fix it
by disabling all MMC interrupts for XGMAC.
Fixes: b6cdf09f51 ("net: stmmac: xgmac: Implement MMC counters")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When in AVB mode we need to disable flow control to prevent MAC from
pausing in TX side.
Fixes: ec6ea8e3ee ("net: stmmac: Add CBS support in XGMAC2")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix incorrect precedence of operators. For reference: AV implies AV
Feature but RAV implies only RX side AV Feature. As we want full AV
features we need to check RAV.
Fixes: c2b69474d6 ("net: stmmac: xgmac: Correct RAVSEL field interpretation")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we change between Transmission Scheduling Algorithms, we need to
clear previous values so that the new chosen algorithm is correctly
selected.
Fixes: ec6ea8e3ee ("net: stmmac: Add CBS support in XGMAC2")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Split Header length is only available when L34T == 0. Fix this by
correctly checking if L34T is zero before trying to get Header length.
Fixes: 67afd6d1cf ("net: stmmac: Add Split Header support and enable it in XGMAC cores")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In L2 tests that filter packets by destination MAC address we need to
prevent false positives that can occur if we add an address that
collides with the existing ones.
To fix this, lets manually check if the new address to be added is
already present in the NIC and use a different one if so. For Hash
filtering this also envolves converting the address to the hash.
Fixes: 091810dbde ("net: stmmac: Introduce selftests support")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bitrev32 function returns an u32 var, not an int. Fix it.
Fixes: 0efedbf11f ("net: stmmac: xgmac: Fix XGMAC selftests")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bitrev32 function returns an u32 var, not an int. Fix it.
Fixes: 477286b53f ("stmmac: add GMAC4 core support")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Function ptp_clock_register() returns ERR_PTR() and never returns
NULL. The NULL test should be removed.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the device is not capable of supporting SR-IOV -ENODEV is being
returned; -EOPNOTSUPP is more appropriate.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ice_vc_dis_vf() tells iavf that it's going to perform a reset
and then performs a software reset. This is misleading based on
the function name because the VF does not get disabled. So fix
this by changing the name to ice_vc_reset_vf().
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ice_cleanup_hdrs() has been stripped of most of its content, it only serves
as a wrapper for eth_skb_pad(). We can get rid of it altogether and
simplify the codebase.
Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Print message to inform user of PCI link speed and width.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Print message to inform user if unsupported module is inserted, and
extend the topology / configuration detection.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The VF_MBX_ARQLEN register array is per-PF, not global, so we should not
use the absolute VF ID as an index. Instead, use the per-PF VF ID.
This fixes an issue with VFs on PFs other than 0 not seeing reset.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Without this check rebuild vsi can lead to kernel panic.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Save state to correct recovery memory and I/O BARs address
after PCI bus reset. Without this after reset kernel can't
read device registers.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Adjust ice_init_dcb to set the is_sw_lldp boolean
in the case where the FW has been detected to be
in an untenable state such that the driver
should forcibly make sure it is off.
This will ensure that the FW is in a known state.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
As part of the driver unload flow, a PF reset is issued which may still
cause an interrupt to be generated by the device. Do not clear the
interrupt scheme until the reset is complete and there are no pending
transactions otherwise a hardware error may occur.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If DCBx request non-contiguous TCs, then the driver will configure default
traffic class (TC0). This is done to prevent Tx hang since the driver
currently does not support non-contiguous TC.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The Boot Configuration Section Block has been moved to the Preserved Field
Area (PFA) of NVM. Update the NVM reads that involves Boot Configuration
Section.
Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Implement ethtool -m support to read eeprom data from SFP/QSFP modules.
Signed-off-by: Scott W Taylor <scott.w.taylor@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The "read-modify-write register index" function is declared with a
confusing prototype: the "mask" and "reg" arguments are swapped.
Fortunately, this does not affect callers so far. Both arguments are
u32, and the wrapper macros (ocelot_rmw_ix etc) have the arguments in
the correct order (the one from ocelot_io.c).
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
lag_upper_info may be NULL on slave removal.
Fixes: dc96ee3730 ("net: mscc: ocelot: add bonding support")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The check that the event is actually for this device should be moved
from the "port" handler to the net device handler.
Otherwise the port handler will deny bonding configuration for other
net devices in the same system (like enetc in the LS1028A) that don't
have the lag_upper_info->tx_type restriction that ocelot has.
Fixes: dc96ee3730 ("net: mscc: ocelot: add bonding support")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For vlan push action, if eswitch flow source capability is enabled, flow
source value compared with MLX5_VPORT_UPLINK enum, to determine uplink
port. This lead to syndrome in dmesg if try to add vlan push action.
For example:
$ tc filter add dev vxlan0 ingress protocol ip prio 1 flower \
enc_dst_port 4789 \
action tunnel_key unset pipe \
action vlan push id 20 pipe \
action mirred egress redirect dev ens1f0_0
$ dmesg
...
[ 2456.883693] mlx5_core 0000:82:00.0: mlx5_cmd_check:756:(pid 5273): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0xa9c090)
Use the correct enum value MLX5_FLOW_CONTEXT_FLOW_SOURCE_UPLINK.
Fixes: bb204dcf39fe ("net/mlx5e: Determine source port properly for vlan push action")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The rewrite data was no freed.
Fixes: 9db810ed2d ("net/mlx5: DR, Expose steering action functionality")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The value is already the calculation so remove the log prefix.
Fixes: e52c280240 ("net/mlx5: E-Switch, Add chains and priorities")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The phy_init_hw() function may reset the PHY to a configuration
that does not match manual network settings stored in the phydev
structure. If the phy state machine is polled rather than event
driven this can create a timing hazard where the phy state machine
might alter the settings stored in the phydev structure from the
value read from the BMCR.
This commit follows invocations of phy_init_hw() by the bcmgenet
driver with invocations of the genphy_config_aneg() function to
ensure that the BMCR is written to match the settings held in the
phydev structure. This prevents the risk of manual settings being
accidentally altered.
Fixes: 1c1008c793 ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 1f51548627.
This commit improved the chances of the umac resetting cleanly by
ensuring that the PHY was restored to its normal operation prior
to resetting the umac. However, there were still cases when the
PHY might not be driving a Tx clock to the umac during this window
(e.g. when the PHY detects no link).
The previous commit now ensures that the unimac receives clocks
from the MAC during its reset window so this commit is no longer
needed. This commit also has an unintended negative impact on the
MDIO performance of the UniMAC MDIO interface because it is used
before the MDIO interrupts are reenabled, so it should be removed.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As noted in commit 28c2d1a7a0 ("net: bcmgenet: enable loopback
during UniMAC sw_reset") the UniMAC must be clocked while sw_reset
is asserted for its state machines to reset cleanly.
The transmit and receive clocks used by the UniMAC are derived from
the signals used on its PHY interface. The bcmgenet MAC can be
configured to work with different PHY interfaces including MII,
GMII, RGMII, and Reverse MII on internal and external interfaces.
Unfortunately for the UniMAC, when configured for MII the Tx clock
is always driven from the PHY which places it outside of the direct
control of the MAC.
The earlier commit enabled a local loopback mode within the UniMAC
so that the receive clock would be derived from the transmit clock
which addressed the observed issue with an external GPHY disabling
it's Rx clock. However, when a Tx clock is not available this
loopback is insufficient.
This commit implements a workaround that leverages the fact that
the MAC can reliably generate all of its necessary clocking by
enterring the external GPHY RGMII interface mode with the UniMAC in
local loopback during the sw_reset interval. Unfortunately, this
has the undesirable side efect of the RGMII GTXCLK signal being
driven during the same window.
In most configurations this is a benign side effect as the signal
is either not routed to a pin or is already expected to drive the
pin. The one exception is when an external MII PHY is expected to
drive the same pin with its TX_CLK output creating output driver
contention.
This commit exploits the IEEE 802.3 clause 22 standard defined
isolate mode to force an external MII PHY to present a high
impedance on its TX_CLK output during the window to prevent any
contention at the pin.
The MII interface is used internally with the 40nm internal EPHY
which agressively disables its clocks for power savings leading to
incomplete resets of the UniMAC and many instabilities observed
over the years. The workaround of this commit is expected to put
an end to those problems.
Fixes: 1c1008c793 ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now the size of a Rx buffer was artificially limited
to 1536B (which happens to be the default, after reset, hardware
value for a Rx buffer). This approach however leaves unused
memory space for Rx packets, since the driver uses a paged
allocation scheme that reserves half a page for each Rx skb.
There's also the inconvenience that frames around 1536 bytes
can get scattered if the limit is slightly exceeded. This limit
can be exceeded even for standard MTU of 1500B traffic, for common
cases like stacked VLANs, or DSA tags.
To address these issues, let's just compute the buffer size
starting from the upper limit of 2KB (half a page) and
subtract the skb overhead and alignment restrictions.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>