tc transparently maps the software priority number to hardware. Update
it to pass the major priority which is what most drivers expect. Update
drivers too so they do not need to lshift the priority field of the
flow_cls_common_offload object. The stmmac driver is an exception, since
this code assumes the tc software priority is fine, therefore, lshift it
just to be conservative.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add 2 new selftests for VLAN Insertion offloading. Tests are for inner
and outer VLAN offloading.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds the logic to insert a given VLAN ID in a packet. This is offloaded
to HW and its descriptor based. For now, only XGMAC implements the
necessary callbacks.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for EEE in XGMAC cores by implementing the necessary
callbacks.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add 4 new tests:
- SA Insertion (register based)
- SA Insertion (descriptor based)
- SA Replacament (register based)
- SA Replacement (descriptor based)
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the support for Source Address Insertion and Replacement in XGMAC
cores. Two methods are supported: Descriptor based and register based.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the ethtool interface to dump the register map in XGMAC cores.
Changes from v2:
- Remove uneeded memset (Jakub)
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a counter that increments each time a packet with split header is
received.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the support for Split Header feature in the RX path and enable it in
XGMAC cores.
This does not impact neither beneficts bandwidth but it does reduces CPU
usage because without the feature all the entire packet is memcpy'ed,
while that with the feature only the header is.
With Split Header disabled 'perf stat -d' gives:
86870.624945 task-clock (msec) # 0.429 CPUs utilized
1073352 context-switches # 0.012 M/sec
1 cpu-migrations # 0.000 K/sec
213 page-faults # 0.002 K/sec
327113872376 cycles # 3.766 GHz (62.53%)
56618161216 instructions # 0.17 insn per cycle (75.06%)
10742205071 branches # 123.658 M/sec (75.36%)
584309242 branch-misses # 5.44% of all branches (75.19%)
17594787965 L1-dcache-loads # 202.540 M/sec (74.88%)
4003773131 L1-dcache-load-misses # 22.76% of all L1-dcache hits (74.89%)
1313301468 LLC-loads # 15.118 M/sec (49.75%)
355906510 LLC-load-misses # 27.10% of all LL-cache hits (49.92%)
With Split Header enabled 'perf stat -d' gives:
49324.456539 task-clock (msec) # 0.245 CPUs utilized
2542387 context-switches # 0.052 M/sec
1 cpu-migrations # 0.000 K/sec
213 page-faults # 0.004 K/sec
177092791469 cycles # 3.590 GHz (62.30%)
68555756017 instructions # 0.39 insn per cycle (75.16%)
12697019382 branches # 257.418 M/sec (74.81%)
442081897 branch-misses # 3.48% of all branches (74.79%)
20337958358 L1-dcache-loads # 412.330 M/sec (75.46%)
3820210140 L1-dcache-load-misses # 18.78% of all L1-dcache hits (75.35%)
1257719198 LLC-loads # 25.499 M/sec (49.73%)
685543923 LLC-load-misses # 54.51% of all LL-cache hits (49.86%)
Changes from v2:
- Reword commit message (Jakub)
Changes from v1:
- Add performance info (David)
- Add misssing dma_sync_single_for_device()
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return the correct value when RX descriptor is not the last one.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to add Split Header support, stmmac_rx() needs to take into
account that packet may be split accross multiple descriptors.
Refactor the logic of this function in order to support this scenario.
Changes from v2:
- Fixup if condition detection (Jakub)
- Don't stop NAPI with unfinished packet (Jakub)
- Use napi_alloc_skb() (Jakub)
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TX Timestamp in XGMAC comes from MAC instead of descriptors. Implement
this in a new callback.
Also, RX Timestamp in XGMAC must be cheked against corruption and we need
a barrier to make sure that descriptor fields are read correctly.
Changes from v2:
- Rework return code check (Jakub)
Changes from v1:
- Rework the get timestamp function (David)
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Because we don't care about the individual files, we can remove the
stored dentry for the files, as they are not needed to be kept track of
at all.
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Jose Abreu <joabreu@synopsys.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: netdev@vger.kernel.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a selftest for the Flexible RX Parser feature.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
XGMAC cores also support the Flexible RX Parser feature. Add the support
for it in the XGMAC core.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
XGMAC also supports Safety Features. This patch implements the
configuration and handling of this feature in XGMAC core.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a selftest for VLAN and Double VLAN Filtering in stmmac.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the VLAN Hash Filtering feature in XGMAC core.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the RSS functionality and add the corresponding callbacks in
XGMAC core.
Changes from v1:
- Do not use magic constants (Jakub)
- Use ethtool_rxfh_indir_default() (Jakub)
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the TX Queue Priority callback in XGMAC core.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the TX Queue Weight callback. In order for this to be active
we also need to set ETS algorithm when configuring Queue.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the MMC counters feature in XGMAC core.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not try to return a fragment entry from TC list. Otherwise we may not
clean properly allocated entries.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When queues >= 4 we use different registers but we were not subtracting
the offset of 4. Fix this.
Found out by Coverity.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixup the XGMAC selftests by correctly finishing the implementation of
set_filter callback.
Result:
$ ethtool -t enp4s0
The test result is PASS
The test extra info:
1. MAC Loopback 0
2. PHY Loopback -95
3. MMC Counters -95
4. EEE -95
5. Hash Filter MC 0
6. Perfect Filter UC 0
7. MC Filter 0
8. UC Filter 0
9. Flow Control 0
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This variant of netif_napi_add() should be used from drivers
using NAPI to exclusively poll a TX queue.
Signed-off-by: Frode Isaksen <fisaksen@baylibre.com>
Tested-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't need dev_err() messages when platform_get_irq() fails now that
platform_get_irq() prints an error message itself when something goes
wrong. Let's remove these prints with a simple semantic patch.
// <smpl>
@@
expression ret;
struct platform_device *E;
@@
ret =
(
platform_get_irq(E, ...)
|
platform_get_irq_byname(E, ...)
);
if ( \( ret < 0 \| ret <= 0 \) )
{
(
-if (ret != -EPROBE_DEFER)
-{ ...
-dev_err(...);
-... }
|
...
-dev_err(...);
)
...
}
// </smpl>
While we're here, remove braces on if statements that only have one
statement (manually).
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
With recent changes that introduced support for Page Pool in stmmac, Jon
reported that NFS boot was no longer working on an ARM64 based platform
that had the IP behind an IOMMU.
As Page Pool API does not guarantee DMA syncing because of the use of
DMA_ATTR_SKIP_CPU_SYNC flag, we have to explicit sync the whole buffer upon
re-allocation because we are always re-using same pages.
In fact, ARM64 code invalidates the DMA area upon two situations [1]:
- sync_single_for_cpu(): Invalidates if direction != DMA_TO_DEVICE
- sync_single_for_device(): Invalidates if direction == DMA_FROM_DEVICE
So, as we must invalidate both the current RX buffer and the newly allocated
buffer we propose this fix.
[1] arch/arm64/mm/cache.S
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Fixes: 2af6106ae9 ("net: stmmac: Introducing support for Page Pool")
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Tested-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit d01f449c00 ("of_net: add NVMEM support to of_get_mac_address")
added support for reading the MAC address from an nvmem-cell. This
required changing the logic to return an error pointer upon failure.
If stmmac is loaded before the nvmem provider driver then
of_get_mac_address() return an error pointer with -EPROBE_DEFER.
Propagate this error so the stmmac driver will be probed again after the
nvmem provider driver is loaded.
Default to a random generated MAC address in case of any other error,
instead of using the error pointer as MAC address.
Fixes: d01f449c00 ("of_net: add NVMEM support to of_get_mac_address")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The stmmaceth clock is specified by the slave_bus and apb_pclk clocks in
the device tree bindings for snps,dwc-qos-ethernet-4.10 compatible nodes
of this IP.
The subdrivers for these bindings will be requesting the stmmac clock
correctly at a later point, so there is no need to request it here and
cause an error message to be printed to the kernel log.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Tegra EQOS driver already resets the MDIO bus at probe time via the
reset GPIO specified in the phy-reset-gpios device tree property. There
is no need to reset the bus again later on.
This avoids the need to query the device tree for the snps,reset GPIO,
which is not part of the Tegra EQOS device tree bindings. This quiesces
an error message from the generic bus reset code if it doesn't find the
snps,reset related delays.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some glue logic drivers support 1G without having GMAC/GMAC4/XGMAC.
Let's allow this speed by default.
Reported-by: Ondrej Jirman <megi@xff.cz>
Tested-by: Ondrej Jirman <megi@xff.cz>
Fixes: 5b0d7d7da6 ("net: stmmac: Add the missing speeds that XGMAC supports")
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need the memory to be zeroed upon allocation so use kcalloc()
instead.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RX Descriptors are being cleaned after setting the buffers which may
lead to buffer addresses being wiped out.
Fix this by clearing earlier the RX Descriptors.
Fixes: 2af6106ae9 ("net: stmmac: Introducing support for Page Pool")
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arguments are supposed to be ordered high then low.
Fixes: 293e4365a1 ("stmmac: change descriptor layout")
Fixes: 9f93ac8d40 ("net-next: stmmac: Add dwmac-sun8i")
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates flow_block_cb_setup_simple() to use the flow block API.
Several drivers are also adjusted to use it.
This patch introduces the per-driver list of flow blocks to account for
blocks that are already in use.
Remove tc_block_offload alias.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most drivers do the same thing to set up the flow block callbacks, this
patch adds a helper function to do this.
This preparation patch reduces the number of changes to adapt the
existing drivers to use the flow block callback API.
This new helper function takes a flow block list per-driver, which is
set to NULL until this driver list is used.
This patch also introduces the flow_block_command and
flow_block_binder_type enumerations, which are renamed to use
FLOW_BLOCK_* in follow up patches.
There are three definitions (aliases) in order to reduce the number of
updates in this patch, which go away once drivers are fully adapted to
use this flow block API.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. get hash table size in hw feature reigster, and add support
for taller hash table(128/256) in dwmac4.
2. only clear GMAC_PACKET_FILTER bits used in this function,
to avoid side effect to functions of other bits.
stmmac selftests output log with flow control on:
ethtool -t eth0
The test result is PASS
The test extra info:
1. MAC Loopback 0
2. PHY Loopback -95
3. MMC Counters 0
4. EEE -95
5. Hash Filter MC 0
6. Perfect Filter UC 0
7. MC Filter 0
8. UC Filter 0
9. Flow Control 0
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mac address array size is GMAC_MAX_PERFECT_ADDRESSES,
so the 'reg' should be less than it, or will affect other registers.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mapping and unmapping DMA region is an high bottleneck in stmmac driver,
specially in the RX path.
This commit introduces support for Page Pool API and uses it in all RX
queues. With this change, we get more stable troughput and some increase
of banwidth with iperf:
- MAC1000 - 950 Mbps
- XGMAC: 9.22 Gbps
Changes from v3:
- Use page_pool_destroy() (Ilias)
Changes from v2:
- Uncoditionally call page_pool_free() (Jesper)
Changes from v1:
- Use page_pool_get_dma_addr() (Jesper)
- Add a comment (Jesper)
- Add page_pool_free() call (Jesper)
- Reintroduce sync_single_for_device (Arnd / Ilias)
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit a993db88d1 ("net: stmmac: Enable support for > 32 Bits
addressing in XGMAC"), introduced support for > 32 bits addressing in
XGMAC but the conversion of descriptors to dma_addr_t was left out.
As some devices assing coherent memory in regions > 32 bits we need to
set lower and upper value of descriptors address when initializing DMA
channels.
Luckly, this was working for me because I was assigning CMA to < 4GB
address space for performance reasons.
Fixes: a993db88d1 ("net: stmmac: Enable support for > 32 Bits addressing in XGMAC")
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for coalescing RX path by specifying number of frames which
don't need to have interrupt on completion bit set.
This is only available when RX Watchdog is enabled.
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DWMAC4 is capable to support clause 45 mdio communication.
This patch enable the feature on stmmac_mdio_write() and
stmmac_mdio_read() by following phy_write_mmd() and
phy_read_mmd() mdiobus read write implementation format.
Reviewed-by: Li, Yifan <yifan2.li@intel.com>
Signed-off-by: Kweh Hock Leong <hock.leong.kweh@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings says:
"This is the wrong place to change the queue mapping.
stmmac_xmit() is called with a specific TX queue locked,
and accessing a different TX queue results in a data race
for all of that queue's state.
I think this commit should be reverted upstream and in all
stable branches. Instead, the driver should implement the
ndo_select_queue operation and override the queue mapping there."
Fixes: c5acdbee22 ("net: stmmac: Send TSO packets always from Queue 0")
Suggested-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable GMAC v4.xx and beyond to support 16KiB buffer.
Signed-off-by: Weifeng Voon <weifeng.voon@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>