Conform the following warning:
WARNING: ENOSYS means 'invalid syscall nr' and nothing else.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
napi_complete_done() allows to opt-in for gro_flush_timeout,
added back in linux-3.19, commit 3b47d30396
("net: gro: add a per device gro flush timer")
This allows for more efficient GRO aggregation without
sacrifying latencies.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In spite of switching to paged allocation of Rx buffers, the driver still
called dma_unmap_single() in the Rx queues tear-down path.
The DMA region unmapping code in free_skb_rx_queue() basically predates
the introduction of paged allocation to the driver. While being refactored,
it apparently hasn't reflected the change in the DMA API usage by its
counterpart gfar_new_page().
As a result, setting an interface to the DOWN state now yields the following:
# ip link set eth2 down
fsl-gianfar ffe24000.ethernet: DMA-API: device driver frees DMA memory with wrong function [device address=0x000000001ecd0000] [size=40]
------------[ cut here ]------------
WARNING: CPU: 1 PID: 189 at lib/dma-debug.c:1123 check_unmap+0x8e0/0xa28
CPU: 1 PID: 189 Comm: ip Tainted: G O 4.9.5 #1
task: dee73400 task.stack: dede2000
NIP: c02101e8 LR: c02101e8 CTR: c0260d74
REGS: dede3bb0 TRAP: 0700 Tainted: G O (4.9.5)
MSR: 00021000 <CE,ME> CR: 28002222 XER: 00000000
GPR00: c02101e8 dede3c60 dee73400 000000b6 dfbd033c dfbd36c4 1f622000 dede2000
GPR08: 00000007 c05b1634 1f622000 00000000 22002484 100a9904 00000000 00000000
GPR16: 00000000 db4c849c 00000002 db4c8480 00000001 df142240 db4c84bc 00000000
GPR24: c0706148 c0700000 00029000 c07552e8 c07323b4 dede3cb8 c07605e0 db535540
NIP [c02101e8] check_unmap+0x8e0/0xa28
LR [c02101e8] check_unmap+0x8e0/0xa28
Call Trace:
[dede3c60] [c02101e8] check_unmap+0x8e0/0xa28 (unreliable)
[dede3cb0] [c02103b8] debug_dma_unmap_page+0x88/0x9c
[dede3d30] [c02dffbc] free_skb_resources+0x2c4/0x404
[dede3d80] [c02e39b4] gfar_close+0x24/0xc8
[dede3da0] [c0361550] __dev_close_many+0xa0/0xf8
[dede3dd0] [c03616f0] __dev_close+0x2c/0x4c
[dede3df0] [c036b1b8] __dev_change_flags+0xa0/0x174
[dede3e10] [c036b2ac] dev_change_flags+0x20/0x60
[dede3e30] [c03e130c] devinet_ioctl+0x540/0x824
[dede3e90] [c0347dcc] sock_ioctl+0x134/0x298
[dede3eb0] [c0111814] do_vfs_ioctl+0xac/0x854
[dede3f20] [c0111ffc] SyS_ioctl+0x40/0x74
[dede3f40] [c000f290] ret_from_syscall+0x0/0x3c
--- interrupt: c01 at 0xff45da0
LR = 0xff45cd0
Instruction dump:
811d001c 7c66482e 813d0020 9061000c 807f000c 5463103a 7cc6182e 3c60c052
386309ac 90c10008 4cc63182 4826b845 <0fe00000> 4bfffa60 3c80c052 388402c4
---[ end trace 695ae6d7ac1d0c47 ]---
Mapped at:
[<c02e22a8>] gfar_alloc_rx_buffs+0x178/0x248
[<c02e3ef0>] startup_gfar+0x368/0x570
[<c036aeb4>] __dev_open+0xdc/0x150
[<c036b1b8>] __dev_change_flags+0xa0/0x174
[<c036b2ac>] dev_change_flags+0x20/0x60
Even though the issue was discovered in 4.9 kernel, the code in question
is identical in the current net and net-next trees.
Fixes: 75354148ce ("gianfar: Add paged allocation and Rx S/G")
Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Acked-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SGMII (internal PHY) can report decode errors via an interrupt. It
can also report autonegotiation status changes, but we don't need to track
those. The SGMII can recover automatically from most decode errors, so
we only reset the interface if we get multiple consecutive errors.
It's possible for bogus decode errors to be reported while the link is
being brought up. The interrupt is registered when the interface is
opened, and it's enabled after the link is up.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The EMAC driver does not support wake-on-lan, but there is still
code left-over that partially enables it. Remove that code and a few
macros that support it.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
emac_mac_start() uses information from the external PHY to program
the MAC, so it makes no sense to call it before the link is up.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Regardless of how the external PHY is configured, the internal PHY
(the "SGMII" block) is capable of configuring the SGMII link automatically.
When the external PHY link comes up, regardless of how it is configured,
the SGMII link is configured automatically.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PHY driver is attached only when the driver calls
phy_connect_direct(). Calling phy_attached_print() to display
information about the PHY driver prior to that point is meaningless.
The interface can be brought down, a new PHY driver can be loaded,
and the interface then brought back up. This is the correct time
to display information about the attached driver.
Since phy_attached_print() also prints information about the
interrupt, that needs to be set as well.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
init_ring(), refill_rx_ring() and start_tx() don't check
if mapping dma memory succeed.
The patch adds the checks and failure handling.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
The limitation to 10/100Mbit speeds on R-Car Gen3 is valid for R-Car H3
ES1.0 only. Check for the exact SoC model to allow 1Gbps on newer
revisions of R-Car H3, and on R-Car M3-W.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables tx and rx clock internal delay modes (TDM and RDM).
This is to address a failure in the case of 1Gbps communication using the
by salvator-x board with the KSZ9031RNX phy. This has been reported to
occur with both the r8a7795 (H3) and r8a7796 (M3-W) SoCs.
With this change APSR internal delay modes are enabled for
"rgmii-id", "rgmii-rxid" and "rgmii-txid" phy modes as follows:
phy mode | ASPR delay mode
-----------+----------------
rgmii-id | TDM and RDM
rgmii-rxid | RDM
rgmii-txid | TDM
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for 32 bit GEM in
64 bit system. It checks capability at runtime
and uses appropriate buffer descriptor.
Signed-off-by: Rafal Ozieblo <rafalo@cadence.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The macro is returning ETIME which means various checks to see if
the returned err is less than zero never work. I believe a -ETIME
should be returned instead.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The check on err < 0 is redundant and can be removed. Detected
by CoverityScan, CID#1398318 ("Logically Dead Code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The check on err < 0 is redundant and can be removed. Detected
by CoverityScan, CID#1398321 ("Logically Dead Code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DW GMAC databook says the following about bits in "Register 15 (Interrupt
Mask Register)":
--------------------------->8-------------------------
When set, this bit __disables_the_assertion_of_the_interrupt_signal__
because of the setting of XXX bit in Register 14 (Interrupt
Status Register).
--------------------------->8-------------------------
In fact even if we mask one bit in the mask register it doesn't prevent
corresponding bit to appear in the status register, it only disables
interrupt generation for corresponding event.
But currently we expect a bit different behavior: status bits to be in
sync with their masks, i.e. if mask for bit A is set in the mask
register then bit A won't appear in the interrupt status register.
This was proven to be incorrect assumption, see discussion here [1].
That misunderstanding causes unexpected behaviour of the GMAC, for
example we were happy enough to just see bogus messages about link
state changes.
So from now on we'll be only checking bits that really may trigger an
interrupt.
[1] https://lkml.org/lkml/2016/11/3/413
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Fabrice Gasnier <fabrice.gasnier@st.com>
Cc: Joachim Eastwood <manabian@gmail.com>
Cc: Phil Reid <preid@electromag.com.au>
Cc: David Miller <davem@davemloft.net>
Cc: Alexandre Torgue <alexandre.torgue@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On dcbnl callback getpgtccfgtx, the driver should check the ets
capability before ets query command is sent to firmware.
It is valid to return from this void function without changing in/out
parameters, as these parameters are initialized to
DCB_ATTR_VALUE_UNDEFINED.
Fixes: 3a6a931dfb ("net/mlx5e: Support DCBX CEE API")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Modifying TIR hash should change selected fields bitmask in addition to
the function and key.
Formerly, Only on ethool mlx5e_set_rxfh "ethtoo -X" we would not set this
field resulting in zeroing of its value, which means no packet fields are
used for RX RSS hash calculation thus causing all traffic to arrive in
RQ[0].
On driver load out of the box we don't have this issue, since the TIR
hash is fully created from scratch.
Tested:
ethtool -X ethX hkey <new key>
ethtool -X ethX hfunc <new func>
ethtool -X ethX equal <new indirection table>
All cases are verified with TCP Multi-Stream traffic over IPv4 & IPv6.
Fixes: bdfc028de1 ("net/mlx5e: Fix ethtool RX hash func configuration change")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We don't need to modify our TIRs unless the user requested a change in
the hash function/key, for example when changing indirection only.
Tested:
# Modify TIRs hash is needed
ethtool -X ethX hkey <new key>
ethtool -X ethX hfunc <new func>
# Modify TIRs hash is not needed
ethtool -X ethX equal <new indirection table>
All cases are verified with TCP Multi-Stream traffic over IPv4 & IPv6.
Fixes: bdfc028de1 ("net/mlx5e: Fix ethtool RX hash func configuration change")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When tunneling is used, some virtualizations systems set the (mlx5e) uplink
device to be stacked under upper devices such as bridge or ovs internal
port, where the VTEP IP address used for the encapsulation is set on
that upper device.
In order to support such use-cases, we also deal with a setup where the
egress mirred device isn't representing a port on the HW e-switch to where
the ingress device belongs. We use eswitch service function which returns
the uplink and set it as the egress device of the tc encap rule.
Fixes: a54e20b4fc ("net/mlx5e: Add basic TC tunnel set action for SRIOV offloads")
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We must re-enable RoCE on the e-switch management port (PF) only after destroying
the FDB in its switchdev/offloaded mode. Otherwise, when encapsulation is supported,
this re-enablement will fail.
Also, it's more natural and symmetric to disable RoCE on the PF before we create
the FDB under switchdev mode, so do that as well and revert if getting into error
during the mode change later.
Fixes: 9da34cd34e ('net/mlx5: Disable RoCE on the e-switch management [..]')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Make sure to return error when we failed retrieving the FDB steering
name space. Also, while around, correctly print the error when mode
change revert fails in the warning message.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When we fail to retrieve a hardware steering name-space, the returned error
code should say that this operation is not supported. Align the various
places in the driver where this call is made to this convention.
Also, make sure to warn when we fail to retrieve a SW (ANCHOR) name-space.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
As ENOTSUPP is specific to NFS, change the return error value to
EOPNOTSUPP in various places in the mlx5 driver.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Suggested-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
8000 series adapters support filtering VXLAN, NVGRE and GENEVE traffic
based on inner fields, and when the NIC recognises such traffic, it
does not match unencapsulated traffic filters any more. So add catch-
all filters for encapsulated traffic on supporting platforms.
Although recognition of VXLAN and GENEVE is based on UDP ports, and thus
will not occur until the driver (on the primary PF) notifies the
firmware of UDP ports to use, NVGRE will always be recognised, hence
without this patch 8000 series adapters will drop all NVGRE traffic.
Partly based on patches by Jon Cooper <jcooper@solarflare.com>.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rationalise several debug-or-warnings printks using netif_cond_dbg
to make output more consistent.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the NIC is switched from full-featured to low-latency, encapsulated
filters are no longer available, and this causes errors. This patch
removes those filters from the filter table on restore.
Also, if filters which are removed by the above, or which we fail to
insert when restoring filters, were UC, MC or broadcast default
filters, invalidate the corresponding vlan->default_filters entry.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"swiotlb buffer is full" errors occur after repeated initialisation of a
device - f.e. suspend/resume or ip link set up/down. This is because memory
mapped using dma_map_single() in ravb_ring_format() and ravb_start_xmit()
is not released. Resolve this problem by unmapping descriptors when
freeing rings.
Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
[simon: reworked]
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Accessing skb after submitting to input queue can cause
access to stale pointers if the skb ends up being transmitted
and freed by that time.
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PIO buffer allocation can fail for two valid reasons:
- we've run out of them (results in -ENOSPC)
- the NIC configuration doesn't support them (results in -EPERM)
Since both these failures are expected netif_err is excessive.
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For DLMs and SLMs on 80/81/83xx, many lane configurations
across different boards are coming up. Also kernel doesn't have
any way to identify board type/info and since firmware does,
just get rid of figuring out lane to serdes config and take
whatever has been programmed by low level firmware.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds support to set Rx/Tx queue sizes from ethtool. Fixes
an issue with retrieving queue size. Also sets SQ's CQ_LIMIT
based on configured Tx queue size such that HW doesn't process
SQEs when there is no sufficient space in CQ.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the binding was defined, I was not aware that mt2701 was an earlier
version of the SoC. For sake of consistency, the ethernet driver should
use mt2701 inside the compat string as this is the earliest SoC with the
ethernet core.
The ethernet driver is currently of no real use until we finish and
upstream the DSA driver. There are no users of this binding yet. It should
be safe to fix this now before it is too late and we need to provide
backward compatibility for the mt7623-eth compat string.
Reported-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_get_port_module_status() calls bnxt_update_link() which expects
RTNL to be held. In bnxt_sp_task() that does not hold RTNL, we need to
call it with a prior call to bnxt_rtnl_lock_sp() and the call needs to
be moved to the end of bnxt_sp_task().
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_update_link() is called from multiple code paths. Most callers,
such as open, ethtool, already hold RTNL. Only the caller bnxt_sp_task()
does not. So it is a bug to take RTNL inside bnxt_update_link().
Fix it by removing the RTNL inside bnxt_update_link(). The function
now expects the caller to always hold RTNL.
In bnxt_sp_task(), call bnxt_rtnl_lock_sp() before calling
bnxt_update_link(). We also need to move the call to the end of
bnxt_sp_task() since it will be clearing the BNXT_STATE_IN_SP_TASK bit.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In bnxt_sp_task(), we set a bit BNXT_STATE_IN_SP_TASK so that bnxt_close()
will synchronize and wait for bnxt_sp_task() to finish. Some functions
in bnxt_sp_task() require us to clear BNXT_STATE_IN_SP_TASK and then
acquire rtnl_lock() to prevent race conditions.
There are some bugs related to this logic. This patch refactors the code
to have common bnxt_rtnl_lock_sp() and bnxt_rtnl_unlock_sp() to handle
the RTNL and the clearing/setting of the bit. Multiple functions will
need the same logic. We also need to move bnxt_reset() to the end of
bnxt_sp_task(). Functions that clear BNXT_STATE_IN_SP_TASK must be the
last functions to be called in bnxt_sp_task(). The common scheme will
handle the condition properly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This work adds a number of tracepoints to paths that are either
considered slow-path or exception-like states, where monitoring or
inspecting them would be desirable.
For bpf(2) syscall, tracepoints have been placed for main commands
when they succeed. In XDP case, tracepoint is for exceptions, that
is, f.e. on abnormal BPF program exit such as unknown or XDP_ABORTED
return code, or when error occurs during XDP_TX action and the packet
could not be forwarded.
Both have been split into separate event headers, and can be further
extended. Worst case, if they unexpectedly should get into our way in
future, they can also removed [1]. Of course, these tracepoints (like
any other) can be analyzed by eBPF itself, etc. Example output:
# ./perf record -a -e bpf:* sleep 10
# ./perf script
sock_example 6197 [005] 283.980322: bpf:bpf_map_create: map type=ARRAY ufd=4 key=4 val=8 max=256 flags=0
sock_example 6197 [005] 283.980721: bpf:bpf_prog_load: prog=a5ea8fa30ea6849c type=SOCKET_FILTER ufd=5
sock_example 6197 [005] 283.988423: bpf:bpf_prog_get_type: prog=a5ea8fa30ea6849c type=SOCKET_FILTER
sock_example 6197 [005] 283.988443: bpf:bpf_map_lookup_elem: map type=ARRAY ufd=4 key=[06 00 00 00] val=[00 00 00 00 00 00 00 00]
[...]
sock_example 6197 [005] 288.990868: bpf:bpf_map_lookup_elem: map type=ARRAY ufd=4 key=[01 00 00 00] val=[14 00 00 00 00 00 00 00]
swapper 0 [005] 289.338243: bpf:bpf_prog_put_rcu: prog=a5ea8fa30ea6849c type=SOCKET_FILTER
[1] https://lwn.net/Articles/705270/
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first seven patches from Or Gerlitz in this series further enhances
the mlx5 SRIOV switchdev mode to support offloading IPv6 tunnels using the
TC tunnel key set (encap) and unset (decap) actions.
Or Gerlitz says:
========================
As part of doing this change, few cleanups are done in the IPv4 code,
later we move to use the full tunnel key info provided to the driver as
the key for our internal hashing which is used to identify cases where
the same tunnel is used for encapsulating multiple flows. As done in the
IPv4 case, the control path for offloading IPv6 tunnels uses route/neigh
lookups and construction of the IPv6 tunnel headers on the encap path and
matching on the outer hears in the decap path.
The last patch of the series enlarges the HW FDB size for the switchdev mode,
so it has now room to contain offloaded flows as many as min(max number
of HW flow counters supported, max HW table size supported).
========================
Next to Or's series you can find several patches handling several topics.
From Mohamad, add support for SRIOV VF min rate guarantee by using the
TSAR BW share weights mechanism.
From Or, Two patches to enable Eth VFs to query their min-inline value for
user-space.
for that we move a mlx5 low level min inline helper function from mlx5
ethernet driver into the core driver and then use it in mlx5_ib to expose
the inline mode to rdma applications through libmlx5.
From Kamal Heib, Reduce memory consumption on kdump kernel.
From Shaker Daibes, code reuse in CQE compression control logic
Thanks,
Saeed.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJYh7FNAAoJEEg/ir3gV/o+TjsIAL1e92+5eutBS9ZvhMARi+Tc
c2V9V8bG8W1RWWTvx1G0aU4nNjWsr5L8Q8gzqpwhrQITBfgpWd+hlnxQCucyhxC3
AC1qQ+AKREe/C+25D+WJRq34/61ZHEH2rbKZvpZ1O8SuicVPbcvJ9eM+wOEDxwwX
u5C5kWQ0HRtCcnFiiOYkB+0CQPH7m3+ZzZek+jDowrexHMSE+yl8ZNtaSTX9c9QN
bE2cPiCVZd7ufKPIwY8LWHBryyl7sh5P+NqzD633OeiqP/pkZsW9A+czyt+d330f
6XTKOS1PCD+TfHE0sZJT4VMCjICMHrOFbNRZuwcxJQ6NfmwIJZfskX4NLbyGQTI=
=vF7U
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2017-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2017-24-01
The first seven patches from Or Gerlitz in this series further enhances
the mlx5 SRIOV switchdev mode to support offloading IPv6 tunnels using the
TC tunnel key set (encap) and unset (decap) actions.
Or Gerlitz says:
========================
As part of doing this change, few cleanups are done in the IPv4 code,
later we move to use the full tunnel key info provided to the driver as
the key for our internal hashing which is used to identify cases where
the same tunnel is used for encapsulating multiple flows. As done in the
IPv4 case, the control path for offloading IPv6 tunnels uses route/neigh
lookups and construction of the IPv6 tunnel headers on the encap path and
matching on the outer hears in the decap path.
The last patch of the series enlarges the HW FDB size for the switchdev mode,
so it has now room to contain offloaded flows as many as min(max number
of HW flow counters supported, max HW table size supported).
========================
Next to Or's series you can find several patches handling several topics.
From Mohamad, add support for SRIOV VF min rate guarantee by using the
TSAR BW share weights mechanism.
From Or, Two patches to enable Eth VFs to query their min-inline value for
user-space.
for that we move a mlx5 low level min inline helper function from mlx5
ethernet driver into the core driver and then use it in mlx5_ib to expose
the inline mode to rdma applications through libmlx5.
From Kamal Heib, Reduce memory consumption on kdump kernel.
From Shaker Daibes, code reuse in CQE compression control logic
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If requesting msi-x interrupts fails in alx_request_irq we fall back to
a single tx queue and msi or legacy interrupts.
Currently the adapter stops working in this case and we get tx watchdog
timeouts. For reasons unknown the adapter gets confused when we load the
dma adresses to the chip in alx_init_ring_ptrs twice: the first time with
multiple queues and the second time in the fallback case with a single
queue.
To fix this move the the call to alx_reinit_rings (which calls
alx_init_ring_ptrs) after alx_request_irq. At this time it is clear how
much tx queues we have and which dma addresses we use.
Fixes: d768319cd4 ("alx: enable multiple tx queues")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If requesting msi-x interrupts fails we should fall back to msi or
legacy interrupts. However alx_realloc_ressources don't call
alx_init_intr, so we fail to set the right number of tx queues.
This results in watchdog timeouts and a nonfunctional adapter.
Fixes: d768319cd4 ("alx: enable multiple tx queues")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The condition to free the descriptor memory is wrong, we want to free the
memory if it is set and not if it is unset. Invert the test to fix this
issue.
Fixes: b0999223f224b ("alx: add ability to allocate and free alx_napi structures")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using the ibmveth driver in a KVM/QEMU based VM, it currently
always prints out a scary error message like this when it is started:
ibmveth 71000003 (unregistered net_device): unable to change
checksum offload settings. 1 rc=-2 ret_attr=71000003
This happens because the driver always tries to enable the checksum
offloading without checking for the availability of this feature first.
QEMU does not support checksum offloading for the spapr-vlan device,
thus we always get the error message here.
According to the LoPAPR specification, the "ibm,illan-options" property
of the corresponding device tree node should be checked first to see
whether the H_ILLAN_ATTRIUBTES hypercall and thus the checksum offloading
feature is available. Thus let's do this in the ibmveth driver, too, so
that the error message is really only limited to cases where something
goes wrong, and does not occur if the feature is just missing.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify the drivers/net/ethernet/{Makefile,Kconfig} file to make them a
part of the network drivers build.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add definitions that support receive side scaling.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the driver interfaces required for support by the ethtool utility.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add functions to interface with the hardware and some utility functions.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add common functions for Atlantic hardware abstraction layer.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add functions that handle the PCI bus interface.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add functions to manululate the vector of receive and transmit rings.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel.Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add Atlantic A0 and B0 specific functions.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for code specific to the Atlantic NIC.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add definitions of functions that interface directly with the hardware.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel.Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add code to support the transmit and receive ring buffers.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add files containing the functions and definitions used in common in
different functional areas.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patches to create the make and configuration files.
Signed-off-by: Alexander Loktionov <Alexander.Loktionov@aquantia.com>
Signed-off-by: Dmitrii Tarakanov <Dmitrii.Tarakanov@aquantia.com>
Signed-off-by: Pavel Belous <Pavel.Belous@aquantia.com>
Signed-off-by: Dmitry Bezrukov <Dmitry.Bezrukov@aquantia.com>
Signed-off-by: David M. VomLehn <vomlehn@texas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 51b7b1c34e (KSZ8851-SNL: Add ethtool support for
EEPROM via eeprom_93cx6, 2011-11-21) this structure member is
unused. Delete it.
Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is intended for code reuse of mlx5e_modify_rx_cqe_compression
function.
Signed-off-by: Shaker Daibes <shakerd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reduce memory consumption on kdump kernel by decreasing the number of
channels to 1 and the size of RQs and SQs to the minimal values.
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
So we can use that from the IB driver too in downstream patches.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add support for SRIOV VF min rate guarantee by using the TSAR BW share
weights mechanism.
The TSAR BW share vport attribute represents the weight of that vport
among the other vports weights which means that the actual vport BW
percentage is the same vport weight percentage among the total vports
weights sum.
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The E-Switch FDB size was hard coded to 8k. Change it to be
min(max eswitch table size, max flow counters * num flow groups)
where the max values are read from the firmware and the number of
flow groups is hard-coded as before this change.
We don't know upfront the division of flows to group. This setup allows
each group to be of size up to the where we want to support (we mandate
pairing of flows with counters for offloading). Thus, we don't expect
multiple occurences for a group which in turn adds steering hops.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Tested-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add the missing parts for offloading IPv6 tunnels. This includes
route and neigh lookups and construnction of the IPv6 tunnel headers.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Use more fields out of the tunnel key (e.g the tunnel source IP address)
provided by upper layers for the route lookup done on the encap offload path.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently we use subset of the input tunnel key fields (id, ip daddr,
dst port) which are provided by upper layers to indentify flows that should
go through the same encapsulation and maintain the HW encapsulation table.
This is redundant and can get us wrong.
Instead, keep a copy of the ip tunnel info provided by the user
through TC and have the tunnel key part as the key to our internal hash.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Move around some settings of variables as pre-step to make things
more robust and clear for the ipv6 case in down-stream patch.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Enhance the parsing of offloaded TC rules to set HW matching on outer
IPv6 encapsulation headers. This effectively adds support for TC tunnel
key release action (decapsulation) of SRIOV offloads over IPv6 tunnels.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The current code is allocating the max encap size supported by
the firmware and not the size requested by the caller, fix that.
Also, spare a warning when the size of the encapsulation headers
is bigger from what is supported by the firmware.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.
As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using the MPSC register, add the functions that configure port-based
packet sampling in hardware and the necessary datatypes in the
mlxsw_sp_port struct. In addition, add the necessary trap for sampled
packets and integrate with matchall offloading to allow offloading of the
sample tc action.
The current offload support is for the tc command:
tc filter add dev <DEV> parent ffff: \
matchall skip_sw \
action sample rate <RATE> group <GROUP> [trunc <SIZE>]
Where only ingress qdiscs are supported, and only a combination of
matchall classifier and sample action will lead to activating hardware
packet sampling.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MPSC register allows to configure ingress packet sampling on specific
port of the mlxsw device. The sampled packets are then trapped via
PKT_SAMPLE trap.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw_sp_nexthop_group_mac_update() is called in one of two cases:
1) When the MAC of a nexthop needs to be updated
2) When the size of a nexthop group has changed
In the second case the adjacency entries for the nexthop group need to
be reallocated from the adjacency table. In this case we must write to
the entries the MAC addresses of all the nexthops that should be
offloaded and not only those whose MAC changed. Otherwise, these entries
would be filled with garbage data, resulting in packet loss.
Fixes: a7ff87acd9 ("mlxsw: spectrum_router: Implement next-hop routing")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prior to this patch we were using a hardcoded RGMII TX clock delay of
2ns (= 1/4 cycle of the 125MHz RGMII TX clock). This value works for
many boards, but unfortunately not for all (due to the way the actual
circuit is designed, sometimes because the TX delay is enabled in the
PHY, etc.). Making the TX delay on the MAC side configurable allows us
to support all possible hardware combinations.
This allows fixing a compatibility issue on some boards, where the
RTL8211F PHY is configured to generate the TX delay. We can now turn
off the TX delay in the MAC, because otherwise we would be applying the
delay twice (which results in non-working TX traffic).
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Declare net_device_ops structure as const as it is only stored in
the netdev_ops field of a net_device structure. This field is of type
const, so net_device_ops structures having same properties can be made
const too.
Done using Coccinelle:
@r1 disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p={...};
@ok1@
identifier r1.i;
position p;
struct net_device ndev;
@@
ndev.netdev_ops=&i@p
@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p
@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct net_device_ops i;
File size before:
text data bss dec hex filename
6201 744 0 6945 1b21 ethernet/xilinx/xilinx_emaclite.o
File size after:
text data bss dec hex filename
6745 192 0 6937 1b19 ethernet/xilinx/xilinx_emaclite.o
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Declare net_device_ops structure as const as it is only stored in
the netdev_ops field of a net_device structure. This field is of type
const, so net_device_ops structures having same properties can be made
const too.
Done using Coccinelle:
@r1 disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p={...};
@ok1@
identifier r1.i;
position p;
struct net_device ndev;
@@
ndev.netdev_ops=&i@p
@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p
@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct net_device_ops i;
File size before:
text data bss dec hex filename
4821 744 0 5565 15bd ethernet/moxa/moxart_ether.o
File size after:
text data bss dec hex filename
5373 192 0 5565 15bd ethernet/moxa/moxart_ether.o
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During reset, functions emac_mac_down() and emac_mac_up() are called,
so we don't want to free and claim the IRQ unnecessarily. Move those
operations to open/close.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The EMAC has an internal PHY that is often called the "SGMII". This
SGMII is also connected to an external PHY, which is managed by phylib.
These dual PHYs often cause confusion. In this case, the data structure
for managing the SGMII was mis-named and located in the wrong header file.
Structure emac_phy is renamed to emac_sgmii to clearly indicate it applies
to the internal PHY only. It also also moved from emac_phy.h (which
supports the external PHY) to emac_sgmii.h (where it belongs).
To keep the changes minimal, only the structure name is changed, not
the names of any variables of that type.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 4cace675d6 ("bnx2x: Alloc 4k fragment for each rx ring buffer
element") added extra put_page() and get_page() calls on arches where
PAGE_SIZE=4K like x86
Reorder things to avoid this overhead.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Cc: Yuval Mintz <Yuval.Mintz@cavium.com>
Cc: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The xgbe_init() routine returns a return code indicating success or
failure, but the return code is not checked. Add code to xgbe_init()
to issue a message when failures are seen and add code to check the
xgbe_init() return code.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A newer version of the hardware is using the same PCI ids for the network
device but has altered register definitions for determining the window
settings for the indirect PCS access. Add support to check for this
hardware and if found use the new register values.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add supporf for the SYSTEMPORT Lite Ethernet controller, this piece of hardware
is largely based on the full-blown SYSTEMPORT and differs in the following:
- no full-blown UniMAC, instead we have the MagicPacket matching from UniMAC at
same offset, and a GMII Interface Block (GIB) for the MAC-level stuff, since
we are always interfaced to an Ethernet switch which is fully Ethernet compliant
shortcuts could be made
- 16 transmit queues, whose interrupts are moved into the first Level-2 interrupt
controller bank
- slight TDMA offset change (a register was inserted after TDMA_STATUS, *sigh*)
- 256 RX descriptors (512 words) and 256 TX descriptors (not visible)
As a consequence of these two things, update the code paths accordingly to
differentiate the full-blown from the light version.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for adding SYSTEMPORT Lite, which has twice as less transmit
queues than SYSTEMPORT make sure we do allocate TX rings based on the
systemport,txq property to get an appropriate memory footprint.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the LS mask when setting EEE timer.
LS field is 10 bits long and not 11 as currently.
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Reported-By: Rayagond Kokatanur <rayagond@vayavyalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To make the code clearer, use rb_entry() instead of container_of() to
deal with rbtree.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This series includes some updates for mlx5 core and mlx5e netdevice driver.
From Leon, a small fix that remove an unnecessary print.
From Eli Cohen, a fix to the FW version printout in case of internal error.
From Eugenia Emantayev, two patches, the 1st adds mlx5 1pps (pulse per
second) mlx5 infrastructure support and the 2nd adds the necessary bits
for mlx5e ptp logic and structures.
From Mohamad, add support for s-tagged packet receive when in promiscuous
mode.
Form Gal Pressman, MCAM (Management capabilities mask register) and PCAM
(Ports capabilities mask register) registers infrastructure, those
registers are needed in order to query the different statistics registers
support in FW, in order for the driver to enable/disable query and
reporting them back to user. On top of this infrastructure we've exposed
new set of statistics groups:
- MPCNT: Physical layer statistical counters (For symbol errors)
- PPCNT: PCIe performance counters
In addition to the statistics capabilities series we've moved the mlx5 HCA
capabilities fields to a dedicated struct under the driver private data.
At the end a small patch to update & query statistics in the most desired
order.
Thanks,
Saeed.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJYgT2XAAoJEEg/ir3gV/o++x8H/2adUTYToQVH9T9KdeHYcpYj
LtN36/8WFL5MliMpK0DcmuKe9k45ukN5bJGUDhwndJBsHJledBoFw3C6k4vZl0Qw
NiP4t165xmwYQrqI75KVeeGqNWl6LanozZzJVsOM48mSjOXClPnz5BFR4UgL5gTh
q60VmqpSeBjT0EQfT18s1DZCdUY6UUK1XgmgNnFsHUhO/iWVPlNEwItblC2N/YWA
p7lGUAJmAQvDN2sejzz0ElcCieY8yA+cZHgalW0KZK961RCwIl1GECw5xEcLLGxN
O88jaDvTuJpZtl0IOsBi9dwZHx64dx1a+wkFGv+GA6eTgiQ5kPgb2Jdhy26jf9g=
=Hy/5
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2017-01-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 and mlx5e updates 2017-01-19
This series includes some updates for mlx5 core and mlx5e netdevice driver.
From Leon, a small fix that remove an unnecessary print.
From Eli Cohen, a fix to the FW version printout in case of internal error.
From Eugenia Emantayev, two patches, the 1st adds mlx5 1pps (pulse per
second) mlx5 infrastructure support and the 2nd adds the necessary bits
for mlx5e ptp logic and structures.
From Mohamad, add support for s-tagged packet receive when in promiscuous
mode.
Form Gal Pressman, MCAM (Management capabilities mask register) and PCAM
(Ports capabilities mask register) registers infrastructure, those
registers are needed in order to query the different statistics registers
support in FW, in order for the driver to enable/disable query and
reporting them back to user. On top of this infrastructure we've exposed
new set of statistics groups:
- MPCNT: Physical layer statistical counters (For symbol errors)
- PPCNT: PCIe performance counters
In addition to the statistics capabilities series we've moved the mlx5 HCA
capabilities fields to a dedicated struct under the driver private data.
At the end a small patch to update & query statistics in the most desired
order.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
A driver using dev_alloc_page() must not reuse a page allocated from
emergency memory reserve.
Otherwise all packets using this page will be immediately dropped,
unless for very specific sockets having SOCK_MEMALLOC bit set.
This issue might be hard to debug, because only a fraction of received
packets would be dropped.
Fixes: 4415a0319f ("net/mlx5e: Implement RX mapped page cache for page recycle")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After adding cpsw_set_ringparam ethtool op, better to carry out
common parts of similar ops splitting descriptors in runtime. It
allows to reuse these parts and shows what the ops actually do.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to duplicate the same function in rx handler to get info
if any interface is running.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to create additional vars to identify if interface is running.
So simplify code by removing redundant var and checking usage counter
instead.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to disable interrupts if no open devices,
they are disabled anyway.
Even no need to disable interrupts if some ndev is opened, In this
case shared resources are not touched, only parameters of ndev shell,
so no reason to disable them also. Removed lines have proved it.
So, no need in redundant check and interrupt disable.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Common res usage is possible only in case an interface is
running. In case of not dual emac here can be only one interface,
so while ndo_open and switch mode, only one interface can be opened,
thus if open is called no any interface is running ... and no common
res are used. So remove check on dual emac, it will simplify
code/understanding and will match the name it's called.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
gcc-7 and probably earlier versions get confused by this function
and print a harmless warning:
drivers/net/ethernet/broadcom/bcm63xx_enet.c: In function 'bcm_enet_open':
drivers/net/ethernet/broadcom/bcm63xx_enet.c:1130:3: error: 'phydev' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This adds an initialization for the 'phydev' variable when it is unused
and changes the check to test for that NULL pointer to make it clear
that we always pass a valid pointer here.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct qed_ll2_info is rather large, so putting it on the stack
can cause an overflow, as this warning tries to tell us:
drivers/net/ethernet/qlogic/qed/qed_ll2.c: In function 'qed_ll2_start':
drivers/net/ethernet/qlogic/qed/qed_ll2.c:2159:1: error: the frame size of 1056 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
qed_ll2_start_ooo() already uses a dynamic allocation for the structure
to work around that problem, and we could do the same in qed_ll2_start()
as well as qed_roce_ll2_start(), but since the structure is only
used to pass a couple of initialization values here, it seems nicer
to replace it with a different structure.
Lacking any idea for better naming, I'm adding 'struct qed_ll2_conn',
which now contains all the initialization data, and this now simply
gets copied into struct qed_ll2_info rather than assigning all members
one by one.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The two new variables are only used inside of an #ifdef and cause
harmless warnings when that is disabled:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c: In function 'init_one':
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:4646:9: error: unused variable 'port_vec' [-Werror=unused-variable]
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:4646:6: error: unused variable 'v' [-Werror=unused-variable]
This adds another #ifdef around the declarations.
Fixes: 96fe11f27b ("cxgb4: Implement ndo_get_phys_port_id for mgmt dev")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 04aeb56a17 ("net/mlx4_en: allocate non 0-order pages for RX
ring with __GFP_NOMEMALLOC") added code that appears to be not needed at
that time, since mlx4 never used __GFP_MEMALLOC allocations anyway.
As using memory reserves is a must in some situations (swap over NFS or
iSCSI), this patch adds this flag.
Note that this driver does not reuse pages (yet) so we do not have to
add anything else.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 3e88449344.
With commit 529ed12752 ("net: phy: phy drivers should not set
SUPPORTED_[Asym_]Pause"), phylib now handles automatically enabling
pause frame support in the PHY, and the MAC driver should follow suit.
Since the EMAC driver driver does this, we no longer need to force
pause frames support.
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reorder update stats flow to update most important counters last,
to get more accurate results.
New update order:
- PCIe counters
- Port counters
- Vport counters
- Queue counters
- Software counters
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
The caps structure consists of hca caps and port/management caps,
all under one roof.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Use ethtool -S to query physical layer statistical counters including:
- rx_symbol_errors_phy: Number of symbol errors that were not corrected
by FEC correction algorithm or that FEC was not active on this interface.
- rx_corrected_bits_phy: Number of corrected bits according to active
FEC (RS/FC).
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
On load_one, we now cache our capabilities registers internally, similar
to QUERY_HCA_CAP. Capabilities can later be queried using macros
introduced in this patch.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Introduced registers will expose capabilities of new registers and
features related to port/management.
Driver will query MCAM and PCAM in order to avoid failing on old
firmwares with lack of support.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Today when the driver enter to promiscuous mode or vlan
filter is disabled, we add flow rule to receive any c-taggd
packets, therefore s-tagged packets are dropped.
In order to receive s-tagged packets as well we need to add
flow rule to receive any s-tagged packet.
Fixes: 7cb21b794b ('net/mlx5e: Rename en_flow_table.c to en_fs.c')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This patch enables the 1PPS IN and 1PPS OUT support according
to the advertised HCA capability. Single pin may be configured
to one of the above mutual exclusive functions via standard
Linux tools and APIs. For example, testptp open source application.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Implement query and set functionality for MTPPS and MTPPSE registers.
MTPPS (Management Pulse Per Second) provides the device PPS capabilities,
configures the PPS in and out modules and holds the PPS in time stamp.
Query MTPPS is supported only when HCA_CAP.pps is set and modify is supported
when HCA_CAP.pps_modify is set.
MTPPSE (Management Pulse Per Second Event) configures the different event
generation modes for PPS. Supported when HCA_CAP.pps is set.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Firmware representation of the firmware version on the health buffer has
changed for newer device. The representation in the initialization
segment does not and will not change. In addition, we print the health
buffer firmware version as a raw hex number.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Infiniband part of mlx5 driver can be compiled as a module
or as a part of bzImage (compiled in). In the second case,
the call to request_module will return an error -ENOENT.
It will cause to a misleading print "failed request module
on mlx5_ib".
This patch removes this print, In order to comply with mlx4.
Fixes: f66f049fb7 ("net/mlx5_core: Request the mlx5 IB module on driver load")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
A driver using dev_alloc_page() must not reuse a page that had to
use emergency memory reserve.
Otherwise all packets using this page will be immediately dropped,
unless for very specific sockets having SOCK_MEMALLOC bit set.
This issue might be hard to debug, because only a fraction of the RX
ring buffer would suffer from drops.
Fixes: 75354148ce ("gianfar: Add paged allocation and Rx S/G")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Claudiu Manoil <claudiu.manoil@freescale.com>
Acked-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch does the following:
- MACB/GEM-PTP interface
- registers and bitfields for TSU
- capability flags to enable PTP per platform basis
Signed-off-by: Andrei Pistirica <andrei.pistirica@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A cleanup removed the only user of this variable
mlx5/core/en_ethtool.c: In function 'mlx5e_set_channels':
mlx5/core/en_ethtool.c:546:6: error: unused variable 'ncv' [-Werror=unused-variable]
Let's remove the declaration as well.
Fixes: 639e9e9416 ("net/mlx5e: Remove unnecessary checks when setting num channels")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The network stack no longer uses the last_rx member of struct net_device
since the bonding driver switched to use its own private last_rx in
commit 9f24273837 ("bonding: use last_arp_rx in slave_last_rx()").
However, some drivers still (ab)use the field for their own purposes and
some driver just update it without actually using it.
Previously, there was an accompanying comment for the last_rx member
added in commit 4dc89133f4 ("net: add a comment on netdev->last_rx")
which asked drivers not to update is, unless really needed. However,
this commend was removed in commit f8ff080dac ("bonding: remove
useless updating of slave->dev->last_rx"), so some drivers added later
on still did update last_rx.
Remove all usage of last_rx and switch three drivers (sky2, atp and
smc91c92_cs) which actually read and write it to use their own private
copy in netdev_priv.
Compile-tested with allyesconfig and allmodconfig on x86 and arm.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the TPA GRO code path, initialize the tcp_opt_len variable to 0 so
that it will be correct for packets without TCP timestamps. The bug
caused the SKB fields to be incorrectly set up for packets without
TCP timestamps, leading to these packets being rejected by the stack.
Reported-by: Andy Gospodarek <andrew.gospodarek@broadocm.com>
Acked-by: Andy Gospodarek <andrew.gospodarek@broadocm.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check "ch" on NULL first, then get ctlr.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Relax ordering(RO) is one feature of 82599 NIC, to enable this feature can
enhance the performance for some cpu architecure, such as SPARC and so on.
Currently it only supports one special cpu architecture(SPARC) in 82599
driver to enable RO feature, this is not very common for other cpu architecture
which really needs RO feature.
This patch add one common config CONFIG_ARCH_WANT_RELAX_ORDER to set RO feature,
and should define CONFIG_ARCH_WANT_RELAX_ORDER in sparc Kconfig firstly.
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Reviewed-by: Alexander Duyck <alexander.duyck@gmail.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Boundaries checks for the number of RX and TX should be checked by the
caller and not in the driver.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Boundaries checks for the number of RX, TX, other and combined channels
should be checked by the caller and not in the driver.
In addition, remove wrong memset on get channels as it overrides the cmd
field in the requester struct.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds more info to stmicro' Kconfig files in order to be clearer
that the driver can be used by ethernet cards based on 10/100/1000/EQOS
Synopsys IP Cores.
EQOS was also added stmmac/Kconfig Kconfig, since dwmac4 is in fact EQoS,
one of Synopsys Ethernet IPs. More info at:
https://www.synopsys.com/dw/ipdir.php?ds=dwc_ether_qos
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds bpf_xdp_adjust_head() support to mlx5e.
1. rx_headroom is added to struct mlx5e_rq. It uses
an existing 4 byte hole in the struct.
2. The adjusted data length is checked against
MLX5E_XDP_MIN_INLINE and MLX5E_SW2HW_MTU(rq->netdev->mtu).
3. The macro MLX5E_SW2HW_MTU is moved from en_main.c to en.h.
MLX5E_HW2SW_MTU is also moved to en.h for symmetric reason
but it is not a must.
v2:
- Keep the xdp specific logic in mlx5e_xdp_handle()
- Update dma_len after the sanity checks in mlx5e_xmit_xdp_frame()
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the needlessly global struct ethtool_ops ethoc_ethtool_ops static
to fix a sparse warning.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ensures that we report the key and indirection table the NIC is using,
rather than (if setting them failed earlier) what we wanted it to use.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use eth_zero_addr to assign zero address to the given address array
instead of memset when the second argument in memset is address
of zero. Also, it makes the code clearer
Signed-off-by: Shyam Saini <mayhs11saini@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function stmmac_dt_phy provides several possibilities for initializing
plat->mdio_node, all of which have the effect of increasing the reference
count of the assigned value. This field is not updated elsewhere, so the
value is live until the end of the lifetime of plat (devm_allocated), just
after the end of stmmac_remove_config_dt. Thus, add an of_node_put on
plat->mdio_node in stmmac_remove_config_dt. It is possible that the field
mdio_node is never initialized, but of_node_put is NULL-safe, so it is also
safe to call of_node_put in that case.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.
The callback set_link_ksettings no longer update the value
of advertising, as the struct ethtool_link_ksettings is
defined as const.
As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.
As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tests showed that when whole bandwidth is consumed, the latency for
various kind of traffic can reach high values. With saturated
link (e.g. with iperf from target to host) simple ping could take
significant amount of time. BQL proved to improve this situation
when implemented in mvneta driver. Measurements of ping latency
for 3 link speeds:
Speed | Latency w/o BQL | Latency with BQL
10 | 7-14 ms | 3.5 ms
100 | 2-12 ms | 0.6 ms
1000 | often timeout | up to 2ms
Decreasing latency as above result in sligt performance cost - 4kpps
(-1.4%) when pushing 64B packets via two bridged interfaces of Armada 38x.
For 1500B packets in the same setup, the mpstat tool showed +8% of
CPU occupation (default affinity, second CPU idle). Even though this
cost seems reasonable to take, considering other improvements.
This commit adds byte queue limit mechanism for the mvneta driver.
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Basing on xmit_more flag of the skb, TX descriptors can be concatenated
before flushing. This commit delay Tx descriptor flush if the queue is
running and if there is more skb's to send.
A maximum allowed number of descriptors for flushing at once due to
MVNETA_TXQ_UPDATE_REG(q) reqisters limitation, is 255. Because of that
a new macro was added (MVNETA_TXQ_DEC_SENT_MASK) in order to ensure that
concatenated amount of descriptor does not exceed that value.
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When running SRIOV, warnings for SRQ LIMIT events flood the Hypervisor's
message log when (correct, normally operating) apps use SRQ LIMIT events
as a trigger to post WQEs to SRQs.
Add more information to the existing debug printout for SRQ_LIMIT, and
output the warning messages only for the SRQ CATAS ERROR event.
Fixes: acba2420f9 ("mlx4_core: Add wrapper functions and comm channel and slave event support to EQs")
Fixes: e0debf9cb5 ("mlx4_core: Reduce warning message for SRQ_LIMIT event to debug level")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Save the qp context flags byte containing the flag disabling vlan stripping
in the RESET to INIT qp transition, rather than in the INIT to RTR
transition. Per the firmware spec, the flags in this byte are active
in the RESET to INIT transition.
As a result of saving the flags in the incorrect qp transition, when
switching dynamically from VGT to VST and back to VGT, the vlan
remained stripped (as is required for VST) and did not return to
not-stripped (as is required for VGT).
Fixes: f0f829bf42 ("net/mlx4_core: Add immediate activate for VGT->VST->VGT")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In function mlx4_cq_completion() and mlx4_cq_event(), the
radix_tree_lookup requires a rcu_read_lock.
This is mandatory: if another core frees the CQ, it could
run the radix_tree_node_rcu_free() call_rcu() callback while
its being used by the radix tree lookup function.
Additionally, in function mlx4_cq_event(), since we are adding
the rcu lock around the radix-tree lookup, we no longer need to take
the spinlock. Also, the synchronize_irq() call for the async event
eliminates the need for incrementing the cq reference count in
mlx4_cq_event().
Other changes:
1. In function mlx4_cq_free(), replace spin_lock_irq with spin_lock:
we no longer take this spinlock in the interrupt context.
The spinlock here, therefore, simply protects against different
threads simultaneously invoking mlx4_cq_free() for different cq's.
2. In function mlx4_cq_free(), we move the radix tree delete to before
the synchronize_irq() calls. This guarantees that we will not
access this cq during any subsequent interrupts, and therefore can
safely free the CQ after the synchronize_irq calls. The rcu_read_lock
in the interrupt handlers only needs to protect against corrupting the
radix tree; the interrupt handlers may access the cq outside the
rcu_read_lock due to the synchronize_irq calls which protect against
premature freeing of the cq.
3. In function mlx4_cq_event(), we change the mlx_warn message to mlx4_dbg.
4. We leave the cq reference count mechanism in place, because it is
still needed for the cq completion tasklet mechanism.
Fixes: 6d90aa5cf1 ("net/mlx4_core: Make sure there are no pending async events when freeing CQ")
Fixes: 225c7b1fee ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't use netdev_info and friends before the net_device is registered.
This avoids ugly messages like
"meson8b-dwmac c9410000.ethernet (unnamed net_device) (uninitialized):
Enable RX Mitigation via HW Watchdog Timer"
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As found by Olof's build bot, we gain a harmless warning about a
potential uninitialized variable reference in mlx5:
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c: In function 'parse_tc_fdb_actions':
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:769:13: warning: 'out_dev' may be used uninitialized in this function [-Wmaybe-uninitialized]
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:811:21: note: 'out_dev' was declared here
This was introduced through the addition of an 'IS_ERR/PTR_ERR' pair
that gcc is unfortunately unable to completely figure out.
The problem being gcc cannot tell that if(IS_ERR()) in
mlx5e_route_lookup_ipv4() is equivalent to checking if(err) later,
so it assumes that 'out_dev' is used after the 'return PTR_ERR(rt)'.
The PTR_ERR_OR_ZERO() case by comparison is fairly easy to detect
by gcc, so it can't get that wrong, so it no longer warns.
Hadar Hen Zion already attempted to fix the warning earlier by adding fake
initializations, but that ended up not fully addressing all warnings, so
I'm reverting it now that it is no longer needed.
Link: http://arm-soc.lixom.net/buildlogs/mainline/v4.10-rc3-98-gcff3b2c/
Fixes: a42485eb0e ("net/mlx5e: TC ipv4 tunnel encap offload error flow fixes")
Fixes: a757d108dc ("net/mlx5e: Fix kbuild warnings for uninitialized parameters")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 8000 series SFC NICs have 4K PIO buffers, rather than the 2K of
the 7000 series. Rather than having a hard-coded PIO buffer size
(ER_DZ_TX_PIOBUF_SIZE), read it from the GET_CAPABILITIES_V2 MCDI
response.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If an option descriptor has been sent on a queue but not followed by a
packet, there will have been no completion event, so the read and write
counts won't match and we'll think we can't do PIO. This combines with
the fact that we have two TX queues (for en/disable checksum offload),
and that both must be empty for PIO to happen.
This patch adds a separate "packet_write_count" that tracks the most
recent write_count we expect to see a completion event for; this excludes
option descriptors but _includes_ PIO descriptors (even though they look
like option descriptors). This is then used, rather than write_count,
in efx_nic_tx_is_empty().
We only bother to maintain packet_write_count on EF10, since on Siena
(a) there are no option descriptors and it always equals write_count, and
(b) there's no PIO, so we don't need it anyway.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Perform an emergency shutdown of the adapter and stop it from
continuing any further communication on the ports or DMA to the
host. This is typically used when the adapter and/or firmware
have crashed and we want to prevent any further accidental
communication with the rest of the world. This will also force
the port Link Status to go down -- if register writes work --
which should help our peers figure out that we're down.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During interface opening MAC address stored in netdev->dev_addr is
programmed in the HW with exception of BE3 VFs where the initial
MAC is programmed by parent PF. This is OK when MAC address is not
changed when an interfaces is down. In this case the requested MAC is
stored to netdev->dev_addr and later is stored into HW during opening.
But this is not done for all BE3 VFs so the NIC HW does not know
anything about this change and all traffic is filtered.
This is the case of bonding if fail_over_mac == 0 where the MACs of
the slaves are changed while they are down.
The be2net behavior is too restrictive because if a BE3 VF has
the FILTMGMT privilege then it is able to modify its MAC without
any restriction.
To solve the described problem the driver should take care about these
privileged BE3 VFs so the MAC is programmed during opening. And by
contrast unpriviled BE3 VFs should not be allowed to change its MAC
in any case.
Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
BE3 VFs without FILTMGMT privilege are not allowed to modify its MAC,
VLAN table and UC/MC lists. So don't try to delete MAC on such VFs.
Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return value from be_mcc_notify_wait() contains a base completion status
together with an additional status. The base_status() macro need to be
used to access base status.
Fixes: e3a7ae2 be2net: Changing MAC Address of a VF was broken
Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
The #warning was present 10 years ago when the driver first got merged.
As the platform is rather obsolete by now, it seems very unlikely that
the warning will cause anyone to fix the code properly.
kernelci.org reports the warning for every build in the meantime, so
I think it's better to just turn it into a code comment to reduce
noise.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use eth_zero_addr to assign zero address to the given address array
instead of memset when the second argument in memset is address
of zero which makes the code clearer and also add header
file linux/etherdevice.h
Signed-off-by: Shyam Saini <mayhs11saini@gmail.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Disable BH around the call to napi_schedule() to avoid following warning
[ 52.095499] NOHZ: local_softirq_pending 08
[ 52.421291] NOHZ: local_softirq_pending 08
[ 52.608313] NOHZ: local_softirq_pending 08
Fixes: 8d59de8f7b ("net/mlx4_en: Process all completions in RX rings after port goes up")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Erez Shitrit <erezsh@mellanox.com>
Cc: Eugenia Emantayev <eugenia@mellanox.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The break statement should be indented one more tab.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver is no longer necessary since it was merged into stmmac.
Acked-by: Lars Persson <larper@axis.com>
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The region set by the call to memset, immediately overwritten by
the subsequent call to memcpy and thus makes the memset redundant.
Also remove the memset((&info, 0, sizeof(info)) on line 398 because
info is memcpy()'ed to before being used in the loop and it isn't
used outside of the loop.
Signed-off-by: Shyam Saini <mayhs11saini@gmail.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>