The implementation of QP paravirtualization back in linux-3.7 included
some code that looks very dubious, and gcc-6 has grown smart enough
to warn about it:
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'verify_qp_parameters':
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3154:5: error: statement is indented as if it were guarded by... [-Werror=misleading-indentation]
if (optpar & MLX4_QP_OPTPAR_ALT_ADDR_PATH) {
^~
drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3144:4: note: ...this 'if' clause, but it is not
if (slave != mlx4_master_func_num(dev))
>From looking at the context, I'm reasonably sure that the indentation
is correct but that it should have contained curly braces from the
start, as the update_gid() function in the same patch correctly does.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 54679e1482 ("mlx4: Implement QP paravirtualization and maintain phys_pkey_cache for smp_snoop")
Signed-off-by: David S. Miller <davem@davemloft.net>
The device_reset() function may fail, so we have to check
its return value, e.g. to make deferred probing work correctly.
gcc warns about it because of the warn_unused_result attribute:
drivers/net/ethernet/mediatek/mtk_eth_soc.c: In function 'mtk_probe':
drivers/net/ethernet/mediatek/mtk_eth_soc.c:1679:2: error: ignoring return value of 'device_reset', declared with attribute warn_unused_result [-Werror=unused-result]
This adds the trivial error check to propagate the return value
to the generic platform device probe code.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Device drivers should not mess with the DMA mask directly,
but instead call dma_set_mask() etc if needed.
In case of the mtk_eth_soc driver, the mask already gets set
correctly when the device is created, and setting it again
is against the documented API.
This removes the incorrect setting.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
dma_alloc_coherent() expects a dma_addr_t pointer as its argument,
not an 'unsigned int', and gcc correctly warns about broken
code in the mtk_init_fq_dma function:
drivers/net/ethernet/mediatek/mtk_eth_soc.c: In function 'mtk_init_fq_dma':
drivers/net/ethernet/mediatek/mtk_eth_soc.c:463:13: error: passing argument 3 of 'dma_alloc_coherent' from incompatible pointer type [-Werror=incompatible-pointer-types]
This changes the type of the local variable to dma_addr_t.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adjusted nicvf structure such that all elements used in hot
path like napi, xmit e.t.c fall into same cache line. This reduced
no of cache misses and resulted in ~2% increase in no of packets
handled on a core.
Also modified elements with :1 notation to boolean, to be
consistent with other element definitions.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of calling get_page() for every receive buffer carved out
of page, set page's usage count at the end, to reduce no of atomic
calls.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the hardware buffer management framework had been introduced,
let's use it.
Tested-by: Sebastian Careba <nitroshift@yahoo.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Buffer manager (BM) is a dedicated hardware unit that can be used by all
ethernet ports of Armada XP and 38x SoC's. It allows to offload CPU on RX
path by sparing DRAM access on refilling buffer pool, hardware-based
filling of descriptor ring data and better memory utilization due to HW
arbitration for using 'short' pools for small packets.
Tests performed with A388 SoC working as a network bridge between two
packet generators showed increase of maximum processed 64B packets by
~20k (~555k packets with BM enabled vs ~535 packets without BM). Also
when pushing 1500B-packets with a line rate achieved, CPU load decreased
from around 25% without BM to 20% with BM.
BM comprise up to 4 buffer pointers' (BP) rings kept in DRAM, which
are called external BP pools - BPPE. Allocating and releasing buffer
pointers (BP) to/from BPPE is performed indirectly by write/read access
to a dedicated internal SRAM, where internal BP pools (BPPI) are placed.
BM hardware controls status of BPPE automatically, as well as assigning
proper buffers to RX descriptors. For more details please refer to
Functional Specification of Armada XP or 38x SoC.
In order to enable support for a separate hardware block, common for all
ports, a new driver has to be implemented ('mvneta_bm'). It provides
initialization sequence of address space, clocks, registers, SRAM,
empty pools' structures and also obtaining optional configuration
from DT (please refer to device tree binding documentation). mvneta_bm
exposes also a necessary API to mvneta driver, as well as a dedicated
structure with BM information (bm_priv), whose presence is used as a
flag notifying of BM usage by port. It has to be ensured that mvneta_bm
probe is executed prior to the ones in ports' driver. In case BM is not
used or its probe fails, mvneta falls back to use software buffer
management.
A sequence executed in mvneta_probe function is modified in order to have
an access to needed resources before possible port's BM initialization is
done. According to port-pools mapping provided by DT appropriate registers
are configured and the buffer pools are filled. RX path is modified
accordingly. Becaues the hardware allows a wide variety of configuration
options, following assumptions are made:
* using BM mechanisms can be selectively disabled/enabled basing
on DT configuration among the ports
* 'long' pool's single buffer size is tied to port's MTU
* using 'long' pool by port is obligatory and it cannot be shared
* using 'short' pool for smaller packets is optional
* one 'short' pool can be shared among all ports
This commit enables hardware buffer management operation cooperating with
existing mvneta driver. New device tree binding documentation is added and
the one of mvneta is updated accordingly.
[gregory.clement@free-electrons.com: removed the suspend/resume part]
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bulk free of SKBs happen transparently by the API call napi_consume_skb().
The napi budget parameter is needed by napi_consume_skb() to detect
if called from netpoll.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bulk free of SKBs happen transparently by the API call napi_consume_skb().
The napi budget parameter is usually needed by napi_consume_skb()
to detect if called from netpoll. In this patch it has an extra meaning.
For mlx4 driver, the mlx4_en_stop_port() call is done outside
NAPI/softirq context, and cleanup the entire TX ring via
mlx4_en_free_tx_buf(). The code mlx4_en_free_tx_desc() for
freeing SKBs are shared with NAPI calls.
To handle this shared use the zero budget indication is reused,
and handled appropriately in napi_consume_skb(). To reflect this,
variable is called napi_mode for the function call that needed
this distinction.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For pcie nic, after setting link speed and there is no link driver does not need
to do phy reset until link up.
For some pcie nics, to do this will also reset phy speed down counter and prevent
phy from auto speed down.
This patch fix the issue reported in following link.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1547151
Signed-off-by: Chunhao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Firmware now tells us that the reset is done by passing a magic value
via register. Use it to shorten the wait in case this is supported.
With old firmware, we still wait until the timeout is reached.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On AT91 SoCs, the User Register (USRIO) exposes a switch to configure the
"Reduced" or "Traditional" version of the Media Independent Interface
(RMII vs. MII or RGMII vs. GMII).
As on the older EMAC version, on GMAC, this switch is set by default to the
non-reduced type of interface, so use the existing capability and extend it to
GMII as well. We then keep the current logic in the macb_init() function.
The capabilities of sama5d2, sama5d4 and sama5d3 GEM interface are updated in
the macb_config structure to be able to properly enable them with a traditional
interface (GMII or MII).
Reported-by: Romain HENRIET <romain.henriet@l-acoustics.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is OF-DPA specific, used only there, similar to
ofdpa_port->ageing_time. So move it to OF-DPA code.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the scenario where slowpath configuration isn't passing due to
various pause configurations affecting the chip, the theoretical time
required in worst-case-scenario to empty hw fifos sufficiently to
guarantee that slowpath configuration would flow is currently
insufficient.
This increases such a drain request to the theoretical maximum.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Handle a new message from the MFW, one that indicate that the transciever
state has changed, and log that into the system logs.
Signed-off-by: Zvi Nachmani <Zvi.Nachmani@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Driver interaction with the managemnt firmware is done via mailbox
commands which the management firmware periodically sample, as well
as placing of additional data in set places in the shared memory.
Each PF has a single designated mailbox address, and all flows that
require messaging to the management should use it.
This patch does 2 things:
1. It re-defines the critical section surrounding the mailbox sending -
that section should include the setting of the shared memory as well as
the sending of the command [otherwise a race might send a command with
the data of a different command].
2. It moves the locking scheme from using mutices into using spinlocks.
This lays the groundwork for sending MFW commands from non-sleepable
contexts.
Signed-off-by: Tomer Tayar <Tomer.Tayar@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When device is configured for Multi-function mode, some older management
firmware might incorrectly notify interfaces of link changes while they
haven't requested the physical link configuration to be set.
This can create bizzare race conditions where unloading interfaces are
getting notified that the link is up.
Let the driver compensate - store the logical requested state of the link
and don't propagate notifications after protocol driver explicitly
requires the link to be unset.
Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't hide varibles used by the logging macros.
Miscellanea:
o Use the more common ##__VA_ARGS__ extension
o Add missing newlines to formats
o Realign arguments
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In rocker, ageing time is a per-port attribute, so the next time the FDB
cleanup timer fires should be set according to the lowest ageing time.
This will later allow us to delete the BR_MIN_AGEING_TIME macro, which was
added to guarantee minimum ageing time in the bridge layer, thereby breaking
existing behavior.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit c62987bbd8 ("bridge: push bridge setting ageing_time down to
switchdev") added a check for minimum and maximum ageing time, but this
breaks existing behaviour where one can set ageing time to 0 for a
non-learning bridge.
Push this check down to the driver and allow the check in the bridge
layer to be removed. Currently ageing time 0 is refused by the driver,
but we can later add support for this functionality.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce offloading of skbedit mark action.
For example, to mark with 0x1234, all TCP (ip_proto 6) packets arriving
to interface ens9:
# tc qdisc add dev ens9 ingress
# tc filter add dev ens9 protocol ip parent ffff: \
flower ip_proto 6 \
indev ens9 \
action skbedit mark 0x1234
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parse tc_cls_flower_offload into device specific commands and program
the hardware to classify and act accordingly.
For example, to drop ICMP (ip_proto 1) packets from specific smac, dmac,
src_ip, src_ip, arriving to interface ens9:
# tc qdisc add dev ens9 ingress
# tc filter add dev ens9 protocol ip parent ffff: \
flower ip_proto 1 \
dst_mac 7c:fe:90:69:81:62 src_mac 7c:fe:90:69:81:56 \
dst_ip 11.11.11.11 src_ip 11.11.11.12 indev ens9 \
action drop
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend ndo_setup_tc() to support ingress tc offloading. Will be used by
later patches to offload tc flower filter.
Feature is off by default and could be enabled by issuing:
# ethtool -K eth0 hw-tc-offload on
Offloads flow table is dynamically created when first filter is
added.
Rules are saved in a hash table that is maintained by the consumer (for
example - the flower offload in the next patch).
When last filter is removed and no filters exist in the hash table, the
offload flow table is destroyed.
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the vlan and main flow tables to use priority 1. This will allow
the upcoming TC offload logic to use a higher priority (0) for the
offload steering table.
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Restricting handle to TC_H_ROOT breaks the old instantiation of mqprio
to setup a hardware qdisc. This patch relaxes the test, to only check the
type.
Fixes: 08fb1da ("net/mlx5e: Support DCBNL IEEE ETS")
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to handle flow table entry destinations only if the action
associated with the rule is forwarding (MLX5_FLOW_CONTEXT_ACTION_FWD_DEST).
Fixes: 26a8145390 ('net/mlx5_core: Introduce flow steering firmware commands')
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the Makefile and Kconfig required to make the driver build.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ethernet support for MediaTek SoCs from the MT7623 family. These have
dual GMAC. Depending on the exact version, there might be a built-in
Gigabit switch (MT7530). The core does not have the typical DMA ring setup.
Instead there is a linked list that we add descriptors to. There is only
one linked list that both MACs use together. There is a special field
inside the TX descriptors called the VQID. This allows us to assign packets
to different internal queues. By using a separate id for each MAC we are
able to get deterministic results for BQL. Additionally we need to
provide the core with a block of scratch memory that is the same size as
the RX ring and data buffer. This is really needed to make the HW datapath
work. Although the driver does not support this yet, we still need to
assign the memory and tell the core about it for RX to work.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Michael Lee <igvtee@gmail.com>
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
'commit 55482edc25
("qede: Add slowpath/fastpath support and enable hardware GRO")'
introduces below error when compiling net-next with "make ARCH=x86_64"
drivers/built-in.o: In function `qede_rx_int':
qede_main.c:(.text+0x6101a0): undefined reference to `tcp_gro_complete'
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o While the driver is in the middle of a MB completion processing
and it receives a spurious MB interrupt, it is mistaken as a good MB
completion interrupt leading to premature completion of the next MB
request. Fix the driver to guard against this by checking the current
state of MB processing and ignore the spurious interrupt.
Also added a stats counter to record this condition.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o atomic_t usage is incorrect as we are not implementing
any atomicity.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Queue Set Configuration code was always reserving room for a
Forwarded interrupt Queue even in the cases where we weren't using it.
Figure out how many Ports and Queue Sets we can support. This depends on
knowing our Virtual Function Resources and may be called a second time
if we fall back from MSI-X to MSI Interrupt Mode. This change fixes that
problem.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This avoids a race condition where a system that has network devices set up
to be automatically configured and we get the first Port Link Status
message from the firmware on the Asynchronous Firmware Event Queue before
we've enabled interrupts. If that happens, we end up losing the interrupt
and never realizing that the links has actually come up.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iff dma_map_single() fails, 'rxdesc' should point to the last filled RX
descriptor, so that it can be marked as the last one, however the driver
would have already advanced it by that time. In order to fix that, only
fill an RX descriptor once all the data for it is ready.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a low memory situation, if netdev_alloc_skb() fails on a first RX ring
loop iteration in sh_eth_ring_format(), 'rxdesc' is still NULL. Avoid
kernel oops by adding the 'rxdesc' check after the loop.
Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add pci_error_handler callbacks to support for pcie advanced error
recovery.
Signed-off-by: Satish Baddipadige <sbaddipa@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include the more useful port statistics in ethtool -S for the PF device.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include some of the port error counters (e.g. crc) in ->ndo_get_stats64()
for the PF device.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gather periodic port statistics if the device is PF and link is up. This
is triggered in bnxt_timer() every one second to request firmware to DMA
the counters.
Signed-off-by: Michael Chan <michael.chan@broadocm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow all autoneg speeds aupported by firmware to be advertised. If
the advertising parameter is 0, then all supported speeds will be
advertised.
Remove BNXT_ALL_COPPER_ETHTOOL_SPEED which is no longer used as all
supported speeds can be advertised.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The supported bits and advertising bits in ethtool have the same
definitions. The same is true for the firmware bits. So use the
common function to handle the conversion for both supported and
advertising bits.
v2: Don't use parentheses on function return.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And report actual pause settings to ETHTOOL_GPAUSEPARAM to let ethtool
resolve the actual pause settings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include the conversion of pause bits and add one extra call layer so
that the same refactored function can be reused to get the link partner
advertisement bits.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several cases of overlapping changes, as well as one instance
(vxlan) of a bug fix in 'net' overlapping with code movement
in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
It will always be passed if the soc is tested the loopback cases. This
patch will fix this bug.
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to Documentation/power/devices.txt
The driver should not use device_set_wakeup_enable() which is the policy
for user to decide.
Using device_init_wakeup() to initialize dev->power.should_wakeup and
dev->power.can_wakeup on driver initialization.
And use device_may_wakeup() on suspend to decide if WoL function should
be enabled on NIC.
Reported-by: Diego Viola <diego.viola@gmail.com>
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise it might be back on resume right after going to suspend in
some hardware.
Reported-by: Diego Viola <diego.viola@gmail.com>
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>