This change updates the mlx5 interface for creating an mkey
on the device.
The updates in the command mailbox include widening the
access mode type field to 5 bits in order to support additional
types such as MLX5_MKC_ACCESS_MODE_MEMIC, which represents the
device memory access type and will be used when registering an MR
on allocated device memory.
All the places that use the old access mode format are adjusted as
well.
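As an illustrative sketch of the new format (the field names
access_mode_1_0 and access_mode_4_2 are assumptions based on the split
described above; MLX5_SET() is the usual mlx5 mailbox accessor):

  /* Write a 5-bit access mode into the mkey context mailbox. */
  static void set_mkc_access_mode(void *mkc, u8 mode)
  {
          MLX5_SET(mkc, mkc, access_mode_1_0, mode & 0x3);        /* low 2 bits */
          MLX5_SET(mkc, mkc, access_mode_4_2, (mode >> 2) & 0x7); /* high 3 bits */
  }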
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch adds querying of device memory capabilities by the mlx5_core
driver during initialization.
Device memory capabilities are a new capability type, with a
structure that contains the data needed for future device memory
allocation.
The presence of this new capabilities struct is indicated in the
general capabilities struct which is queried first by the driver.
If the presence bit is set, the driver will also query the new
capabilities struct and save it in the device context.
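A minimal sketch of the two-step query, assuming the capability
helpers MLX5_CAP_GEN() and mlx5_core_get_caps() and a MLX5_CAP_DEV_MEM
capability type:

  static int query_device_memory_caps(struct mlx5_core_dev *dev)
  {
          /* Presence bit in the general capabilities struct */
          if (!MLX5_CAP_GEN(dev, device_memory))
                  return 0;

          /* Query the new caps struct; the result is saved in the
           * device context for future device memory allocations. */
          return mlx5_core_get_caps(dev, MLX5_CAP_DEV_MEM);
  }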
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add two new parameters: max_burst_sz and typical_pkt_size (both
in bytes) to rate limit configurations.
max_burst_sz: The device will schedule bursts of packets for an
SQ connected to this rate, each smaller than or equal to this value.
A value of 0x0 indicates that packet bursts will be limited to the
device defaults. This field should be used if bursts of packets must
be strictly kept under a certain value.
typical_pkt_size: When the rate limit is intended for a stream of
similar packets, stating the typical packet size can improve the
accuracy of the rate limiter. The expected packet size will be
the same for all SQs associated with the same rate limit index.
The Ethernet driver is updated accordingly, but these two parameters
are kept at 0 because there is currently no proper way to obtain these
configurations from user space; doing so would require changing the
ndo_set_tx_maxrate interface.
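A sketch of the resulting configuration, with an assumed struct
layout:

  struct mlx5_rate_limit {
          u32 rate;             /* rate limit */
          u32 max_burst_sz;     /* burst bound in bytes, 0 = device default */
          u16 typical_pkt_size; /* typical packet size in bytes, 0 = unknown */
  };

All SQs pointing at the same rate limit index are expected to use the
same values for these fields.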
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
MLX4_USER_DEV_CAP_LARGE_CQE (via mlx4_ib_alloc_ucontext_resp.dev_caps)
and MLX4_IB_QUERY_DEV_RESP_MASK_CORE_CLOCK_OFFSET (via
mlx4_uverbs_ex_query_device_resp.comp_mask) are copied directly to
userspace and form part of the uAPI.
Move them to the uapi header where they belong.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Due to bug fixes found by the syzkaller bot and taken into the for-rc
branch after development for the 4.17 merge window had already started
being taken into the for-next branch, there were fairly non-trivial
merge issues that would need to be resolved between the for-rc branch
and the for-next branch. This merge resolves those conflicts and
provides a unified base upon which ongoing development for 4.17 can
be based.
Conflicts:
drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f95
(IB/mlx5: Fix cleanup order on unload) added to for-rc and
commit b5ca15ad7e (IB/mlx5: Add proper representors support)
added as part of the devel cycle; both needed to modify the
init/de-init functions used by mlx5. To support the new
representors, the new functions added by the cleanup patch
needed to be made non-static, and the init/de-init list
added by the representors patch needed to be modified to
match the init/de-init list changes made by the cleanup
patch.
Updates:
drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function
prototypes added by representors patch to reflect new function
names as changed by cleanup patch
drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init
stage list to match new order from cleanup patch
Signed-off-by: Doug Ledford <dledford@redhat.com>
Currently, ESN is not supported with IPsec device offload.
This patch adds ESN support to IPsec device offload.
It implements a new xfrm device operation to synchronize the
offloading device's ESN with the SN received by xfrm, and a new QP
command to update the SA state as illustrated below:
ESN 1 ESN 2 ESN 3
|-----------*-----------|-----------*-----------|-----------*
^ ^ ^ ^ ^ ^
^ - marks where the QP command is invoked to update the SA ESN state
machine.
| - marks the start of the ESN scope (0-2^32-1). At this point the SA
ESN overlap bit is set to zero and the ESN is incremented.
* - marks the middle of the ESN scope (2^31). At this point the SA
ESN overlap bit is set to one.
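A sketch of the wiring, assuming the xfrm device-offload hook is named
xdo_dev_state_advance_esn and with the handler body elided:

  static void mlx5e_xfrm_advance_esn_state(struct xfrm_state *x)
  {
          /* Issue the QP command that updates the SA ESN state
           * machine: set the overlap bit at 2^31, clear it and
           * increment the ESN at the start of each new scope. */
  }

  static const struct xfrmdev_ops mlx5e_ipsec_xfrmdev_ops = {
          .xdo_dev_state_advance_esn = mlx5e_xfrm_advance_esn_state,
          /* remaining xdo_* callbacks elided */
  };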
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Yossef Efraim <yossefe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add a new function for getting the driver's internal SA entry from an
xfrm state. All checks are done in one function.
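A sketch of such a helper, with the struct name assumed; the xfrm
offload handle is where the driver keeps its per-state pointer:

  static struct mlx5e_ipsec_sa_entry *to_ipsec_sa_entry(struct xfrm_state *x)
  {
          if (!x || !x->xso.offload_handle)
                  return NULL;

          return (struct mlx5e_ipsec_sa_entry *)x->xso.offload_handle;
  }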
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In order to add a context to the FPGA, we need to get both the software
transform context (which includes the keys, etc.) and the
source/destination IPs (which are included in the steering
rule). Therefore, we register a new set of firmware-like commands for
the FPGA. Each time a rule is added, the steering core infrastructure
calls the FPGA command layer. If the rule is intended for the FPGA,
it combines the IPs information with the software transformation
context and creates the respective hardware transform.
Afterwards, it calls the standard steering command layer.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The current code has one layer that executes FPGA commands, and the
Ethernet part uses this code directly. Since downstream patches
introduce support for IPsec in mlx5_ib, we need to provide some
abstractions. This patch refactors the accel code into one layer
that creates a software IPSec transformation and another one which
creates the actual hardware context.
The internal command implementation is now hidden in the FPGA
core layer. The code also adds the ability to share FPGA hardware
contexts. If two contexts are the same, only a reference count
is taken.
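A sketch of the sharing scheme (all names hypothetical): identical
contexts take a reference instead of creating a second hardware
transform.

  struct fpga_ipsec_ctx {
          struct hlist_node node;    /* hashed by xfrm context + IPs */
          refcount_t        refcnt;
          u32               hw_handle;
  };

  static struct fpga_ipsec_ctx *fpga_ipsec_ctx_get(struct fpga_ipsec_ctx *ctx)
  {
          refcount_inc(&ctx->refcnt);    /* share the existing transform */
          return ctx;
  }

  static void fpga_ipsec_ctx_put(struct fpga_ipsec_ctx *ctx)
  {
          if (refcount_dec_and_test(&ctx->refcnt))
                  kfree(ctx);            /* last user: tear down HW context */
  }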
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This patch adds V2 command support.
New FPGA devices support extended features (UDP encap, ESN, etc.);
these features require a new hardware SADB format, and therefore a new
version of the commands to manipulate it.
Signed-off-by: Yossef Efraim <yossefe@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Current hardware decrypts and authenticates incoming ESP packets.
Subsequently, the software extracts the nexthdr field, truncates the
trailer and adjusts csum accordingly.
With this patch and a capable device, the trailer is removed by
the hardware and the nexthdr field is conveyed via PET. This way
we avoid both the need to access the trailer (a cache miss) and the
need to compute its relative checksum, which significantly improves
performance.
Experiments (netperf) show that trailer removal improves performance
by 2Gbps, in both forwarding and host-to-host configurations.
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The current code assumes only SA QP commands.
Refactor in order to pave the way for new QP commands:
1. Generic cmd response format.
2. SA cmd checks are in dedicated functions.
3. Aligned debug prints.
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Fix a build break: mlx5_accel_ipsec_device_caps is not defined when
MLX5_ACCEL is not selected. Use MLX5_IPSEC_DEV instead, which handles
that case.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reported-by: Doug Ledford <dledford@redhat.com>
Previously, deleting a flow steering entry took only the index.
Since the FPGA implementation of FTE deletion might need to dig
inside the FTE itself, we would like to have the FTE's context.
Change the interface to pass the FTE context.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
fte objects contain the match value and action. Currently, extending
the actions requires adding them both to the API and to fs_fte.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently, we don't support egress flow steering namespace in mlx5
flow steering core implementation. However, when we want to encrypt
a packet, we model it as a flow steering rule in the egress path.
To overcome this, we add an empty egress namespace to flow steering.
This namespace is initialized only when ipsec support exists.
In the future, this will grow into a full blown flow steering
implementation, resembling the ingress path.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The shim layer allows each namespace to define possibly different
functionality for add/delete/update commands. The shim layer
introduced here will be used to support flow steering with the FPGA.
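The pattern is roughly the following (illustrative names, not the
exact API): a table of function pointers per namespace, so an
FPGA-backed namespace can install FPGA-aware callbacks while other
namespaces keep the default firmware commands.

  struct fs_cmds {
          int (*create_fte)(struct mlx5_core_dev *dev,
                            struct mlx5_flow_table *ft, struct fs_fte *fte);
          int (*update_fte)(struct mlx5_core_dev *dev,
                            struct mlx5_flow_table *ft, struct fs_fte *fte);
          int (*delete_fte)(struct mlx5_core_dev *dev,
                            struct mlx5_flow_table *ft, struct fs_fte *fte);
  };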
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The has_tag member will indicate whether a tag action was specified
in the flow specification.
A flow tag of 0 (MLX5_FS_DEFAULT_FLOW_TAG) is assumed to be a valid
flow tag and is currently used by the mlx5 RDMA driver, whereas in HW
flow_tag = 0 means that the user doesn't care about the flow_tag. HW
always provides a flow_tag = 0 if all flow tags requested on a
specific flow are 0.
So we need a way (in the driver) to differentiate between a user
really requesting flow_tag = 0 and a user who does not care, in order
to be able to report conflicting flow tags on a specific flow.
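A sketch of the disambiguation this enables (flow_act layout assumed):
two users of the same flow conflict only if both actually set a tag
and the tags differ.

  static bool flow_tags_conflict(const struct mlx5_flow_act *a,
                                 const struct mlx5_flow_act *b)
  {
          return a->has_flow_tag && b->has_flow_tag &&
                 a->flow_tag != b->flow_tag;
  }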
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Some flow steering namespace initialization (i.e. the egress namespace)
might depend on FPGA capabilities. Change the initialization order
so that the FPGA is initialized before flow steering.
Flow steering fs cmds initialization might depend on IPsec
capabilities, so change the initialization order such that IPsec is
initialized before flow steering as well.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This is already done by the xfrm layer, between the state_dev_del
callback and the state_dev_free callback.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Generally, FPGA IPsec commands must always complete.
We want to wait up to one minute for them to complete gracefully, even
when a process is being killed.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
All other mlx5_events report the port number as 1-based, which is how FW
reports it in the port event EQE. Reporting 0 for this event causes
mlx5_ib to not raise a fatal event notification to registered clients
due to a seemingly invalid port.
All switch cases in mlx5_ib_event that go through the port check are
supposed to set the port now, so just do it once at variable
declaration.
Fixes: 89d44f0a6c73 ("net/mlx5_core: Add pci error handlers to mlx5_core driver")
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Up until this point it wasn't possible to activate IB representors
when switching to switchdev mode; remove this limitation.
We trigger a reload of the PF IB interface in order to make sure that
already allocated resources are invalidated and new resources will be
opened correctly, with all the limitations of switchdev mode applied
(only raw packet capabilities, without RoCE). We also move the
remove/add to the place where the E-Switch mode is set/unset, to better
control when to trigger this action; this will allow the IB side to
start in the correct mode.
For better code reuse, create a function which reloads an interface and
export it.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Under switchdev mode we insert an eswitch miss rule causing any
unmatched traffic to be sent towards the PF vport. This miss rule can
be optimized by breaking it in two: one rule for multicast traffic
and the other for unicast.
Breaking the miss rule into two (unicast and multicast) allows the firmware
to program the hardware in a more efficient way.
Using ConnectX-5 Ex with IXIA and testpmd (which use IB representors):
IXIA -> NIC -> PF -> IB representor -> NIC -> VF:
- Without this optimization: 9.2 MPPS.
- With this optimization: 18 MPPS.
VF -> NIC -> IB representor-> PF -> NIC -> IXIA:
- Without this optimization: 17 MPPS.
- With this optimization: 23.4 MPPS.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The max FTE number should be the max number of SQs that can be opened.
Ethernet representors open one SQ each. Once we add IB representors
this will increase (depending on the user). For now let's start with 31
per IB representor and increase it in the future if needed.
This increase only affects the number of FTEs in the slow path FDB,
offloaded rules (done via TC on the fast path portion of the FDB)
aren't affected.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In preparation for IB representors, move the representor structs to
global scope and expose the functions needed for registration,
unregistration, eswitch mode, and creating a flow rule to direct
traffic from SQs to the right VF.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add a callback interface to get a protocol device (per representor type).
The Ethernet representors will expose their netdev via this interface.
This functionality can later be used by the IB representor in order to
find the corresponding net device representor.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The current implementation of CQ creation requires contiguous
memory. Such a requirement is problematic once memory is
fragmented or the system is low on memory; it causes failures in
dma_zalloc_coherent().
This patch implements a new scheme of fragmented CQs to overcome
this issue by introducing a new type, 'struct mlx5_frag_buf_ctrl',
for allocating fragmented buffers rather than contiguous ones.
Base the Completion Queues (CQs) on this new fragmented buffer.
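Addressing an entry then becomes a (fragment, offset) lookup rather
than simple pointer arithmetic; a sketch following the
mlx5_frag_buf_ctrl idea (field names illustrative):

  static void *frag_buf_get_entry(struct mlx5_frag_buf_ctrl *fbc, u32 ix)
  {
          unsigned int frag = ix >> fbc->log_frag_strides;  /* which fragment */

          return fbc->frag_buf.frags[frag].buf +
                 ((fbc->frag_sz_m1 & ix) << fbc->log_stride); /* offset inside */
  }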
It fixes crashes like the following:
kworker/29:0: page allocation failure: order:6, mode:0x80d0
CPU: 29 PID: 8374 Comm: kworker/29:0 Tainted: G OE 3.10.0
Workqueue: ib_cm cm_work_handler [ib_cm]
Call Trace:
[<>] dump_stack+0x19/0x1b
[<>] warn_alloc_failed+0x110/0x180
[<>] __alloc_pages_slowpath+0x6b7/0x725
[<>] __alloc_pages_nodemask+0x405/0x420
[<>] dma_generic_alloc_coherent+0x8f/0x140
[<>] x86_swiotlb_alloc_coherent+0x21/0x50
[<>] mlx5_dma_zalloc_coherent_node+0xad/0x110 [mlx5_core]
[<>] ? mlx5_db_alloc_node+0x69/0x1b0 [mlx5_core]
[<>] mlx5_buf_alloc_node+0x3e/0xa0 [mlx5_core]
[<>] mlx5_buf_alloc+0x14/0x20 [mlx5_core]
[<>] create_cq_kernel+0x90/0x1f0 [mlx5_ib]
[<>] mlx5_ib_create_cq+0x3b0/0x4e0 [mlx5_ib]
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The EQ structure and API are private to the mlx5_core driver only;
external drivers should not have access to, or the means to manipulate,
EQ objects.
Remove redundant exports and move API functions out of the linux/mlx5
include directory into the driver's mlx5_core.h private include file.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
Since the CQ tree is now per EQ, CQ completion and event forwarding
became an implementation detail of the EQ logic. This patch moves that
logic to eq.c and makes those functions static.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
Now that the CQ table is per EQ, add an API to hold/put a CQ, to be
used from eq.c in a downstream patch.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
Add an API to add/delete a CQ to/from the EQ's CQ table, to be used in
cq.c upon CQ creation/destruction, as the CQ table is now private to
eq.c.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
If a hardware event is targeting a CQ, that CQ should exist.
Add unlikely() to the error handling flows.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
Before this patch the driver had one CQ database protected by one
spinlock; this spinlock synchronizes between CQ adding/removing and
CQ IRQ interrupt handling.
On a system with a large number of CPUs and a workload that requires
lots of interrupts, this global spinlock becomes a very nasty hotspot,
introducing contention between the active cores, which significantly
hurts performance and becomes a bottleneck that prevents seamless cpu
scaling.
To solve this we simply move the CQ database and its spinlock to be per
EQ (IRQ), thus per core.
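Roughly, the database moves from one global instance to a copy
embedded in each EQ (layout illustrative):

  struct mlx5_cq_table {
          spinlock_t             lock;  /* protects the tree */
          struct radix_tree_root tree;  /* cqn -> mlx5_core_cq */
  };

  struct mlx5_eq {
          struct mlx5_cq_table   cq_table;  /* was global, now per EQ/IRQ */
          /* ... remaining EQ fields ... */
  };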
Tested with:
system: 2 sockets, 14 cores per socket, hyperthreading, 2x14x2=56 cores
netperf command: ./super_netperf 200 -P 0 -t TCP_RR -H <server> -l 30 -- -r 300,300 -o -s 1M,1M -S 1M,1M
WITHOUT THIS PATCH:
Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
Average: all 4.32 0.00 36.15 0.09 0.00 34.02 0.00 0.00 0.00 25.41
Samples: 2M of event 'cycles:pp', Event count (approx.): 1554616897271
Overhead Command Shared Object Symbol
+ 14.28% swapper [kernel.vmlinux] [k] intel_idle
+ 12.25% swapper [kernel.vmlinux] [k] queued_spin_lock_slowpath
+ 10.29% netserver [kernel.vmlinux] [k] queued_spin_lock_slowpath
+ 1.32% netserver [kernel.vmlinux] [k] mlx5e_xmit
WITH THIS PATCH:
Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
Average: all 4.27 0.00 34.31 0.01 0.00 18.71 0.00 0.00 0.00 42.69
Samples: 2M of event 'cycles:pp', Event count (approx.): 1498132937483
Overhead Command Shared Object Symbol
+ 23.33% swapper [kernel.vmlinux] [k] intel_idle
+ 1.69% netserver [kernel.vmlinux] [k] mlx5e_xmit
Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Gal Pressman <galp@mellanox.com>
Having these checks in ibmvnic_xmit causes problems with VLAN
tagging and balance-alb/tlb bonding modes. The restriction they
imposed can be removed.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For dwmac4, GMAC_INT_DEFAULT_ENABLE already includes
GMAC_INT_PMT_EN, so it is redundant to check if hw->pmt
is set and, if so, to set the bit again.
For dwmac1000, GMAC_INT_DEFAULT_MASK does not include
GMAC_INT_DISABLE_PMT, so it is redundant to check if
hw->pmt is set and, if so, to clear an already-cleared bit.
Improve code readability by removing this redundant code.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
GMAC_INT_DEFAULT_MASK is written to the interrupt enable register.
In previous versions of the IP (e.g. dwmac1000), this register was
instead an interrupt mask register.
To improve clarity and reflect reality, rename GMAC_INT_DEFAULT_MASK
to GMAC_INT_DEFAULT_ENABLE.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The interrupt status register in both dwmac1000 and dwmac4 ignores
interrupt enable (for dwmac4) / interrupt mask (for dwmac1000).
Therefore, if we want to check only the bits that can actually trigger
an irq, we have to filter the interrupt status register manually.
Commit 0a764db103 ("stmmac: Discard masked flags in interrupt status
register") fixed this for dwmac1000. Fix the same issue for dwmac4.
Just like commit 0a764db103 ("stmmac: Discard masked flags in
interrupt status register"), this makes sure that we do not get
spurious link up/link down prints.
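A sketch of the fix in the dwmac4 irq-status path (the function name
is hypothetical; register names follow the dwmac4 convention):

  static u32 dwmac4_pending_irqs(void __iomem *ioaddr)
  {
          u32 intr_status = readl(ioaddr + GMAC_INT_STATUS);
          u32 intr_enable = readl(ioaddr + GMAC_INT_EN);

          /* Discard status bits whose interrupt is not enabled, so
           * masked events cannot trigger spurious link up/down prints. */
          return intr_status & intr_enable;
  }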
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When allocating RX or TX buffer pools, the driver needs to provide a
unique mapping ID to firmware for each pool. This value is assigned
using a counter which is incremented after a new pool is created. The
ID can be an integer ranging from 1 to 255. When migrating to a device
that requests a different number of queues, this value was not being
reset properly. As a result, after enough migrations, the counter
exceeded the upper bound and pool creation failed. This is fixed by
resetting the counter to one in this case.
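In sketch form (the field name is an assumption):

  static void reset_pool_map_ids(struct ibmvnic_adapter *adapter)
  {
          /* Mapping IDs handed to firmware must stay within 1-255;
           * restart numbering when queue counts are renegotiated. */
          adapter->map_id = 1;
  }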
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2018-02-09
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Two fixes for BPF sockmap in order to break up circular map references
from programs attached to sockmap, and detaching related sockets in
case of socket close() event. For the latter we get rid of the
smap_state_change() and plug into ULP infrastructure, which will later
also be used for additional features anyway such as TX hooks. For the
second issue, dependency chain is broken up via map release callback
to free parse/verdict programs, all from John.
2) Fix a libbpf relocation issue that was found while implementing XDP
support for the Suricata project. The issue was that when clang was
invoked with the default target instead of the bpf target, various
other (e.g. debugging-relevant) sections were added to the ELF file,
containing relocation entries that point to non-BPF related sections,
which libbpf trips over instead of skipping. Test cases for libbpf are
added as well, from Jesper.
3) Various misc fixes for bpftool and one for libbpf: a small addition
to libbpf to make sure it recognizes all standard section prefixes.
Then, the Makefile in bpftool/Documentation is improved to explicitly
check for rst2man being installed on the system as we otherwise risk
installing empty man pages; the man page for bpftool-map is corrected
and a set of missing bash completions added in order to avoid shipping
bpftool where the completions are only partially working, from Quentin.
4) Fix applying the relocation to immediate load instructions in the
nfp JIT which were missing a shift, from Jakub.
5) Two fixes for the BPF kernel selftests: handle CONFIG_BPF_JIT_ALWAYS_ON=y
gracefully in test_bpf.ko module and mark them as FLAG_EXPECTED_FAIL
in this case; and explicitly delete the veth devices in the two tests
test_xdp_{meta,redirect}.sh before dismantling the netnses, as when
selftests are run in batch mode, the workqueue handling destruction
might not have finished yet and thus veth creation in the next test
under the same dev name would fail, from Yonghong.
6) Fix test_kmod.sh to check the test_bpf.ko module path before performing
an insmod, and fall back to modprobe. Especially the latter is useful
when having a device under test that has the modules installed instead,
from Naresh.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The Cavium thunder nicvf driver supports rx/tx rings of up to 65536
entries each. The number of entries is stored in the q_len member of
struct q_desc_mem. The problem is that q_len, being a u16, results in
65536 becoming 0.
In getting pointers to descriptors in the rings, the driver uses q_len minus 1
as a mask after incrementing the pointer, in order to go back to the beginning
and not go past the end of the ring.
With the q_len set to 0 the mask is no longer correct and the driver does go
beyond the end of the ring, causing various ills. Usually the first thing that
shows up is a "NETDEV WATCHDOG: enP2p1s0f1 (nicvf): transmit queue 7 timed out"
warning.
This patch remedies the problem by changing q_len to a u32.
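In sketch form, the failure mode:

  static u32 next_desc_broken(u32 idx)
  {
          u16 q_len = 65536;          /* truncates to 0 */

          /* 0 - 1 underflows to an all-ones mask, so the index is
           * never wrapped and runs past the end of the ring. With
           * q_len declared u32, the mask is the intended 0xffff. */
          return (idx + 1) & (q_len - 1);
  }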
Signed-off-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While handling a driver reset we get a H_CLOSED return trying
to send a CRQ event. When this occurs we need to queue up another
reset attempt. Without doing this we see instances where the driver
is left in a closed state because the reset failed and there are no
further attempts to reset the driver.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DKMS and similar out-of-tree module replacement services use
module version to make sure the out-of-tree software is not
older than the module shipped with the kernel. We use the
kernel version in the ethtool -i output, so put it into
MODULE_VERSION as well.
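A sketch of the change, using the kernel's generated release string:

  #include <generated/utsrelease.h>

  MODULE_VERSION(UTS_RELEASE);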
Reported-by: Jan Gutter <jan.gutter@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most FWs limit the number of TSO segments a frame can produce
to 64. This is for fairness and efficiency (of FW datapath)
reasons. If a frame with a larger number of segments is submitted,
the FW will drop it.
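A sketch of advertising the limit to the stack (the define and helper
name are assumptions; the 64-segment cap comes from the description
above):

  #define NFP_NET_LSO_MAX_SEGS  64  /* FW drops frames above this */

  static void nfp_net_set_tso_limit(struct net_device *netdev)
  {
          netdev->gso_max_segs = NFP_NET_LSO_MAX_SEGS;
  }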
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All netdevs which can accept TC offloads must implement
.ndo_set_features(). nfp_reprs currently do not do that, which
means hw-tc-offload can be turned on and off even when offloads
are active.
Whether the offloads are active is really a question to nfp_ports,
so remove the per-app tc_busy callback indirection thing, and
simply count the number of offloaded items in nfp_port structure.
Fixes: 8a2768732a ("nfp: provide infrastructure for offloading flower based TC filters")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>