Adds basic XDP support i.e attaching a BPF program to an
interface. Also takes care of allocating separate Tx queues
for XDP path and for network stack packet transmission.
This patch doesn't support handling of any of the XDP actions,
all are treated as XDP_PASS i.e packets will be handed over to
the network stack.
Changes also involve allocating one receive buffer per page in XDP
mode and multiple in normal mode i.e when no BPF program is attached.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds support for page recycling for allocating receive buffers
to reduce cost of refilling RBDR ring. Also got rid of using
compound pages when pagesize is 4K, only order-0 pages now.
Only page is recycled, DMA mappings still needs to be done for
every receive buffer allocated due to following constraints
- Cannot have just one receive buffer per 64KB page.
- There is just one buffer ring shared across 8 Rx queues, so
buffers of same page can go to any Rx queue.
- HW gives buffer address where packet has been DMA'ed and not
the index into buffer ring.
This makes it not possible to resue DMA mapping info. So unfortunately
have to go through costly mapping route for every buffer.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ACPI support has been added to ARM IOMMU driver in 4.10 kernel
and that has resulted in VNIC interfaces throwing translation
faults when kernel is booted with ACPI as driver was not using
DMA API. This patch fixes the issue by using DMA API which inturn
will create translation tables when IOMMU is enabled.
Also VNIC doesn't have a seperate receive buffer ring per receive
queue, so there is no 1:1 descriptor index matching between CQE_RX
and the index in buffer ring from where a buffer has been used for
DMA'ing. Unlike other NICs, here it's not possible to maintain dma
address to virt address mappings within the driver. This leaves us
no other choice but to use IOMMU's IOVA address conversion API to
get buffer's virtual address which can be given to network stack
for processing.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds support to set Rx/Tx queue sizes from ethtool. Fixes
an issue with retrieving queue size. Also sets SQ's CQ_LIMIT
based on configured Tx queue size such that HW doesn't process
SQEs when there is no sufficient space in CQ.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Transmit queue timeout issue is seen in two cases
- Due to a race condition btw setting stop_queue at xmit()
and checking for stopped_queue in NAPI poll routine, at times
transmission from a SQ comes to a halt. This is fixed
by using barriers and also added a check for SQ free descriptors,
incase SQ is stopped and there are only CQE_RX i.e no CQE_TX.
- Contrary to an assumption, a HW errata where HW doesn't stop transmission
even though there are not enough CQEs available for a CQE_TX is
not fixed in T88 pass 2.x. This results in a Qset error with
'CQ_WR_FULL' stalling transmission. This is fixed by adjusting
RXQ's RED levels for CQ level such that there is always enough
space left for CQE_TXs.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables moving average calculation of Rx pkt's resources
and configures RED and backpressure levels for both CQ and RBDR.
Also initialize SQ's CQ_LIMIT properly.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes multiple issues
1. Convert all driver statistics to percpu counters for accuracy.
2. To avoid multiple CQEs posted by a TSO packet appended to HW,
TSO pkt's SQE has 'post_cqe' not set but a dummy SQE is added
for getting HW transmit completion notification. This dummy
SQE has 'dont_send' set and HW drops the pkt pointed to in this
thus Tx drop counter increases. This patch fixes this by subtracting
SW tx tso counter from HW Tx drop counter for actual packet drop counter.
3. Reset all individual queue's and VNIC HW stats when interface is going down.
4. Getrid off unnecessary counters in hot path.
5. Bringout all CQE error stats i.e both Rx and Tx.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Programming LMAC credits taking 9K frame size by default is incorrect
as for an interface which is one of the many on the same BGX/QLM
no of credits available will be less as Tx FIFO will be divided
across all interfaces. So let's say a BGX with 40G interface and another
BGX with multiple 10G, bandwidth of 10G interfaces will be effected when
traffic is running on both 40G and 10G interfaces simultaneously.
This patch fixes this issue by programming credits based on netdev's MTU.
Also fixed configuring MTU to HW and added CQE counter for pkts which
exceed this value.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
81xx has only 4 CPUs, so it doesn't make sense to initialize
entire Qset i.e 8 queues by default. Made changes to queue
initialization to init queues equal to number of CPUs or
8 queues whichever is lesser. Also this will be applicable to
VMs with VNIC VF attached and having less VCPUs
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Counting rx packets for every CQE_RX in CQ irq handler is incorrect.
Synchronization is missing when multiple queues are receiving packets
simultaneously. Like transmit packet stats use HW stats here.
Also removed unused 'cqe_type' parameter in nicvf_rcv_pkt_handler().
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This feature is introduced in pass-2 chip and with this CQ interrupt
coalescing will work based on both timer and count.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we have moved on to using allocated pages to carve receive
buffers instead of netdev_alloc_skb() there is no need to store
any pointers for later retrieval. Earlier we had to store
skb and skb->data pointers which later are used to handover
received packet to network stack.
This will avoid an unnecessary cache miss as well.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Properly set CQ timer threshold and also set it to 2us.
With previous incorrect settings it was set to 0.5us which is too less.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rework interrupt handler to avoid checking IRQ affinity of
CQ interrupts. Now separate handlers are registered for each IRQ
including RBDR. Register interrupt handlers for only those
which are being used. Add nicvf_dump_intr_status() and use it
in irq handlers.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch configures HW to strip 802.1Q header if found in a
receiving packet. The stripped VLAN ID and TCI information is
passed on to software via CQE_RX. Also sets netdev's 'vlan_features'
so that other HW offload features can be used for tagged packets.
This offload feature can be enabled or disabled via ethtool.
Network stack normally ignores RPS for 802.1Q packets and hence low
throughput. With this offload enabled throughput for tagged packets
will be almost same as normal packets.
Note: This patch doesn't enable HW VLAN insertion for transmit packets.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added ethtool support to dump receive packet error statistics reported
in CQE. Also made some small fixes
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With earlier configured value sufficient number of CQEs are not
being reserved for transmitted packets. Hence under heavy incoming
traffic load, receive notifications will take away most of the CQ
thus transmit notifications will be lost resulting in tx skbs not
being freed.
Finally SQ will be full and it will be stopped, watchdog timer
will kick in. After this fix receive notifications will not take
morethan half of CQ reserving the rest for transmit notifications.
Also changed CQ & SQ sizes from 16k to 4k.
This is also due to the receive notifications taking first half of
CQ under heavy load and time taken by NAPI to clear transmit notifications
will increase with higher queue sizes. Again results in SQ being stopped.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for the Cavium ThunderX network controller.
The driver is on the pci bus and thus requires the Thunder PCIe host
controller driver to be enabled.
Signed-off-by: Maciej Czekaj <mjc@semihalf.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Kamil Rytarowski <kamil@semihalf.com>
Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@caviumnetworks.com>
Signed-off-by: Sruthi Vangala <svangala@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>