Commit Graph

261 Commits

Author SHA1 Message Date
David S. Miller
6f6e434aa2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
S390 bpf_jit.S is removed in net-next and had changes in 'net',
since that code isn't used any more take the removal.

TLS data structures split the TX and RX components in 'net-next',
put the new struct members from the bug fix in 'net' into the RX
part.

The 'net-next' tree had some reworking of how the ERSPAN code works in
the GRE tunneling code, overlapping with a one-line headroom
calculation fix in 'net'.

Overlapping changes in __sock_map_ctx_update_elem(), keep the bits
that read the prog members via READ_ONCE() into local variables
before using them.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 16:01:54 -04:00
Manish Chopra
8a8633978b qede: Add build_skb() support.
This patch makes use of build_skb() throughout in driver's receieve
data path [HW gro flow and non HW gro flow]. With this, driver can
build skb directly from the page segments which are already mapped
to the hardware instead of allocating new SKB via netdev_alloc_skb()
and memcpy the data which is quite costly.

This really improves performance (keeping same or slight gain in rx
throughput) in terms of CPU utilization which is significantly reduced
[almost half] in non HW gro flow where for every incoming MTU sized
packet driver had to allocate skb, memcpy headers etc. Additionally
in that flow, it also gets rid of bunch of additional overheads
[eth_get_headlen() etc.] to split headers and data in the skb.

Tested with:
system: 2 sockets, 4 cores per socket, hyperthreading, 2x4x2=16 cores
iperf [server]: iperf -s
iperf [client]: iperf -c <server_ip> -t 500 -i 10 -P 32

HW GRO off – w/o build_skb(), throughput: 36.8 Gbits/sec

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.59    0.00   32.93    0.00    0.00   43.07    0.00    0.00   23.42

HW GRO off - with build_skb(), throughput: 36.9 Gbits/sec

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.70    0.00   31.70    0.00    0.00   25.68    0.00    0.00   41.92

HW GRO on - w/o build_skb(), throughput: 36.9 Gbits/sec

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.86    0.00   24.14    0.00    0.00    6.59    0.00    0.00   68.41

HW GRO on - with build_skb(), throughput: 37.5 Gbits/sec

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.87    0.00   23.75    0.00    0.00    6.19    0.00    0.00   69.19

Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:06:53 -04:00
Michal Kalderon
91dfd02b23 qede: Fix ref-cnt usage count
Rebooting while qedr is loaded with a VLAN interface present
results in unregister_netdevice waiting for the usage count
to become free.
The fix is that rdma devices should be removed before unregistering
the netdevice, to assure all references to ndev are decreased.

Fixes: cee9fbd8e2 ("qede: Add qedr framework")
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-13 20:26:24 -04:00
David S. Miller
b2d6cee117 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The bpf syscall and selftests conflicts were trivial
overlapping changes.

The r8169 change involved moving the added mdelay from 'net' into a
different function.

A TLS close bug fix overlapped with the splitting of the TLS state
into separate TX and RX parts.  I just expanded the tests in the bug
fix from "ctx->conf == X" into "ctx->tx_conf == X && ctx->rx_conf
== X".

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-11 20:53:22 -04:00
Michal Kalderon
090477e4ac qede: Fix gfp flags sent to rdma event node allocation
A previous commit 4609adc271 ("qede: Fix qedr link update")
added a flow that could allocate rdma event objects from an
interrupt path (link notification). Therefore the kzalloc call
should be done with GFP_ATOMIC.

fixes: 4609adc271 ("qede: Fix qedr link update")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Sudarsana Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-10 15:22:35 -04:00
Sudarsana Reddy Kalluru
0bc5fe8572 qed*: Refactor mf_mode to consist of bits.
`mf_mode' field indicates the multi-partitioning mode the device is
configured to. This method doesn't scale very well, adding a new MF mode
requires going over all the existing conditions, and deciding whether those
are needed for the new mode or not.
The patch defines a set of bit-fields for modes which are derived according
to the mode info shared by the MFW and all the configuration would be made
according to those. To add a new mode, there would be a single place where
we'll need to go and choose which bits apply and which don't.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-07 23:46:10 -04:00
Shahed Shaikh
809c45a091 qed* : Add new TLV to request PF to update MAC in bulletin board
There may be a need for VF driver to request PF to explicitly update its
bulletin with a MAC address.
e.g. When user assigns a MAC address to VF while VF is still down,
and PF's bulletin board contains different MAC address, in this case,
when VF's interface is brought up, it gets loaded with MAC address from
bulletin board which is not desirable.

To handle this corner case, we need a new TLV to request PF to update
its bulletin board with suggested MAC.

This request will be honored only for trusted VFs.

Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-20 11:26:37 -04:00
Shahed Shaikh
7425d8220f qed* : use trust mode to allow VF to override forced MAC
As per existing behavior, when PF sets a MAC address for a VF
(also called as forced MAC), VF is not allowed to change its
MAC address afterwards.
This puts the limitation on few use cases such as bonding of VFs,
where bonding driver asks VF to change its MAC address.

This patch uses a VF trust mode to allow VF to change its MAC address
in spite PF has set a forced MAC for that VF.

Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-20 11:26:36 -04:00
David S. Miller
c0b458a946 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor conflicts in drivers/net/ethernet/mellanox/mlx5/core/en_rep.c,
we had some overlapping changes:

1) In 'net' MLX5E_PARAMS_LOG_{SQ,RQ}_SIZE -->
   MLX5E_REP_PARAMS_LOG_{SQ,RQ}_SIZE

2) In 'net-next' params->log_rq_size is renamed to be
   params->log_rq_mtu_frames.

3) In 'net-next' params->hard_mtu is added.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01 19:49:34 -04:00
Manish Chopra
58f101bf87 qede: Do not drop rx-checksum invalidated packets.
Today, driver drops received packets which are indicated as
invalid checksum by the device. Instead of dropping such packets,
pass them to the stack with CHECKSUM_NONE indication in skb.

Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29 14:32:15 -04:00
Sudarsana Reddy Kalluru
ccfa110cfe qede: Ethtool flash update support.
The patch adds ethtool callback implementation for flash update.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-29 14:29:56 -04:00
Manish Chopra
b9fc828deb qede: Fix barrier usage after tx doorbell write.
Since commit c5ad119fb6
("net: sched: pfifo_fast use skb_array") driver is exposed
to an issue where it is hitting NULL skbs while handling TX
completions. Driver uses mmiowb() to flush the writes to the
doorbell bar which is a write-combined bar, however on x86
mmiowb() does not flush the write combined buffer.

This patch fixes this problem by replacing mmiowb() with wmb()
after the write combined doorbell write so that writes are
flushed and synchronized from more than one processor.

V1->V2:
-------
This patch was marked as "superseded" in patchwork.
(Not really sure for what reason).Resending it as v2.

Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 13:11:46 -04:00
Colin Ian King
3f2176dd7f qede: fix spelling mistake: "registeration" -> "registration"
Trivial fix to spelling mistakes in DP_ERR error message text and
comments

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-21 18:09:46 -04:00
Michal Kalderon
4609adc271 qede: Fix qedr link update
Link updates were not reported to qedr correctly.
Leading to cases where a link could be down, but qedr
would see it as up.
In addition, once qede was loaded, link state would be up,
regardless of the actual link state.

Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-16 12:15:24 -04:00
David S. Miller
7f0b800048 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-01-07

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add a start of a framework for extending struct xdp_buff without
   having the overhead of populating every data at runtime. Idea
   is to have a new per-queue struct xdp_rxq_info that holds read
   mostly data (currently that is, queue number and a pointer to
   the corresponding netdev) which is set up during rxqueue config
   time. When a XDP program is invoked, struct xdp_buff holds a
   pointer to struct xdp_rxq_info that the BPF program can then
   walk. The user facing BPF program that uses struct xdp_md for
   context can use these members directly, and the verifier rewrites
   context access transparently by walking the xdp_rxq_info and
   net_device pointers to load the data, from Jesper.

2) Redo the reporting of offload device information to user space
   such that it works in combination with network namespaces. The
   latter is reported through a device/inode tuple as similarly
   done in other subsystems as well (e.g. perf) in order to identify
   the namespace. For this to work, ns_get_path() has been generalized
   such that the namespace can be retrieved not only from a specific
   task (perf case), but also from a callback where we deduce the
   netns (ns_common) from a netdevice. bpftool support using the new
   uapi info and extensive test cases for test_offload.py in BPF
   selftests have been added as well, from Jakub.

3) Add two bpftool improvements: i) properly report the bpftool
   version such that it corresponds to the version from the kernel
   source tree. So pick the right linux/version.h from the source
   tree instead of the installed one. ii) fix bpftool and also
   bpf_jit_disasm build with bintutils >= 2.9. The reason for the
   build breakage is that binutils library changed the function
   signature to select the disassembler. Given this is needed in
   multiple tools, add a proper feature detection to the
   tools/build/features infrastructure, from Roman.

4) Implement the BPF syscall command BPF_MAP_GET_NEXT_KEY for the
   stacktrace map. It is currently unimplemented, but there are
   use cases where user space needs to walk all stacktrace map
   entries e.g. for dumping or deleting map entries w/o having to
   close and recreate the map. Add BPF selftests along with it,
   from Yonghong.

5) Few follow-up cleanups for the bpftool cgroup code: i) rename
   the cgroup 'list' command into 'show' as we have it for other
   subcommands as well, ii) then alias the 'show' command such that
   'list' is accepted which is also common practice in iproute2,
   and iii) remove couple of newlines from error messages using
   p_err(), from Jakub.

6) Two follow-up cleanups to sockmap code: i) remove the unused
   bpf_compute_data_end_sk_skb() function and ii) only build the
   sockmap infrastructure when CONFIG_INET is enabled since it's
   only aware of TCP sockets at this time, from John.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-07 21:26:31 -05:00
Jesper Dangaard Brouer
c0124f327e xdp/qede: setup xdp_rxq_info and intro xdp_rxq_info_is_reg
The driver code qede_free_fp_array() depend on kfree() can be called
with a NULL pointer. This stems from the qede_alloc_fp_array()
function which either (kz)alloc memory for fp->txq or fp->rxq.
This also simplifies error handling code in case of memory allocation
failures, but xdp_rxq_info_unreg need to know the difference.

Introduce xdp_rxq_info_is_reg() to handle if a memory allocation fails
and detect this is the failure path by seeing that xdp_rxq_info was
not registred yet, which first happens after successful alloaction in
qede_init_fp().

Driver hook points for xdp_rxq_info:
 * reg  : qede_init_fp
 * unreg: qede_free_fp_array

Tested on actual hardware with samples/bpf program.

V2: Driver have no proper error path for failed XDP RX-queue info reg, as
qede_init_fp() is a void function.

Cc: everest-linux-l2@cavium.com
Cc: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-05 15:21:21 -08:00
Tomer Tayar
41e87c91f4 qed*: Advance drivers' version to 8.33.0.20
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Chad Dupuis <Chad.Dupuis@cavium.com>
Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-02 13:59:16 -05:00
Tomer Tayar
da09091732 qed*: Utilize FW 8.33.1.0
Advance the qed* drivers to use firmware 8.33.1.0:
Modify core driver (qed) to utilize the new FW and initialize the device
with it. This is the lion's share of the patch, and includes changes to FW
interface files, device initialization flows, FW interaction flows, and
debug collection flows.
Modify Ethernet driver (qede) to make use of new FW in fastpath.
Modify RoCE/iWARP driver (qedr) to make use of new FW in fastpath.
Modify FCoE driver (qedf) to make use of new FW in fastpath.
Modify iSCSI driver (qedi) to make use of new FW in fastpath.

Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Bason <Yuval.Bason@cavium.com>
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Manish Chopra <Manish.Chopra@cavium.com>
Signed-off-by: Chad Dupuis <Chad.Dupuis@cavium.com>
Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-02 13:59:16 -05:00
Tomer Tayar
21dd79e82f qed*: HSI renaming for different types of HW
This patch renames defines and structures in the FW HSI files to allow a
distinction between different types of HW.

Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Chad Dupuis <Chad.Dupuis@cavium.com>
Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-02 13:59:15 -05:00
Michael Chan
18c602dee4 qede: Use NETIF_F_GRO_HW.
Advertise NETIF_F_GRO_HW and set edev->gro_disable according to the
feature flag.  Add qede_fix_features() to drop NETIF_F_GRO_HW if
XDP is running or MTU does not support GRO_HW or GRO is not set.
qede_change_mtu() also checks and disables GRO_HW if MTU is not
supported.

Cc: Ariel Elior <Ariel.Elior@cavium.com>
Cc: everest-linux-l2@cavium.com
Acked-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-19 10:38:37 -05:00
Jakub Kicinski
bd0b2e7fe6 net: xdp: make the stack take care of the tear down
Since day one of XDP drivers had to remember to free the program
on the remove path.  This leads to code duplication and is error
prone.  Make the stack query the installed programs on unregister
and if something is installed, remove the program.  Freeing of
program attached to XDP generic is moved from free_netdev() as well.

Because the remove will now be called before notifiers are
invoked, BPF offload state of the program will not get destroyed
before uninstall.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-03 00:27:57 +01:00
Jakub Kicinski
f4e63525ee net: bpf: rename ndo_xdp to ndo_bpf
ndo_xdp is a control path callback for setting up XDP in the
driver.  We can reuse it for other forms of communication
between the eBPF stack and the drivers.  Rename the callback
and associated structures and definitions.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05 22:26:18 +09:00
Daniel Borkmann
de8f3a83b0 bpf: add meta pointer for direct access
This work enables generic transfer of metadata from XDP into skb. The
basic idea is that we can make use of the fact that the resulting skb
must be linear and already comes with a larger headroom for supporting
bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work
on a similar principle and introduce a small helper bpf_xdp_adjust_meta()
for adjusting a new pointer called xdp->data_meta. Thus, the packet has
a flexible and programmable room for meta data, followed by the actual
packet data. struct xdp_buff is therefore laid out that we first point
to data_hard_start, then data_meta directly prepended to data followed
by data_end marking the end of packet. bpf_xdp_adjust_head() takes into
account whether we have meta data already prepended and if so, memmove()s
this along with the given offset provided there's enough room.

xdp->data_meta is optional and programs are not required to use it. The
rationale is that when we process the packet in XDP (e.g. as DoS filter),
we can push further meta data along with it for the XDP_PASS case, and
give the guarantee that a clsact ingress BPF program on the same device
can pick this up for further post-processing. Since we work with skb
there, we can also set skb->mark, skb->priority or other skb meta data
out of BPF, thus having this scratch space generic and programmable
allows for more flexibility than defining a direct 1:1 transfer of
potentially new XDP members into skb (it's also more efficient as we
don't need to initialize/handle each of such new members). The facility
also works together with GRO aggregation. The scratch space at the head
of the packet can be multiple of 4 byte up to 32 byte large. Drivers not
yet supporting xdp->data_meta can simply be set up with xdp->data_meta
as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out,
such that the subsequent match against xdp->data for later access is
guaranteed to fail.

The verifier treats xdp->data_meta/xdp->data the same way as we treat
xdp->data/xdp->data_end pointer comparisons. The requirement for doing
the compare against xdp->data is that it hasn't been modified from it's
original address we got from ctx access. It may have a range marking
already from prior successful xdp->data/xdp->data_end pointer comparisons
though.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
Rahul Verma
bf5a94bfe2 qed: Read per queue coalesce from hardware
Retrieve the actual coalesce value from hardware for every Rx/Tx
queue, instead of Rx/Tx coalesce value cached during set coalesce.

Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27 00:05:22 -07:00
Rahul Verma
477f2d1460 qed: Add support for vf coalesce configuration.
This patch add the ethtool support to set RX/Tx coalesce
value to the VF associated Rx/Tx queues.

Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27 00:05:22 -07:00
Sudarsana Reddy Kalluru
c3dc48f786 qede: Add ethtool support for Energy efficient ethernet.
The patch adds ethtool callback implementations for querying/configuring
the Energy Efficient Ethernet (EEE) parameters.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27 00:05:22 -07:00
Chopra, Manish
3f2a2b8b7a qed/qede: Add setter APIs support for RX flow classification
This patch adds support for adding and deleting rx flow
classification rules. Using this user can classify RX flow
constituting of TCP/UDP 4-tuples [src_ip/dst_ip and src_port/dst_port]
to be steered on a given RX queue

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27 00:05:22 -07:00
Chopra, Manish
ec9b8dbd82 qede: Add getter APIs support for RX flow classification
This patch adds support for ethtool getter APIs to query
RX flow classification rules.

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27 00:05:22 -07:00
Michal Kalderon
bbfcd1e8e1 qed*: Set rdma generic functions prefix
Rename the functions common to both iWARP and RoCE to have a prefix of
_rdma_ instead of _roce_.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 12:34:08 -04:00
Michal Kalderon
b262a06e64 qed*: qede_roce.[ch] -> qede_rdma.[ch]
Once we have iWARP support, the qede portion of the qedr<->qede would
serve all the RDMA protocols - so rename the file to be appropriate
to its function.

While we're at it, we're also moving a couple of inclusions to it into
.h files and adding includes to make sure it contains all type
definitions it requires.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 12:34:07 -04:00
Mintz, Yuval
1a4a69751f qed: Chain support for external PBL
iWARP would require the chains to allocate/free their PBL memory
independently, so add the infrastructure to provide it externally.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 12:34:06 -04:00
Martin KaFai Lau
22e0d75f43 bpf: qede: Report bpf_prog ID during XDP_QUERY_PROG
Add support to qede to report bpf_prog ID during XDP_QUERY_PROG.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Mintz Yuval <Yuval.Mintz@cavium.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16 11:58:37 -04:00
Johannes Berg
59ae1d127a networking: introduce and use skb_put_data()
A common pattern with skb_put() is to just want to memcpy()
some data into the new space, introduce skb_put_data() for
this.

An spatch similar to the one for skb_put_zero() converts many
of the places using it:

    @@
    identifier p, p2;
    expression len, skb, data;
    type t, t2;
    @@
    (
    -p = skb_put(skb, len);
    +p = skb_put_data(skb, data, len);
    |
    -p = (t)skb_put(skb, len);
    +p = skb_put_data(skb, data, len);
    )
    (
    p2 = (t2)p;
    -memcpy(p2, data, len);
    |
    -memcpy(p, data, len);
    )

    @@
    type t, t2;
    identifier p, p2;
    expression skb, data;
    @@
    t *p;
    ...
    (
    -p = skb_put(skb, sizeof(t));
    +p = skb_put_data(skb, data, sizeof(t));
    |
    -p = (t *)skb_put(skb, sizeof(t));
    +p = skb_put_data(skb, data, sizeof(t));
    )
    (
    p2 = (t2)p;
    -memcpy(p2, data, sizeof(*p));
    |
    -memcpy(p, data, sizeof(*p));
    )

    @@
    expression skb, len, data;
    @@
    -memcpy(skb_put(skb, len), data, len);
    +skb_put_data(skb, data, len);

(again, manually post-processed to retain some comments)

Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16 11:48:37 -04:00
Mintz, Yuval
e7b80dece8 qede: VF XDP support
This introduces 2 changes needed for XDP to be supported for VFs:

 a. On VF-side, publish the NDO based on qed outputs

 b. On PF-side, request qed to allocate sufficient cids per-VF
    to allow the child vfs to support it

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 23:08:31 -04:00
Mintz, Yuval
f604b17d7f qed*: L2 interface to use the SB structures directly
Part of an effort of a cleaner seperation between qed and the protocol
drivers, the L2 interface is to use the SB structure for initialization
purposes opaquely.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04 23:08:30 -04:00
Mintz, Yuval
712c3cbf19 qed: Replace set_id() api with set_name()
Current API between qed and protocol modules allows passing an
additional private string - but it doesn't get utilized by qed
anywhere.

Clarify the API by removing it and renaming it 'set_name'.

CC: Manish Rangankar <Manish.Rangankar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-24 15:17:19 -04:00
Mintz, Yuval
6bc9f234ff qede: Log probe of PCI device
Replace meaningless logged print ('Ending successfully qede probe')
with a single-liner containing interesting information about probed
device.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-24 15:17:19 -04:00
Manish Chopra
48848a0690 qede: Fix sparse warnings
Solves the following warning in qede -
 - Several cases of missing cpu_to_le16() conversions
 - Adds 'static' to one function declaration
 - Removes dcbnl operation that's currently getting populated twice

Signed-off-by: Manish Chopra <Manish.Chopra@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-24 15:17:19 -04:00
Miroslav Lichvar
e341257548 net: ethernet: update drivers to handle HWTSTAMP_FILTER_NTP_ALL
Include HWTSTAMP_FILTER_NTP_ALL in net_hwtstamp_validate() as a valid
filter and update drivers which can timestamp all packets, or which
explicitly list unsupported filters instead of using a default case, to
handle the filter.

CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-21 13:37:32 -04:00
Sudarsana Reddy Kalluru
9ac4c546ef qede: Support 1G advertisment.
Some variants of adapters support the 1G speed capability. Need to
allow the configuration of 1G speed if adapter supports it.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-21 12:56:53 -04:00
Mintz, Yuval
2e7022d64e qede: qedr closure after setting state
This is benign, but it makes more sense to start the close sequence
only after changing the internal state [in case it would once care].

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-21 12:56:53 -04:00
Mintz, Yuval
492a1d9811 qede: Don't use an internal MAC field
Driver maintains its primary MAC in a private field which
gets updated when ndo_dev_set_mac() gets called.

However, there are flows where the primary MAC of the device can change
without said NDO being called [bond device in TLB mode configuring
slaves' addresses], resulting in a configuration where there's a mismatch
between what's apparent to user [the netdevice's value] and what's
configured in the HW [the private value].

As we don't have any real motivation of maintaining this
private field, simply remove it and start using the netdevice's
field instead.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-21 12:56:53 -04:00
Sudarsana Reddy Kalluru
71851ea5fa qede: Add missing Status-block free
When destroying the datapath channels, qede doesn't notify qed of the
released status blocks which were acquired during the initialization.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-21 12:56:53 -04:00
Sudarsana Reddy Kalluru
5a052d62ab qede: Honor user request for Tx buffers
Driver always allocates the maximal number of tx-buffers irrespective of
actual Tx ring config.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-21 12:56:53 -04:00
Mintz, Yuval
ba798b5b6d qede: Allow WoL to activate by default
When management firmware declares that the device is WoL-capable,
the default driver behavior would be to allow the management firmware
to take the decision of whether it's actually needed or not.

Problem is ethtool interface doesn't have a 'default' kind
of option, and user would see the interface WoL as disabled,
which doesn't accurately reflect the actual configuration.
More-so, if the user actually wants to explicitly disable WoL he'd have
to first enable it [otherwise ethtool would block the command].

Instead of allowing management to make the decision, enable WoL by
default on all devices capable of it.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-21 12:56:53 -04:00
Mintz, Yuval
be47c55557 qede: Split PF/VF ndos.
PFs and VFs share the same structure of NDOs today,
and the VFs explicitly fails the ndo_xdp() callback stating
it doesn't support XDP.

This results in lots of:

  [qede_xdp:1032(enp131s2)]VFs don't support XDP
  ------------[ cut here ]------------
  WARNING: CPU: 4 PID: 1426 at net/core/rtnetlink.c:1637 rtnl_dump_ifinfo+0x354/0x3c0
  ...
  Call Trace:
    ? __alloc_skb+0x9b/0x1d0
    netlink_dump+0x122/0x290
    netlink_recvmsg+0x27d/0x430
    sock_recvmsg+0x3d/0x50
  ...

As every dump request for the VF interface info would fail due to
rtnl_xdp_fill() returning an error code.

To resolve this, introduce a subset of the NDOs meant for the VF
in a seperate structure and register that one instead for VFs,
and omit the ndo_xdp initialization.

Fixes: 40b8c45492 ("qede: Prevent VFs from using XDP")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-09 11:24:22 -04:00
Suddarsana Reddy Kalluru
92c43eb416 qede: Fix XDP memory leak on unload
When (re|un)loading, Tx-queues belonging to XDP would not get freed.

Fixes: cb6aeb0792 ("qede: Add support for XDP_TX")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-09 11:24:22 -04:00
sudarsana.kalluru@cavium.com
161adb046b qede: Fix possible misconfiguration of advertised autoneg value.
Fail the configuration of advertised speed-autoneg value if the config
update is not supported.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-04 12:31:03 -04:00
sudarsana.kalluru@cavium.com
f870a3c672 qed*: Fix possible overflow for status block id field.
Value for status block id could be more than 256 in 100G mode, need to
update its data type from u8 to u16.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-04 12:31:02 -04:00
sudarsana.kalluru@cavium.com
8d3f87d8cd qed*: Fix issues in the ptp filter config implementation.
PTP hardware filter configuration performed by the driver for a given
user requested config is not correct for some of the PTP modes.
Following changes are needed for PTP config-filter implementation.
 1. NIG_REG_TX_PTP_EN register - Bits 0/1/2 respectively enables
    TimeSync/"V1 frame format support"/"V2 frame format support" on
    the TX side. Set the associated bits based on the user request.
 2. ptp4l application fails to operate in Peer Delay mode. Following
    changes are needed to fix this,
    a. Driver should enable (set to 0) DA #1-related bits for IPv4,
       IPv6 and MAC destination addresses in these registers:
         NIG_REG_TX_LLH_PTP_RULE_MASK
         NIG_REG_LLH_PTP_RULE_MASK
    b. NIG_REG_LLH_PTP_PARAM_MASK/NIG_REG_TX_LLH_PTP_PARAM_MASK should
       be set to 0x0 in all modes.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-02 15:33:01 -04:00
sudarsana.kalluru@cavium.com
461eec1201 qede: Fix concurrency issue in PTP Tx path processing.
PTP Tx timestamping data structures are not protected against the
concurrent access in the Tx paths. Protecting the same using atomic
bit locks.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-02 15:33:01 -04:00
sudarsana.kalluru@cavium.com
035744975a qede: Add support for PTP resource locking.
The patch adds necessary changes to the driver to use qed resource
locking functionality. Currently the ptp initialization is spread
between driver probe/open implementations, associated APIs are
qede_ptp_register_phc()/qede_ptp_start(). Clubbed this functionality
into single API qed_ptp_enable() to simplify the usage of qed resource
locking implementation. The new API will be invoked in the probe path.
Similarly the ptp clean-up code is moved to qede_ptp_disable() which
gets invoked in the driver unload path.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27 16:51:34 -04:00
Chopra, Manish
97379f15c2 qed/qede: Add UDP ports in bulletin board
This patch adds support for UDP ports in bulletin board
to notify UDP ports change to the VFs

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-25 11:49:31 -04:00
Chopra, Manish
327a2b750c qede: Configure UDP ports in local context.
This patch configures UDP ports locally instead of
configuring them in deferred context which would be
helpful in synchronizing UDP ports configuration for VFs
which will be enabled in further patches.

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-25 11:49:31 -04:00
Chopra, Manish
369bfd4ec7 qede: Disable tunnel offloads for non offloaded UDP ports
This patch disables tunnel offloads via ndo_features_check()
if given UDP port is not offloaded to hardware. This in turn
allows to run multiple tunnel interfaces using different UDP ports.

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-25 11:49:31 -04:00
Chopra, Manish
19489c7f0d qed/qede: Enable tunnel offloads based on hw configuration
This patch enables tunnel feature offloads based on hw configuration
at initialization time instead of enabling them always.

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-25 11:49:31 -04:00
sudarsana.kalluru@cavium.com
05a79f925d qed: Support dcbnl IEEE selector field.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-24 12:19:55 -04:00
Dan Carpenter
f6ca26f268 qede: allocate enough data for ->arfs_fltr_bmap
We've got the number of longs, yes, but we should multiply by
sizeof(long) to get the number of bytes needed.

Fixes: e4917d46a6 ("qede: Add aRFS support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 16:23:23 -04:00
Chopra, Manish
e4917d46a6 qede: Add aRFS support
This patch adds support for aRFS for TCP and UDP
protocols with IPv4/IPv6.

Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 13:06:18 -04:00
Mintz, Yuval
059eeb07e1 qede: Support XDP adjustment of headers
In case an XDP program is attached, reserve XDP_PACKET_HEADROOM
bytes at the beginning of the packet for the program to play
with.

Modify the XDP logic in the driver to fill-in the missing bits
and re-calculate offsets and length after the program has finished
running to properly reflect the current status of the packet.

We can then go and remove the limitation of not supporting XDP programs
where xdp_adjust_head is set.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-07 06:26:14 -07:00
Mintz, Yuval
15ed8a47ff qede: Add support for ingress headroom
Driver currently doesn't support any headroom; The only 'available'
space it has in the head of the buffer is due to the placement
offset.
In order to allow [later] support of XDP adjustment of headroom,
modify the the ingress flow to properly handle a scenario where
the packets would have such.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-07 06:26:14 -07:00
Mintz, Yuval
40b8c45492 qede: Prevent VFs from using XDP
Current implementation of VFs is very tight in regard to queue
resources. VFs support for XDP would require quite a bit of additional
infrastructure in qede and qed [sharing of queue-zones between queues,
more VF cids, mapping of the doorbell bar, etc.].

For now, prevent XDP programs from being attached to VFs.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-07 06:26:14 -07:00
Mintz, Yuval
89e1afc447 qede: Correct XDP forward unmapping
Driver is currently using dma_unmap_single() with the address it
passed to device for the purpose of forwarding, but the XDP
transmission buffer was originally a page allocated for the rx-queue.
The mapped address is likely to differ from the original mapped
address due to the placement offset.

This difference is going to get even bigger once we support headroom.

Cache the original mapped address of the page, and use it for unmapping
of the buffer when completion arrives for the XDP forwarded packet.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-07 06:26:14 -07:00
Mintz, Yuval
10a0176e4e qede: Update receive statistic once per NAPI
Currently, each time an ingress packet is passed to networking stack
the driver increments a per-queue SW statistic.
As we want to have additional fields in the first cache-line of the
Rx-queue struct, change flow so this statistic would be updated once per
NAPI run. We will later push the statistic to a different cache line.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-07 06:26:14 -07:00
Mintz, Yuval
9c79ddaa0f qed*: Add support for QL41xxx adapters
This adds the necessary infrastructure changes for initializing
and working with the new series of QL41xxx adapaters.

It also adds 2 new PCI device-IDs to qede:
  - 0x8070 for QL41xxx PFs
  - 0x8090 for VFs spawning from QL41xxx PFs

Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-14 11:43:30 -07:00
Mintz, Yuval
be086e7c53 qed*: Utilize Firmware 8.15.3.0
This patch advances the qed* drivers into using the newer firmware -
This solves several firmware bugs, mostly related [but not limited to]
various init/deinit issues in various offloaded protocols.

It also introduces a major 4-Cached SGE change in firmware, which can be
seen in the storage drivers' changes.

In addition, this firmware is required for supporting the new QL41xxx
series of adapters; While this patch doesn't add the actual support,
the firmware contains the necessary initialization & firmware logic to
operate such adapters [actual support would be added later on].

Changes from Previous versions:
-------------------------------
 - V2 - fix kbuild-test robot warnings

Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com>
Signed-off-by: Chad Dupuis <Chad.Dupuis@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-13 15:33:09 -07:00
Mintz, Yuval
65ed2ffd64 qed*: Fix link indication race
Driver changes the link properties via communication with
the management firmware, and re-reads the resulting link status
when it receives an indication that the link has changed.
However, there are certain scenarios where such indications
might be missing, and so driver also re-reads the current link
results without attention in several places. Specifically, it
does so during load and when resetting the link.

This creates a race where driver might reflect incorrect
link status - e.g., when explicit reading of the link status is
switched by attention with the changed configuration.

Correct this flow by a lock syncronizing the handling of the
link indications [both explicit requests and attention].

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-20 17:11:54 -05:00
Sudarsana Reddy Kalluru
afe981d664 qede: Prevent index problems in loopback test
Driver currently utilizes the same loop variable in two
nested loops.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-20 17:11:54 -05:00
Mintz, Yuval
885185dfc4 qede: Free netdevice only after stoping slowpath
qed needs to be informed of the removal of the qede interface
prior to its actual removal, as qede has some registered callbacks
that might get called async to the removal flow.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-20 17:11:54 -05:00
Mintz, Yuval
0e0b80a9a7 qede: Initialize lock and slowpath workqueue early
Need to make sure the slowpath workqueue and the qede lock
are ready for the registration of the netdevice, as once
registered there's no guarantee those wouldn't be used.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-20 17:11:54 -05:00
Sudarsana Reddy Kalluru
4c55215c05 qede: Add driver support for PTP
This patch adds the driver support for,
  - Registering the ptp clock functionality with the OS.
  - Timestamping the Rx/Tx PTP packets.
  - Ethtool callbacks related to PTP.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15 12:42:53 -05:00
Eric Dumazet
6ad20165d3 drivers: net: generalize napi_complete_done()
napi_complete_done() allows to opt-in for gro_flush_timeout,
added back in linux-3.19, commit 3b47d30396
("net: gro: add a per device gro flush timer")

This allows for more efficient GRO aggregation without
sacrifying latencies.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-30 15:10:42 -05:00
Daniel Borkmann
a67edbf4fb bpf: add initial bpf tracepoints
This work adds a number of tracepoints to paths that are either
considered slow-path or exception-like states, where monitoring or
inspecting them would be desirable.

For bpf(2) syscall, tracepoints have been placed for main commands
when they succeed. In XDP case, tracepoint is for exceptions, that
is, f.e. on abnormal BPF program exit such as unknown or XDP_ABORTED
return code, or when error occurs during XDP_TX action and the packet
could not be forwarded.

Both have been split into separate event headers, and can be further
extended. Worst case, if they unexpectedly should get into our way in
future, they can also removed [1]. Of course, these tracepoints (like
any other) can be analyzed by eBPF itself, etc. Example output:

  # ./perf record -a -e bpf:* sleep 10
  # ./perf script
  sock_example  6197 [005]   283.980322:      bpf:bpf_map_create: map type=ARRAY ufd=4 key=4 val=8 max=256 flags=0
  sock_example  6197 [005]   283.980721:       bpf:bpf_prog_load: prog=a5ea8fa30ea6849c type=SOCKET_FILTER ufd=5
  sock_example  6197 [005]   283.988423:   bpf:bpf_prog_get_type: prog=a5ea8fa30ea6849c type=SOCKET_FILTER
  sock_example  6197 [005]   283.988443: bpf:bpf_map_lookup_elem: map type=ARRAY ufd=4 key=[06 00 00 00] val=[00 00 00 00 00 00 00 00]
  [...]
  sock_example  6197 [005]   288.990868: bpf:bpf_map_lookup_elem: map type=ARRAY ufd=4 key=[01 00 00 00] val=[14 00 00 00 00 00 00 00]
       swapper     0 [005]   289.338243:    bpf:bpf_prog_put_rcu: prog=a5ea8fa30ea6849c type=SOCKET_FILTER

  [1] https://lwn.net/Articles/705270/

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25 13:17:47 -05:00
stephen hemminger
bc1f44709c net: make ndo_get_stats64 a void function
The network device operation for reading statistics is only called
in one place, and it ignores the return value. Having a structure
return value is potentially confusing because some future driver could
incorrectly assume that the return value was used.

Fix all drivers with ndo_get_stats64 to have a void function.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 17:51:44 -05:00
Mintz, Yuval
ce742922ac qed*: Advance driver versions to 8.10.10.20.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-01 21:02:14 -05:00
Mintz, Yuval
f990c82c38 qed*: Add support for ndo_set_vf_trust
Trusted VFs would be allowed to receive promiscuous and
multicast promiscuous data.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-01 21:02:14 -05:00
Mintz, Yuval
f29ffdb65f qed*: RSS indirection based on queue-handles
A step toward having qede agnostic to the queue configurations
in firmware/hardware - let the RSS indirections use queue handles
instead of actual queue indices.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-01 21:02:14 -05:00
Mintz, Yuval
04e0fd006a qede: Remove unnecessary datapath dereference
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-01 21:02:14 -05:00
Manish Chopra
7ca547bdb0 qede - mark SKB as encapsulated
When driver receives a recognized encapsulated packet it needs
to set the skb->encapsulation field as well.

Signed-off-by: Manish Chopra <Manish.Chopra@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-01 21:02:14 -05:00
Mintz, Yuval
e3eef7ee02 qede: Postpone reallocation until NAPI end
During Rx flow driver allocates a replacement buffer each time
it consumes an Rx buffer. Failing to do so, it would consume the
currently processed buffer and re-post it on the ring.
As a result, the Rx ring is always completely full [from driver POV].

We now allow the Rx ring to shorten by doing the re-allocations
at the end of the NAPI run. The only limitation is that we still want to
make sure each time we reallocate that we'd still have sufficient
elements in the Rx ring to guarantee that FW would be able to post
additional data and trigger an interrupt.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-01 21:02:14 -05:00
Mintz, Yuval
e1d32acbcb qed*: Change maximal number of queues
Today qede requests contexts that would suffice for 64 'whole'
combined queues [192 meant for 64 rx, tx and xdp tx queues],
but registers netdev and limits the number of queues based on
information received by qed. In turn, qed doesn't take context
into account when informing qede how many queues it can support.

This would lead to a configuration problem in case user tries
configuring >64 combined queues to interface [or >96 in case
xdp isn't enabled]. Since we don't have a mangement firware
that actually provides so many interrupt lines to a single
device we're currently safe but that's about to change soon.

The new maximum is hence changed:
  - For RoCE devices, the limit would remain 64.
  - For non-RoCE devices, the limit might be higher [depending
    on the actual configuration of the device].
qed would start enforcing that limit in both scenarios.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-01 21:02:14 -05:00
Mintz, Yuval
aed284c7f0 qede: Split filtering logic to its own file
This takes the various filtering logic of the driver and
moves them into their own dedicated file - qede_filter.c.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-01 21:02:14 -05:00
Mintz, Yuval
cdda926d40 qede: Break datapath logic into its own file
This adds a new file qede_fp.c and relocates the datapath-related
logic into it [from qede_main.c].

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-01 21:02:14 -05:00
Mintz, Yuval
e8f1cb507d qed*: Update to dual-license
Since the submission of the qedr driver, there's inconsistency
in the licensing of the various qed/qede files - some are GPLv2
and some are dual-license.
Since qedr requires dual-license and it's dependent on both,
we're updating the licensing of all qed/qede source files.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-01 21:02:14 -05:00
Linus Torvalds
4d5b57e05a Updates for 4.10 kernel merge window
- Shared mlx5 updates with net stack (will drop out on merge if Dave's
   tree has already been merged)
 - Driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe
 - Debug cleanups
 - New connection rejection helpers
 - SRP updates
 - Various misc fixes
 - New paravirt driver from vmware
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYUbAPAAoJELgmozMOVy/dMXcP/iuG5MNzfN8Ny1JftyBQGWg3
 cqoQ2OLj9CsXjwVB+5EqbcZHRZY852lKONaLoDKkIOx4YAXO2YuIKOp944vN7EQx
 96wfqzT1F5jzAcy5mYZXgLaStGFDAwejKMqeHd0LfJj3OEtemGnVPWYzyqSQmSKo
 dzJraS1Z9GIRppzU5WaRpB9PtRBkqIqGJ5vZ0EKLGhed5hYY5r0iMJB0GfriMRDO
 lJ4UUVfpsAoLPnqDBFH6IMn2V2UeAw9IR5zNa1mrM1RBfvt/uYTxrw1w3p9WoaNs
 GRodhk4DCeAfeyqzVPNBLyXZ4Zq4FzGe3UWM4qysJ1RR4oFNw9Cuw0Fqk8mrfznr
 7hv5TpGIckRZiKf8l6e+qLirF0qGtXJg29j2vPVQI9i5nSj95g1agA81PnLQlLLb
 flWyxeMj81my7lfMHN1xcV6pqPEKMCOysZmfcvVfJd2XxpjuVD7ekl/YXWp8o8kU
 YPdQMqPD626XsD8VpPdMszb9FPmx0JD0HEv+Y1rIFX8JegEI+c3H2X0dqC27T/Ou
 FEPWOy025EgHm0Fh/7eIzkG6tjZ4JHoCugJAcxNZGj2XW4eB6r5vY8UwJ8iQRv+n
 PVYHiy0UoIRePh0mrdOSSphGZMi/GO/DsqKwCtAMEK43WqZQju6wR7QSIGkh66mp
 4uSHJqpf3YEYylxGMhk3
 =QeGy
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma updates from Doug Ledford:
 "This is the complete update for the rdma stack for this release cycle.

  Most of it is typical driver and core updates, but there is the
  entirely new VMWare pvrdma driver. You may have noticed that there
  were changes in DaveM's pull request to the bnxt Ethernet driver to
  support a RoCE RDMA driver. The bnxt_re driver was tentatively set to
  be pulled in this release cycle, but it simply wasn't ready in time
  and was dropped (a few review comments still to address, and some
  multi-arch build issues like prefetch() not working across all
  arches).

  Summary:

   - shared mlx5 updates with net stack (will drop out on merge if
     Dave's tree has already been merged)

   - driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe

   - debug cleanups

   - new connection rejection helpers

   - SRP updates

   - various misc fixes

   - new paravirt driver from vmware"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (210 commits)
  IB: Add vmw_pvrdma driver
  IB/mlx4: fix improper return value
  IB/ocrdma: fix bad initialization
  infiniband: nes: return value of skb_linearize should be handled
  MAINTAINERS: Update Intel RDMA RNIC driver maintainers
  MAINTAINERS: Remove Mitesh Ahuja from emulex maintainers
  IB/core: fix unmap_sg argument
  qede: fix general protection fault may occur on probe
  IB/mthca: Replace pci_pool_alloc by pci_pool_zalloc
  mlx5, calc_sq_size(): Make a debug message more informative
  mlx5: Remove a set-but-not-used variable
  mlx5: Use { } instead of { 0 } to init struct
  IB/srp: Make writing the add_target sysfs attr interruptible
  IB/srp: Make mapping failures easier to debug
  IB/srp: Make login failures easier to debug
  IB/srp: Introduce a local variable in srp_add_one()
  IB/srp: Fix CONFIG_DYNAMIC_DEBUG=n build
  IB/multicast: Check ib_find_pkey() return value
  IPoIB: Avoid reading an uninitialized member variable
  IB/mad: Fix an array index check
  ...
2016-12-15 12:03:32 -08:00
Amrani, Ram
22b1ae6169 qede: fix general protection fault may occur on probe
The recent introduction of qedr driver support in qede causes a GPF when probing the driver in a server without a RoCE enabled QLogic NIC. This fix avoids using an uninitialized pointer in such a case. Caught by the kernel test robot.

Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14 14:16:49 -05:00
Martin KaFai Lau
17bedab272 bpf: xdp: Allow head adjustment in XDP prog
This patch allows XDP prog to extend/remove the packet
data at the head (like adding or removing header).  It is
done by adding a new XDP helper bpf_xdp_adjust_head().

It also renames bpf_helper_changes_skb_data() to
bpf_helper_changes_pkt_data() to better reflect
that XDP prog does not work on skb.

This patch adds one "xdp_adjust_head" bit to bpf_prog for the
XDP-capable driver to check if the XDP prog requires
bpf_xdp_adjust_head() support.  The driver can then decide
to error out during XDP_SETUP_PROG.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08 14:25:13 -05:00
Zhang Shengju
90caf3c9d4 qede: use reset to set network header
Since offset is zero, it's not necessary to use set function. Reset
function is straightforward, and will remove the unnecessary add
operation in set function.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-03 15:49:16 -05:00
Daniel Borkmann
366cbf2f46 bpf, xdp: drop rcu_read_lock from bpf_prog_run_xdp and move to caller
After 326fe02d1e ("net/mlx4_en: protect ring->xdp_prog with rcu_read_lock"),
the rcu_read_lock() in bpf_prog_run_xdp() is superfluous, since callers
need to hold rcu_read_lock() already to make sure BPF program doesn't
get released in the background.

Thus, drop it from bpf_prog_run_xdp(), as it can otherwise be misleading.
Still keeping the bpf_prog_run_xdp() is useful as it allows for grepping
in XDP supported drivers and to keep the typecheck on the context intact.
For mlx4, this means we don't have a double rcu_read_lock() anymore. nfp can
just make use of bpf_prog_run_xdp(), too. For qede, just move rcu_read_lock()
out of the helper. When the driver gets atomic replace support, this will
move to call-sites eventually.

mlx5 needs actual fixing as it has the same issue as described already in
326fe02d1e ("net/mlx4_en: protect ring->xdp_prog with rcu_read_lock"),
that is, we're under RCU bh at this time, BPF programs are released via
call_rcu(), and call_rcu() != call_rcu_bh(), so we need to properly mark
read side as programs can get xchg()'ed in mlx5e_xdp_set() without queue
reset.

Fixes: 86994156c7 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-02 11:06:24 -05:00
Mintz, Yuval
cb6aeb0792 qede: Add support for XDP_TX
Add support for forwarding via XDP. Once the eBPF is attached,
driver would allocate & configure a designated transmission queue
meant solely for forwarding packets. Said queue would share the
receive-queue's interrupt line, and would have it's own Tx statistics.

Infrastructure changes required for this [spread-out through the code]:
 - Determine the DMA direction of the receive buffers based on the presence
of the eBPF program.
 - Turn the sw Tx ring into a union, as regular/XDP queues have different
needs for releasing resources after completion [regular requires the SKB,
XDP requires the transmitted page].

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:05 -05:00
Mintz, Yuval
496e051709 qede: Add basic XDP support
Add support for the ndo_xdp callback. This patch would support XDP_PASS,
XDP_DROP and XDP_ABORTED commands.

This also adds a per Rx queue statistic which counts number of packets
which didn't reach the stack [due to XDP].

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:05 -05:00
Mintz, Yuval
9eb22357d5 qede: Better utilize the qede_[rt]x_queue
Improve the cacheline usage of both queues by reordering -
This reduces the cachelines required for egress datapath processing
from 3 to 2 and those required by ingress datapath processing by 2.

It also changes a couple of datapath related functions that currently
require either the fastpath or the qede_dev, changing them to be based
on the tx/rx queue instead.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:05 -05:00
Mintz, Yuval
8a47253065 qede: Don't check netdevice for rx-hash
Receive-hashing is a fixed feature, so there's no need to check
during the ingress datapath whether it's set or not.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:04 -05:00
Mintz, Yuval
3da7a37ae6 qed*: Handle-based L2-queues.
The driver needs to maintain several FW/HW-indices for each one of
its queues. Currently, that mapping is done by the QED where it uses
an rx/tx array of so-called hw-cids, populating them whenever a new
queue is opened and clearing them upon destruction of said queues.

This maintenance is far from ideal - there's no real reason why
QED needs to maintain such a data-structure. It becomes even worse
when considering the fact that the PF's queues and its child VFs' queues
are all mapped into the same data-structure.
As a by-product, the set of parameters an interface needs to supply for
queue APIs is non-trivial, and some of the variables in the API
structures have different meaning depending on their exact place
in the configuration flow.

This patch re-organizes the way L2 queues are configured and maintained.
In short:
  - Required parameters for queue init are now well-defined.
  - Qed would allocate a queue-cid based on parameters.
    Upon initialization success, it would return a handle to caller.
  - Queue-handle would be maintained by entity requesting queue-init,
    not necessarily qed.
  - All further queue-APIs [update, destroy] would use the opaque
    handle as reference for the queue instead of various indices.

The possible owners of such handles:
  - PF queues [qede] - complete handles based on provided configuration.
  - VF queues [qede] - fw-context-less handles, containing only relative
    information; Only the PF-side would need the absolute indices
    for configuration, so they're omitted here.
  - VF queues [qed, PF-side] - complete handles based on VF initialization.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:04 -05:00
Mintz, Yuval
567b3c127a qede: Revise state locking scheme
As qede utilizes an internal-reload sequence as result of various
configuration changes, the netif state wouldn't always accurately describe
the status of the configuration.
To compensate, we're storing an internal state of the device, which should
only be accessed under the qede_lock.

This patch fixes and improves several state/lock interactions:
  - The internal state should only be checked while locked.
  - While holding lock, it's preferable to check state rather than
    the netdevice's state.
  - The reload sequence is not 'atomic' - unload and subsequent load
    are not in the same critical section.

This also add the 'locked' variant for the reload, which would later be
used by XDP - useful in the case where the correct sequence is 'lock,
check state and re-configure if good', instead of allowing the reload
itself to make the decision regarding the configurability of the device.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:04 -05:00
Mintz, Yuval
f4fad34c0e qede: Refactor data-path Rx flow
Driver's NAPI poll is using a long sequence for processing ingress
packets, and it's going to get even longer once we do XDP.
Break down the main loop into a series of sub-functions to allow
better readability of the function.

While we're at it, correct the accounting of the NAPI budget -
currently we're counting only packets passed to the stack against
the budget, even in case those are actually aggregations.
After refactoring every CQE processed would be counted against the budget.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:03 -05:00
Mintz, Yuval
4dbcd64002 qede: Refactor statistics gathering
Refactor logic for gathering statistics into a per-queue function.
This improves readability of the driver statistics' flows.

In addition, this would be required by the XDP forwarding queues
[as we'll need the Txq statistics gathering methods for those as well].

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:03 -05:00
Mintz, Yuval
80439a1704 qede: Remove 'num_tc'.
Driver currently doesn't support multi-CoS, but it contains logic
where multiple transmission queues could be theoretically manipulated.
No point in maintaining the infrastructure at the moment.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:03 -05:00
Mintz, Yuval
01e23015a9 qede: Optimize aggregation information size
Driver needs to maintain a structure per-each concurrent possible
open aggregation, but the structure storing that metadata is far from
being optimized - biggest waste in it is that there are 2 buffer metadata,
one for a replacement buffer when the aggregation begins and the other for
holding the first aggregation's buffer after it begins [as firmware might
still update it]. Those 2 can safely be united into a single metadata
structure.

struct qede_agg_info changes the following:

	/* size: 120, cachelines: 2, members: 9 */
	/* sum members: 114, holes: 1, sum holes: 4 */
	/* padding: 2 */
	/* paddings: 2, sum paddings: 8 */
	/* last cacheline: 56 bytes */
 -->
	/* size: 48, cachelines: 1, members: 9 */
	/* paddings: 1, sum paddings: 4 */
	/* last cacheline: 48 bytes */

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 14:32:02 -05:00
David S. Miller
bb598c1b8c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several cases of bug fixes in 'net' overlapping other changes in
'net-next-.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-15 10:54:36 -05:00