Commit Graph

935853 Commits

Author SHA1 Message Date
Christoph Hellwig
a06d30ae7a net/atm: remove the atmdev_ops {get, set}sockopt methods
All implementations of these two methods are dummies that always
return -EINVAL.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:16:40 -07:00
Randy Dunlap
089377b7e8 net: rds: rdma_transport.h: delete duplicated word
Delete the doubled word "be" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:14:51 -07:00
Randy Dunlap
dfd5ec1ba6 net: atm: lec_arpc.h: delete duplicated word
Delete the doubled word "the" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:14:21 -07:00
Oleksij Rempel
7dce80c2a5 net: phy: at803x: add mdix configuration support for AR9331 and AR8035
This patch add MDIX configuration ability for AR9331 and AR8035. Theoretically
it should work on other Atheros PHYs, but I was able to test only this
two.

Since I have no certified reference HW able to detect or configure MDIX, this
functionality was confirmed by oscilloscope.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:07:16 -07:00
David S. Miller
ff9a8c48eb Merge branch 'net-enetc-remove-bootloader-dependency'
Michael Walle says:

====================
net: enetc: remove bootloader dependency

These patches were picked from the following series:
https://lore.kernel.org/netdev/1567779344-30965-1-git-send-email-claudiu.manoil@nxp.com/
They have never been resent. I've picked them up, addressed Andrews
comments, fixed some more bugs and asked Claudiu if I can keep their SOB
tags; he agreed. I've tested this on our board which happens to have a
bootloader which doesn't do the enetc setup in all cases. Though, only
SGMII mode was tested.

changes since v6:
 - dropped _LPA_ infix for USXGMII constants

changes since v5:
 - fixed pcs->autoneg_complete and pcs->link assignment. Thanks Vladimir.

changes since v4:
 - moved (and renamed) the USXGMII constants to include/uapi/linux/mdio.h.
   Suggested by Russell King.

changes since v3:
 - rebased to latest net-next where devm_mdiobus_free() was removed.
   replace it by mdiobus_free(). The internal MDIO bus is optional, if
   there is any error, we try to run with the bootloader default PCS
   settings, thus in the error case, we need to free the mdiobus.

changes since v2:
 - removed SOBs from "net: enetc: Initialize SerDes for SGMII and USXGMII
   protocols" because almost everything has changed.
 - get a phy_device for the internal PCS PHY so we can use the phy_
   functions instead of raw mdiobus writes
 - reuse macros already defined in fsl_mdio.h, move missing bits from
   felix to fsl_mdio.h, because they share the same PCS PHY building
   block
 - added 2500BaseX mode (based on felix init routine)
 - changed xgmii mode to usxgmii mode, because it is actually USXGMII and
   felix does the same.
 - fixed devad, which is 0x1f (MMD_VEND2)

changes since v1:
 - mdiobus id is '"imdio-%s", dev_name(dev)' because the plain dev_name()
   is used by the emdio.
 - use mdiobus_write() instead of imdio->write(imdio, ..), since this is
   already a full featured mdiobus
 - set phy_mask to ~0 to avoid scanning the bus
 - use phy_interface_mode_is_rgmii(phy_mode) to also include the RGMII
   modes with pad delays.
 - move enetc_imdio_init() to enetc_pf.c, there shouldn't be any other
   users, should it?
 - renamed serdes to SerDes
 - printing the error code of mdiobus_register() in the error path
 - call mdiobus_unregister() on _remove()
 - call devm_mdiobus_free() if mdiobus_register() fails, since an
   error is not fatal
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:05:49 -07:00
Alex Marginean
07095c025a net: enetc: Use DT protocol information to set up the ports
Use DT information rather than in-band information from bootloader to
set up MAC for XGMII. For RGMII use the DT indication in addition to
RGMII defaults in hardware.
However, this implies that PHY connection information needs to be
extracted before netdevice creation, when the ENETC Port MAC is
being configured.

Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:05:49 -07:00
Michael Walle
975d183ef0 net: enetc: Initialize SerDes for SGMII and USXGMII protocols
ENETC has ethernet MACs capable of SGMII, 2500BaseX and USXGMII. But in
order to use these protocols some SerDes configurations need to be
performed. The SerDes is configurable via an internal PCS PHY which is
connected to an internal MDIO bus at address 0.

This patch basically removes the dependency on bootloader regarding
SerDes initialization.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:05:49 -07:00
Michael Walle
16659b811a net: dsa: felix: (re)use already existing constants
Now that there are USXGMII constants available, drop the old definitions
and reuse the generic ones.

Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:05:49 -07:00
Michael Walle
c4471ad9a5 net: phy: add USXGMII link partner ability constants
The constants are taken from the USXGMII Singleport Copper Interface
specification. The naming are based on the SGMII ones, but with an MDIO_
prefix.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:05:49 -07:00
Alexei Starovoitov
e57892f50a Merge branch 'bpf-socket-lookup'
Jakub Sitnicki says:

====================
Changelog
=========
v4 -> v5:
- Enforce BPF prog return value to be SK_DROP or SK_PASS. (Andrii)
- Simplify prog runners now that only SK_DROP/PASS can be returned.
- Enable bpf_perf_event_output from the start. (Andrii)
- Drop patch
  "selftests/bpf: Rename test_sk_lookup_kern.c to test_ref_track_kern.c"
- Remove tests for narrow loads from context at an offset wider in size
  than target field, while we are discussing how to fix it:
  https://lore.kernel.org/bpf/20200710173123.427983-1-jakub@cloudflare.com/
- Rebase onto recent bpf-next (bfdfa51702)
- Other minor changes called out in per-patch changelogs,
  see patches: 2, 4, 6, 13-15
- Carried over Andrii's Acks where nothing changed.

v3 -> v4:
- Reduce BPF prog return codes to SK_DROP/SK_PASS (Lorenz)
- Default to drop on illegal return value from BPF prog (Lorenz)
- Extend bpf_sk_assign to accept NULL socket pointer.
- Switch to saner return values and add docs for new prog_array API (Andrii)
- Add support for narrow loads from BPF context fields (Yonghong)
- Fix broken build when IPv6 is compiled as a module (kernel test robot)
- Fix null/wild-ptr-deref on BPF context access
- Rebase to recent bpf-next (eef8a42d6c)
- Other minor changes called out in per-patch changelogs,
  see patches 1-2, 4, 6, 8, 10-12, 14, 16

v2 -> v3:
- Switch to link-based program attachment
- Support for multi-prog attachment
- Ability to skip reuseport socket selection
- Code on RX path is guarded by a static key
- struct in6_addr's are no longer copied into BPF prog context
- BPF prog context is initialized as late as possible
- Changes called out in patches 1-2, 4, 6, 8, 10-14, 16
- Patches dropped:
  01/17 flow_dissector: Extract attach/detach/query helpers
  03/17 inet: Store layer 4 protocol in inet_hashinfo
  08/17 udp: Store layer 4 protocol in udp_table

v1 -> v2:
- Changes called out in patches 2, 13-15, 17
- Rebase to recent bpf-next (b4563facdc)

RFCv2 -> v1:

- Switch to fetching a socket from a map and selecting a socket with
  bpf_sk_assign, instead of having a dedicated helper that does both.
- Run reuseport logic on sockets selected by BPF sk_lookup.
- Allow BPF sk_lookup to fail the lookup with no match.
- Go back to having just 2 hash table lookups in UDP.

RFCv1 -> RFCv2:

- Make socket lookup redirection map-based. BPF program now uses a
  dedicated helper and a SOCKARRAY map to select the socket to redirect to.
  A consequence of this change is that bpf_inet_lookup context is now
  read-only.
- Look for connected UDP sockets before allowing redirection from BPF.
  This makes connected UDP socket work as expected in the presence of
  inet_lookup prog.
- Share the code for BPF_PROG_{ATTACH,DETACH,QUERY} with flow_dissector,
  the only other per-netns BPF prog type.

Overview
========

This series proposes a new BPF program type named BPF_PROG_TYPE_SK_LOOKUP,
or BPF sk_lookup for short.

BPF sk_lookup program runs when transport layer is looking up a listening
socket for a new connection request (TCP), or when looking up an
unconnected socket for a packet (UDP).

This serves as a mechanism to overcome the limits of what bind() API allows
to express. Two use-cases driving this work are:

 (1) steer packets destined to an IP range, fixed port to a single socket

     192.0.2.0/24, port 80 -> NGINX socket

 (2) steer packets destined to an IP address, any port to a single socket

     198.51.100.1, any port -> L7 proxy socket

In its context, program receives information about the packet that
triggered the socket lookup. Namely IP version, L4 protocol identifier, and
address 4-tuple.

To select a socket BPF program fetches it from a map holding socket
references, like SOCKMAP or SOCKHASH, calls bpf_sk_assign(ctx, sk, ...)
helper to record the selection, and returns SK_PASS code. Transport layer
then uses the selected socket as a result of socket lookup.

Alternatively, program can also fail the lookup (SK_DROP), or let the
lookup continue as usual (SK_PASS without selecting a socket).

This lets the user match packets with listening (TCP) or receiving (UDP)
sockets freely at the last possible point on the receive path, where we
know that packets are destined for local delivery after undergoing
policing, filtering, and routing.

Program is attached to a network namespace, similar to BPF flow_dissector.
We add a new attach type, BPF_SK_LOOKUP, for this. Multiple programs can be
attached at the same time, in which case their return values are aggregated
according the rules outlined in patch #4 description.

Series structure
================

Patches are organized as so:

 1: enables multiple link-based prog attachments for bpf-netns
 2: introduces sk_lookup program type
 3-4: hook up the program to run on ipv4/tcp socket lookup
 5-6: hook up the program to run on ipv6/tcp socket lookup
 7-8: hook up the program to run on ipv4/udp socket lookup
 9-10: hook up the program to run on ipv6/udp socket lookup
 11-13: libbpf & bpftool support for sk_lookup
 14-15: verifier and selftests for sk_lookup

Patches are also available on GH:

  https://github.com/jsitnicki/linux/commits/bpf-inet-lookup-v5

Follow-up work
==============

I'll follow up with below items, which IMHO don't block the review:

- benchmark results for udp6 small packet flood scenario,
- user docs for new BPF prog type, Documentation/bpf/prog_sk_lookup.rst,
- timeout for accept() in tests after extending network_helper.[ch].

Thanks to the reviewers for their feedback to this patch series:

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Cc: Marek Majkowski <marek@cloudflare.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Yonghong Song <yhs@fb.com>

-jkbs

[RFCv1] https://lore.kernel.org/bpf/20190618130050.8344-1-jakub@cloudflare.com/
[RFCv2] https://lore.kernel.org/bpf/20190828072250.29828-1-jakub@cloudflare.com/
[v1] https://lore.kernel.org/bpf/20200511185218.1422406-18-jakub@cloudflare.com/
[v2] https://lore.kernel.org/bpf/20200506125514.1020829-1-jakub@cloudflare.com/
[v3] https://lore.kernel.org/bpf/20200702092416.11961-1-jakub@cloudflare.com/
[v4] https://lore.kernel.org/bpf/20200713174654.642628-1-jakub@cloudflare.com/
====================

Reviewed-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-07-17 20:19:32 -07:00
Jakub Sitnicki
0ab5539f85 selftests/bpf: Tests for BPF_SK_LOOKUP attach point
Add tests to test_progs that exercise:

 - attaching/detaching/querying programs to BPF_SK_LOOKUP hook,
 - redirecting socket lookup to a socket selected by BPF program,
 - failing a socket lookup on BPF program's request,
 - error scenarios for selecting a socket from BPF program,
 - accessing BPF program context,
 - attaching and running multiple BPF programs.

Run log:

  bash-5.0# ./test_progs -n 70
  #70/1 query lookup prog:OK
  #70/2 TCP IPv4 redir port:OK
  #70/3 TCP IPv4 redir addr:OK
  #70/4 TCP IPv4 redir with reuseport:OK
  #70/5 TCP IPv4 redir skip reuseport:OK
  #70/6 TCP IPv6 redir port:OK
  #70/7 TCP IPv6 redir addr:OK
  #70/8 TCP IPv4->IPv6 redir port:OK
  #70/9 TCP IPv6 redir with reuseport:OK
  #70/10 TCP IPv6 redir skip reuseport:OK
  #70/11 UDP IPv4 redir port:OK
  #70/12 UDP IPv4 redir addr:OK
  #70/13 UDP IPv4 redir with reuseport:OK
  #70/14 UDP IPv4 redir skip reuseport:OK
  #70/15 UDP IPv6 redir port:OK
  #70/16 UDP IPv6 redir addr:OK
  #70/17 UDP IPv4->IPv6 redir port:OK
  #70/18 UDP IPv6 redir and reuseport:OK
  #70/19 UDP IPv6 redir skip reuseport:OK
  #70/20 TCP IPv4 drop on lookup:OK
  #70/21 TCP IPv6 drop on lookup:OK
  #70/22 UDP IPv4 drop on lookup:OK
  #70/23 UDP IPv6 drop on lookup:OK
  #70/24 TCP IPv4 drop on reuseport:OK
  #70/25 TCP IPv6 drop on reuseport:OK
  #70/26 UDP IPv4 drop on reuseport:OK
  #70/27 TCP IPv6 drop on reuseport:OK
  #70/28 sk_assign returns EEXIST:OK
  #70/29 sk_assign honors F_REPLACE:OK
  #70/30 sk_assign accepts NULL socket:OK
  #70/31 access ctx->sk:OK
  #70/32 narrow access to ctx v4:OK
  #70/33 narrow access to ctx v6:OK
  #70/34 sk_assign rejects TCP established:OK
  #70/35 sk_assign rejects UDP connected:OK
  #70/36 multi prog - pass, pass:OK
  #70/37 multi prog - drop, drop:OK
  #70/38 multi prog - pass, drop:OK
  #70/39 multi prog - drop, pass:OK
  #70/40 multi prog - pass, redir:OK
  #70/41 multi prog - redir, pass:OK
  #70/42 multi prog - drop, redir:OK
  #70/43 multi prog - redir, drop:OK
  #70/44 multi prog - redir, redir:OK
  #70 sk_lookup:OK
  Summary: 1/44 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717103536.397595-16-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
f7726cbea4 selftests/bpf: Add verifier tests for bpf_sk_lookup context access
Exercise verifier access checks for bpf_sk_lookup context fields.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717103536.397595-15-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
93a3545d81 tools/bpftool: Add name mappings for SK_LOOKUP prog and attach type
Make bpftool show human-friendly identifiers for newly introduced program
and attach type, BPF_PROG_TYPE_SK_LOOKUP and BPF_SK_LOOKUP, respectively.

Also, add the new prog type bash-completion, man page and help message.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717103536.397595-14-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
499dd29d90 libbpf: Add support for SK_LOOKUP program type
Make libbpf aware of the newly added program type, and assign it a
section name.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-13-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
a352b32ae9 bpf: Sync linux/bpf.h to tools/
Newly added program, context type and helper is used by tests in a
subsequent patch. Synchronize the header file.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-12-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
6d4201b138 udp6: Run SK_LOOKUP BPF program on socket lookup
Same as for udp4, let BPF program override the socket lookup result, by
selecting a receiving socket of its choice or failing the lookup, if no
connected UDP socket matched packet 4-tuple.

Suggested-by: Marek Majkowski <marek@cloudflare.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-11-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
2a08748cd3 udp6: Extract helper for selecting socket from reuseport group
Prepare for calling into reuseport from __udp6_lib_lookup as well.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-10-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
72f7e9440e udp: Run SK_LOOKUP BPF program on socket lookup
Following INET/TCP socket lookup changes, modify UDP socket lookup to let
BPF program select a receiving socket before searching for a socket by
destination address and port as usual.

Lookup of connected sockets that match packet 4-tuple is unaffected by this
change. BPF program runs, and potentially overrides the lookup result, only
if a 4-tuple match was not found.

Suggested-by: Marek Majkowski <marek@cloudflare.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-9-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
7629c73a14 udp: Extract helper for selecting socket from reuseport group
Prepare for calling into reuseport from __udp4_lib_lookup as well.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-8-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
1122702f02 inet6: Run SK_LOOKUP BPF program on socket lookup
Following ipv4 stack changes, run a BPF program attached to netns before
looking up a listening socket. Program can return a listening socket to use
as result of socket lookup, fail the lookup, or take no action.

Suggested-by: Marek Majkowski <marek@cloudflare.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717103536.397595-7-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
5df6531292 inet6: Extract helper for selecting socket from reuseport group
Prepare for calling into reuseport from inet6_lookup_listener as well.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-6-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
1559b4aa1d inet: Run SK_LOOKUP BPF program on socket lookup
Run a BPF program before looking up a listening socket on the receive path.
Program selects a listening socket to yield as result of socket lookup by
calling bpf_sk_assign() helper and returning SK_PASS code. Program can
revert its decision by assigning a NULL socket with bpf_sk_assign().

Alternatively, BPF program can also fail the lookup by returning with
SK_DROP, or let the lookup continue as usual with SK_PASS on return, when
no socket has been selected with bpf_sk_assign().

This lets the user match packets with listening sockets freely at the last
possible point on the receive path, where we know that packets are destined
for local delivery after undergoing policing, filtering, and routing.

With BPF code selecting the socket, directing packets destined to an IP
range or to a port range to a single socket becomes possible.

In case multiple programs are attached, they are run in series in the order
in which they were attached. The end result is determined from return codes
of all the programs according to following rules:

 1. If any program returned SK_PASS and selected a valid socket, the socket
    is used as result of socket lookup.
 2. If more than one program returned SK_PASS and selected a socket,
    last selection takes effect.
 3. If any program returned SK_DROP, and no program returned SK_PASS and
    selected a socket, socket lookup fails with -ECONNREFUSED.
 4. If all programs returned SK_PASS and none of them selected a socket,
    socket lookup continues to htable-based lookup.

Suggested-by: Marek Majkowski <marek@cloudflare.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717103536.397595-5-jakub@cloudflare.com
2020-07-17 20:18:16 -07:00
Jakub Sitnicki
80b373f74f inet: Extract helper for selecting socket from reuseport group
Prepare for calling into reuseport from __inet_lookup_listener as well.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-4-jakub@cloudflare.com
2020-07-17 20:18:16 -07:00
Jakub Sitnicki
e9ddbb7707 bpf: Introduce SK_LOOKUP program type with a dedicated attach point
Add a new program type BPF_PROG_TYPE_SK_LOOKUP with a dedicated attach type
BPF_SK_LOOKUP. The new program kind is to be invoked by the transport layer
when looking up a listening socket for a new connection request for
connection oriented protocols, or when looking up an unconnected socket for
a packet for connection-less protocols.

When called, SK_LOOKUP BPF program can select a socket that will receive
the packet. This serves as a mechanism to overcome the limits of what
bind() API allows to express. Two use-cases driving this work are:

 (1) steer packets destined to an IP range, on fixed port to a socket

     192.0.2.0/24, port 80 -> NGINX socket

 (2) steer packets destined to an IP address, on any port to a socket

     198.51.100.1, any port -> L7 proxy socket

In its run-time context program receives information about the packet that
triggered the socket lookup. Namely IP version, L4 protocol identifier, and
address 4-tuple. Context can be further extended to include ingress
interface identifier.

To select a socket BPF program fetches it from a map holding socket
references, like SOCKMAP or SOCKHASH, and calls bpf_sk_assign(ctx, sk, ...)
helper to record the selection. Transport layer then uses the selected
socket as a result of socket lookup.

In its basic form, SK_LOOKUP acts as a filter and hence must return either
SK_PASS or SK_DROP. If the program returns with SK_PASS, transport should
look for a socket to receive the packet, or use the one selected by the
program if available, while SK_DROP informs the transport layer that the
lookup should fail.

This patch only enables the user to attach an SK_LOOKUP program to a
network namespace. Subsequent patches hook it up to run on local delivery
path in ipv4 and ipv6 stacks.

Suggested-by: Marek Majkowski <marek@cloudflare.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717103536.397595-3-jakub@cloudflare.com
2020-07-17 20:18:16 -07:00
Jakub Sitnicki
ce3aa9cc51 bpf, netns: Handle multiple link attachments
Extend the BPF netns link callbacks to rebuild (grow/shrink) or update the
prog_array at given position when link gets attached/updated/released.

This let's us lift the limit of having just one link attached for the new
attach type introduced by subsequent patch.

No functional changes intended.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-2-jakub@cloudflare.com
2020-07-17 20:18:16 -07:00
Armin Wolf
a050d82f5b ne2k-pci: Use netif_msg_init to initialize msg_enable bits
Use netif_msg_enable() to process param settings.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 19:04:06 -07:00
David S. Miller
1143fede88 Merge branch 'net-atlantic-add-support-for-FW-4-x'
Mark Starovoytov says:

====================
net: atlantic: add support for FW 4.x

This patch set adds support for FW 4.x, which is about to get into the
production for some products.
4.x is mostly compatible with 3.x, save for soft reset, which requires
the acquisition of 2 additional semaphores.
Other differences (e.g. absence of PTP support) are handled via
capabilities.

Note: 4.x targets specific products only. 3.x is still the main firmware
branch, which should be used by most users (at least for now).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 19:00:54 -07:00
Dmitry Bogdanov
0044b1e147 net: atlantic: add support for FW 4.x
This patch adds support for FW 4.x, which is about to get into the
production for some products.
4.x is mostly compatible with 3.x, save for soft reset, which requires
the acquisition of 2 additional semaphores.
Other differences (e.g. absence of PTP support) are handled via
capabilities.

Note: 4.x targets specific products only. 3.x is still the main firmware
branch, which should be used by most users (at least for now).

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 19:00:54 -07:00
Mark Starovoytov
b567edbfc8 net: atlantic: align return value of ver_match function with function name
This patch aligns the return value of hw_atl_utils_ver_match function with
its name.
Change the return type to bool, because it's better aligned with the actual
usage. Return true when the version matches, false otherwise.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 19:00:54 -07:00
Mark Einon
11f3c1f583 net: ethernet: et131x: Remove redundant register read
Following the removal of an unused variable assignment (remove
unused variable 'pm_csr') the associated register read can also go,
as the read also occurs in the subsequent et1310_in_phy_coma()
call.

Signed-off-by: Mark Einon <mark.einon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 18:48:15 -07:00
Zhang Changzhong
eacc43d2c3 net: ethernet: et131x: Remove unused variable 'pm_csr'
Gcc report warning as follows:

drivers/net/ethernet/agere/et131x.c:953:6: warning:
 variable 'pm_csr' set but not used [-Wunused-but-set-variable]
  953 |  u32 pm_csr;
      |      ^~~~~~
drivers/net/ethernet/agere/et131x.c:1002:6⚠️
 variable 'pm_csr' set but not used [-Wunused-but-set-variable]
 1002 |  u32 pm_csr;
      |      ^~~~~~
drivers/net/ethernet/agere/et131x.c:3446:8: warning:
 variable 'pm_csr' set but not used [-Wunused-but-set-variable]
 3446 |    u32 pm_csr;
      |        ^~~~~~

After commit 38df6492eb ("et131x: Add PCIe gigabit ethernet driver
et131x to drivers/net"), 'pm_csr' is never used in these functions,
so removing it to avoid build warning.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Mark Einon <mark.einon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 18:43:40 -07:00
Zhang Changzhong
5686b10978 net: bna: Remove unused variable 't'
Gcc report warning as follows:

drivers/net/ethernet/brocade/bna/bfa_ioc.c:1538:6: warning:
 variable 't' set but not used [-Wunused-but-set-variable]
 1538 |  u32 t;
      |      ^

After commit c107ba171f ("bna: Firmware Patch Simplification"),
't' is never used, so removing it to avoid build warning.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 18:42:28 -07:00
Jakub Kicinski
18c7015cc6 net: bnxt: don't complain if TC flower can't be supported
The fact that NETIF_F_HW_TC is not set should be a sufficient
indication to the user that TC offloads are not supported.
No need to bother users of older firmware versions with
pointless warnings on every boot.

Also, since the support is optional, bnxt_init_tc() should not
return an error in case FW is old, similarly to how error
is not returned when CONFIG_BNXT_FLOWER_OFFLOAD is not set.

With that we can add an error message to the caller, to warn
about actual unexpected failures.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 18:26:20 -07:00
David S. Miller
d44a919a5c mlx5-updates-2020-07-16
Fixes:
 1) Fix build break when CONFIG_XPS is not set
 2) Fix missing switch_id for representors
 
 Updates:
 1) IPsec XFRM RX offloads from Raed and Huy.
   - Added IPSec RX steering flow tables to NIC RX
   - Refactoring of the existing FPGA IPSec, to add support
     for ConnectX IPsec.
   - RX data path handling for IPSec traffic
   - Synchronize offloading device ESN with xfrm received SN
 
 2) Parav allows E-Switch to siwtch to switchdev mode directly without
    the need to go through legacy mode first.
 
 3) From Tariq, Misc updates including:
    3.1) indirect calls for RX and XDP handlers
    3.2) Make MLX5_EN_TLS non-prompt as it should always be enabled when
         TLS and MLX5_EN are selected.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl8Q5J0ACgkQSD+KveBX
 +j5zhggAm/8ILhtG04BBKeQGay+m4CCg9qK7BrIavU3ta2t+DQdAxE+XmmHl+W2F
 DfL5sR0AiV8z8v6OF6Yjrh49Ys6k7LFh6msFP2vyVkUC6t02zRv7WYMlZn44Igqb
 Jg8n4Q806y5g2RJRmV/QFz9nOq8jxL/CXxA7eLCMiRSQKHl3LQ3TXbvvLJRY6ab2
 aZT9fhi6lJWhe7Rii932oUM+USikmilFgB0tBoSgVQ9fxa+cNTuMb2y/IKHQo5pi
 O9OUUKbPgYy3+xah+FCPLMx4izyv8F36XA7z6fGhtsM74pmFvC5e2eWOoqriWeBO
 8SL2m2+FSUnuoI6S2wKsBl5dePdezQ==
 =p788
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2020-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-07-16

Fixes:
1) Fix build break when CONFIG_XPS is not set
2) Fix missing switch_id for representors

Updates:
1) IPsec XFRM RX offloads from Raed and Huy.
  - Added IPSec RX steering flow tables to NIC RX
  - Refactoring of the existing FPGA IPSec, to add support
    for ConnectX IPsec.
  - RX data path handling for IPSec traffic
  - Synchronize offloading device ESN with xfrm received SN

2) Parav allows E-Switch to siwtch to switchdev mode directly without
   the need to go through legacy mode first.

3) From Tariq, Misc updates including:
   3.1) indirect calls for RX and XDP handlers
   3.2) Make MLX5_EN_TLS non-prompt as it should always be enabled when
        TLS and MLX5_EN are selected.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 13:04:17 -07:00
Christophe JAILLET
721dab2b56 net: alteon: Avoid some useless memset
Avoid a memset after a call to 'dma_alloc_coherent()'.
This is useless since
commit 518a2f1925 ("dma-mapping: zero memory returned from dma_alloc_*")

Replace a kmalloc+memset with a corresponding kzalloc.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:57:59 -07:00
Christophe JAILLET
f4079e5d72 net: alteon: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'ace_allocate_descriptors()' and
'ace_init()' GFP_KERNEL can be used because both functions are called from
the probe function and no lock is acquired.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:57:21 -07:00
Christophe JAILLET
8d4f62ca19 net: sungem: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'gem_init_one()', GFP_KERNEL can be used
because it is a probe function and no lock is acquired.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:56:40 -07:00
Suraj Upadhyay
e0c3f4c4fd net: decnet: af_decnet: Simplify goto loop.
Replace goto loop with while loop.

Signed-off-by: Suraj Upadhyay <usuraj35@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:56:11 -07:00
David S. Miller
c4fefd5a33 Merge branch 'tcp-dsack-multi-seg'
Priyaranjan Jha says:

====================
tcp: improve handling of DSACK covering multiple segments

Currently, while processing DSACK, we assume DSACK covers only one
segment. This leads to significant underestimation of no. of duplicate
segments with LRO/GRO. Also, the existing SNMP counters, TCPDSACKRecv
and TCPDSACKOfoRecv, make similar assumption for DSACK, which makes them
unusable for estimating spurious retransmit rates.

This patch series fixes the segment accounting with DSACK, by estimating
number of duplicate segments based on: (DSACKed sequence range) / MSS.
It also introduces a new SNMP counter, TCPDSACKRecvSegs, which tracks
the estimated number of duplicate segments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:54:31 -07:00
Priyaranjan Jha
e3a5a1e8b6 tcp: add SNMP counter for no. of duplicate segments reported by DSACK
There are two existing SNMP counters, TCPDSACKRecv and TCPDSACKOfoRecv,
which are incremented depending on whether the DSACKed range is below
the cumulative ACK sequence number or not. Unfortunately, these both
implicitly assume each DSACK covers only one segment. This makes these
counters unusable for estimating spurious retransmit rates,
or real/non-spurious loss rate.

This patch introduces a new SNMP counter, TCPDSACKRecvSegs, which tracks
the estimated number of duplicate segments based on:
(DSACKed sequence range) / MSS. This counter is usable for estimating
spurious retransmit rates, or real/non-spurious loss rate.

Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:54:30 -07:00
Priyaranjan Jha
a71d77e6be tcp: fix segment accounting when DSACK range covers multiple segments
Currently, while processing DSACK, we assume DSACK covers only one
segment. This leads to significant underestimation of DSACKs with
LRO/GRO. This patch fixes segment accounting with DSACK by estimating
segment count from DSACK sequence range / MSS.

Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:54:30 -07:00
Christophe JAILLET
dcc82bb072 net: sun: cassini: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'cas_tx_tiny_alloc()', GFP_KERNEL can be used
because a few lines below in its only caller, 'cas_alloc_rxds()', is also
called. This function makes an explicit use of GFP_KERNEL.

When memory is allocated in 'cas_init_one()', GFP_KERNEL can be used
because it is a probe function and no lock is acquired.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:52:56 -07:00
Davide Caratti
8c72894048 mptcp: silence warning in subflow_data_ready()
since commit d47a721520 ("mptcp: fix race in subflow_data_ready()"), it
is possible to observe a regression in MP_JOIN kselftests. For sockets in
TCP_CLOSE state, it's not sufficient to just wake up the main socket: we
also need to ensure that received data are made available to the reader.
Silence the WARN_ON_ONCE() in these cases: it preserves the syzkaller fix
and restores kselftests	when they are ran as follows:

  # while true; do
  > make KBUILD_OUTPUT=/tmp/kselftest TARGETS=net/mptcp kselftest
  > done

Reported-by: Florian Westphal <fw@strlen.de>
Fixes: d47a721520 ("mptcp: fix race in subflow_data_ready()")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/47
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:47:00 -07:00
David S. Miller
79814d8179 Merge branch 'usbnet-multicast-filter-support-for-cdc-ncm-devices'
Bjørn Mork says:

====================
usbnet: multicast filter support for cdc ncm devices

This revives a 2 year old patch set from Miguel Rodríguez
Pérez, which appears to have been lost somewhere along the
way.  I've based it on the last version I found (v4), and
added one patch which I believe must have been missing in
the original.

I kept Oliver's ack on one of the patches, since both the patch and
the motivation still is the same.  Hope this is OK..

Thanks to the anonymous user <wxcafe@wxcafe.net> for bringing up this
problem in https://bugs.debian.org/965074

This is only build and load tested by me.  I don't have any device
where I can test the actual functionality.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:42:47 -07:00
Miguel Rodríguez Pérez
e10dcb1b6b net: cdc_ncm: hook into set_rx_mode to admit multicast traffic
We set set_rx_mode to usbnet_cdc_update_filter provided
by cdc_ether that simply admits all multicast traffic
if there is more than one multicast filter configured.

Signed-off-by: Miguel Rodríguez Pérez <miguel@det.uvigo.gal>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:42:47 -07:00
Miguel Rodríguez Pérez
37a2ebdd9e net: cdc_ncm: add .ndo_set_rx_mode to cdc_ncm_netdev_ops
The cdc_ncm driver overrides the net_device_ops structure used by usbnet
to be able to hook into .ndo_change_mtu. However, the structure was
missing the .ndo_set_rx_mode field, preventing the driver from
hooking into usbnet's set_rx_mode. This patch adds the missing callback to
usbnet_set_rx_mode in net_device_ops.

Signed-off-by: Miguel Rodríguez Pérez <miguel@det.uvigo.gal>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:42:47 -07:00
Bjørn Mork
1ea2b748b5 net: usbnet: export usbnet_set_rx_mode()
This function can be reused by other usbnet minidrivers.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:42:47 -07:00
Miguel Rodríguez Pérez
e506addeff net: cdc_ether: export usbnet_cdc_update_filter
This makes the function available to other drivers, like cdc_ncm.

Signed-off-by: Miguel Rodríguez Pérez <miguel@det.uvigo.gal>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:42:47 -07:00
Miguel Rodríguez Pérez
0226009ce0 net: cdc_ether: use dev->intf to get interface information
usbnet_cdc_update_filter was getting the interface number from the
usb_interface struct in cdc_state->control. However, cdc_ncm does
not initialize that structure in its bind function, but uses
cdc_ncm_ctx instead. Getting intf directly from struct usbnet solves
the problem.

Signed-off-by: Miguel Rodríguez Pérez <miguel@det.uvigo.gal>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:42:47 -07:00
Eelco Chaudron
eac87c413b net: openvswitch: reorder masks array based on usage
This patch reorders the masks array every 4 seconds based on their
usage count. This greatly reduces the masks per packet hit, and
hence the overall performance. Especially in the OVS/OVN case for
OpenShift.

Here are some results from the OVS/OVN OpenShift test, which use
8 pods, each pod having 512 uperf connections, each connection
sends a 64-byte request and gets a 1024-byte response (TCP).
All uperf clients are on 1 worker node while all uperf servers are
on the other worker node.

Kernel without this patch     :  7.71 Gbps
Kernel with this patch applied: 14.52 Gbps

We also run some tests to verify the rebalance activity does not
lower the flow insertion rate, which does not.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Tested-by: Andrew Theurer <atheurer@redhat.com>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 10:36:50 -07:00