Currently accepted msk sockets become established only after
accept() returns the new sk to user-space.
As MP_JOIN request are refused as per RFC spec on non fully
established socket, the above causes mp_join self-tests
instabilities.
This change lets the msk entering the established status
as soon as it receives the 3rd ack and propagates the first
subflow fully established status on the msk socket.
Finally we can change the subflow acceptance condition to
take in account both the sock state and the msk fully
established flag.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the unlikely event of a failure at connect time,
we currently clear the request_mptcp flag - so that
the MPC handshake is not started at all, but the msk
is not explicitly marked as fallback.
This would lead to later insertion of wrong DSS options
in the xmitted packets, in violation of RFC specs and
possibly fooling the peer.
Fixes: e1ff9e82e2 ("net: mptcp: improve fallback to TCP")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When updating a partially acked data fragment, we
actually corrupt it. This is irrelevant till we send
data on a single subflow, as retransmitted data, if
any are discarded by the peer as duplicate, but it
will cause data corruption as soon as we will start
creating non backup subflows.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we do not init the subflow write sequence for
MP_JOIN subflows. This will cause bad mapping being
generated as soon as we will use non backup subflow.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
checkpatch warned about the L2TP_SKB_CB macro's use of its argument: add
braces to avoid the problem.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In l2tp_core.c both l2tp_tunnel_create and l2tp_session_create take
quite a number of arguments and have a correspondingly long prototype.
This is both quite difficult to scan visually, and triggers checkpatch
warnings.
Add a line break to make these function prototypes more readable.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
checkpatch warns about use of seq_printf where seq_puts would do.
Modify l2tp_debugfs accordingly.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use BIT(x) rather than (1<<x), reported by checkpatch.pl.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported by checkpatch:
"WARNING: function definition argument 'struct sock *'
should also have an identifier name"
Add an identifier name to help document the prototype.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
l2tp_core has conditionally compiled code in l2tp_xmit_skb for IPv6
support. The structure of this code triggered a checkpatch warning
due to incorrect indentation.
Fix up the indentation to address the checkpatch warning.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arguments should be aligned with the function call open parenthesis as
per checkpatch. Tweak some function calls which were not aligned
correctly.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some l2tp code had line breaks which made the code more difficult to
read. These were originally motivated by the 80-character line width
coding guidelines, but were actually a negative from the perspective of
trying to follow the code.
Remove these linebreaks for clearer code, even if we do exceed 80
characters in width in some places.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify some l2tp comments to better adhere to kernel coding style, as
reported by checkpatch.pl.
Add descriptive comments for the l2tp per-net spinlocks to document
their use.
Fix an incorrect comment in l2tp_recv_common:
RFC2661 section 5.4 states that:
"The LNS controls enabling and disabling of sequence numbers by sending a
data message with or without sequence numbers present at any time during
the life of a session."
l2tp handles this correctly in l2tp_recv_common, but the comment around
the code was incorrect and confusing. Fix up the comment accordingly.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up various whitespace issues as reported by checkpatch.pl:
* remove spaces around operators where appropriate,
* add missing blank lines following declarations,
* remove multiple blank lines, or trailing blank lines at the end of
functions.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently devlink instance is searched on all doit() operations.
But it is optionally stored into user_ptr[0]. This requires
rediscovering devlink again doing post_doit().
Few devlink commands related to port shared buffers needs 3 pointers
(devlink, devlink_port, and devlink_sb) while executing doit commands.
Though devlink pointer can be derived from the devlink_port during
post_doit() operation when doit() callback has acquired devlink
instance lock, relying on such scheme to access devlik pointer makes
code very fragile.
Hence, to avoid ambiguity in post_doit() and to avoid searching
devlink instance again, simplify code by always storing devlink
instance in user_ptr[0] and derive devlink_sb pointer in their
respective callback routines.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise the 'chain_len' filed will carry random values,
some token creation calls will fail due to excessive chain
length, causing unexpected fallback to TCP.
Fixes: 2c5ebd001d ("mptcp: refactor token container")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for the SIOCOUTQ IOCTL to get the send buffer fill
of a DCCP socket, like UDP and TCP sockets already have.
Regarding the used data field: DCCP uses per packet sequence numbers,
not per byte, so sequence numbers can't be used like in TCP. sk_wmem_queued
is not used by DCCP and always 0, even in test on highly congested paths.
Therefore this uses sk_wmem_alloc like in UDP.
Signed-off-by: Richard Sailer <richard_siegfried@systemli.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to unified Ethernet Switch Device Tree Bindings allow for ethernet-ports as
encapsulating node as well.
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The buildbot found a config where the header isn't already implicitly
pulled in, so add an explicit include as well.
Fixes: 8c918ffbba ("net: remove compat_sock_common_{get,set}sockopt")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2020-07-21
The following pull-request contains BPF updates for your *net-next* tree.
We've added 46 non-merge commits during the last 6 day(s) which contain
a total of 68 files changed, 4929 insertions(+), 526 deletions(-).
The main changes are:
1) Run BPF program on socket lookup, from Jakub.
2) Introduce cpumap, from Lorenzo.
3) s390 JIT fixes, from Ilya.
4) teach riscv JIT to emit compressed insns, from Luke.
5) use build time computed BTF ids in bpf iter, from Yonghong.
====================
Purely independent overlapping changes in both filter.h and xdp.h
Signed-off-by: David S. Miller <davem@davemloft.net>
Constify devlink instance pointer while checking if reload operation is
supported or not.
This helps to review the scope of checks done in reload.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reload operation is enabled or not is already checked by
devlink_reload(). Hence, remove the duplicate check from
devlink_nl_cmd_reload().
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to hold a device global lock when initializing
devlink device fields of a devlink instance which is not yet part of the
devices list.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One additional field btf_id is added to struct
bpf_ctx_arg_aux to store the precomputed btf_ids.
The btf_id is computed at build time with
BTF_ID_LIST or BTF_ID_LIST_GLOBAL macro definitions.
All existing bpf iterators are changed to used
pre-compute btf_ids.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200720163403.1393551-1-yhs@fb.com
tcp and udp bpf_iter can reuse some socket ids in
btf_sock_ids, so make it global.
I put the extern definition in btf_ids.h as a central
place so it can be easily discovered by developers.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200720163402.1393427-1-yhs@fb.com
Currently, socket types (struct tcp_sock, udp_sock, etc.)
used by bpf_skc_to_*() helpers are computed when vmlinux_btf
is first built in the kernel.
Commit 5a2798ab32
("bpf: Add BTF_ID_LIST/BTF_ID/BTF_ID_UNUSED macros")
implemented a mechanism to compute btf_ids at kernel build
time which can simplify kernel implementation and reduce
runtime overhead by removing in-kernel btf_id calculation.
This patch did exactly this, removing in-kernel btf_id
computation and utilizing build-time btf_id computation.
If CONFIG_DEBUG_INFO_BTF is not defined, BTF_ID_LIST will
define an array with size of 5, which is not enough for
btf_sock_ids. So define its own static array if
CONFIG_DEBUG_INFO_BTF is not defined.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200720163358.1393023-1-yhs@fb.com
Now that DSA supports MTU configuration, undo the effects of commit
8b1efc0f83 ("net: remove MTU limits on a few ether_setup callers") and
let DSA interfaces use the default min_mtu and max_mtu specified by
ether_setup(). This is more important for min_mtu: since DSA is
Ethernet, the minimum MTU is the same as of any other Ethernet
interface, and definitely not zero. For the max_mtu, we have a callback
through which drivers can override that, if they want to.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because kfree_skb already checked NULL skb parameter,
so the additional checks are unnecessary, just remove them.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we have all the infrastructure in place for calling into the
dsa_ptr->netdev_ops function pointers, install them when we configure
the DSA CPU/management interface and tear them down. The flow is
unchanged from before, but now we preserve equality of tests when
network device drivers do tests like dev->netdev_ops == &foo_ops which
was not the case before since we were allocating an entirely new
structure.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the core net_device code call into our ndo_do_ioctl() and
ndo_get_phys_port_name() functions via the wrappers defined previously
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for adding another layer of call into a DSA stacked ops
singleton, wrap the ndo_do_ioctl() call into dev_do_ioctl().
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add setsockopt SOL_IP/IP_RECVERR_4884 to return the offset to an
extension struct if present.
ICMP messages may include an extension structure after the original
datagram. RFC 4884 standardized this behavior. It stores the offset
in words to the extension header in u8 icmphdr.un.reserved[1].
The field is valid only for ICMP types destination unreachable, time
exceeded and parameter problem, if length is at least 128 bytes and
entire packet does not exceed 576 bytes.
Return the offset to the start of the extension struct when reading an
ICMP error from the error queue, if it matches the above constraints.
Do not return the raw u8 field. Return the offset from the start of
the user buffer, in bytes. The kernel does not return the network and
transport headers, so subtract those.
Also validate the headers. Return the offset regardless of validation,
as an invalid extension must still not be misinterpreted as part of
the original datagram. Note that !invalid does not imply valid. If
the extension version does not match, no validation can take place,
for instance.
For backward compatibility, make this optional, set by setsockopt
SOL_IP/IP_RECVERR_RFC4884. For API example and feature test, see
github.com/wdebruij/kerneltools/blob/master/tests/recv_icmp_v2.c
For forward compatibility, reserve only setsockopt value 1, leaving
other bits for additional icmp extensions.
Changes
v1->v2:
- convert word offset to byte offset from start of user buffer
- return in ee_data as u8 may be insufficient
- define extension struct and object header structs
- return len only if constraints met
- if returning len, also validate
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is just used once, and a direct return for the redirect to the AF
case is much easier to follow than jumping to the end of a very long
function.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the kernel pointer that sctp_setsockopt has available instead of
directly handling the user pointer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>