Not every CLC proposal message needs the maximum buffer length.
Due to the MSG_WAITALL flag, it is important to use the peeked
real length when receiving the message.
Fixes: d63d271ce2 ("smc: switch to sock_recvmsg()")
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
llc_conn_send_pdu() pushes the skb into write queue and
calls llc_conn_send_pdus() to flush them out. However, the
status of dev_queue_xmit() is not returned to caller,
in this case, llc_conn_state_process().
llc_conn_state_process() needs hold the skb no matter
success or failure, because it still uses it after that,
therefore we should hold skb before dev_queue_xmit() when
that skb is the one being processed by llc_conn_state_process().
For other callers, they can just pass NULL and ignore
the return value as they are.
Reported-by: Noam Rathaus <noamr@beyondsecurity.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
strp_parser_err is called with a negative code everywhere, which then
calls abort_parser with a negative code. strp_msg_timeout calls
abort_parser directly with a positive code. Negate ETIMEDOUT
to match signed-ness of other calls.
The default abort_parser callback, strp_abort_strp, sets
sk->sk_err to err. Also negate the error here so sk_err always
holds a positive value, as the rest of the net code expects. Currently
a negative sk_err can result in endless loops, or user code that
thinks it actually sent/received err bytes.
Found while testing net/tls_sw recv path.
Fixes: 43a0c6751a ("strparser: Stream parser for messages")
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes a bug in the tcf_dump_walker function that can cause some actions
to not be reported when dumping a large number of actions. This issue
became more aggrevated when cookies feature was added. In particular
this issue is manifest when large cookie values are assigned to the
actions and when enough actions are created that the resulting table
must be dumped in multiple batches.
The number of actions returned in each batch is limited by the total
number of actions and the memory buffer size. With small cookies
the numeric limit is reached before the buffer size limit, which avoids
the code path triggering this bug. When large cookies are used buffer
fills before the numeric limit, and the erroneous code path is hit.
For example after creating 32 csum actions with the cookie
aaaabbbbccccdddd
$ tc actions ls action csum
total acts 26
action order 0: csum (tcp) action continue
index 1 ref 1 bind 0
cookie aaaabbbbccccdddd
.....
action order 25: csum (tcp) action continue
index 26 ref 1 bind 0
cookie aaaabbbbccccdddd
total acts 6
action order 0: csum (tcp) action continue
index 28 ref 1 bind 0
cookie aaaabbbbccccdddd
......
action order 5: csum (tcp) action continue
index 32 ref 1 bind 0
cookie aaaabbbbccccdddd
Note that the action with index 27 is omitted from the report.
Fixes: 4b3550ef53 ("[NET_SCHED]: Use nla_nest_start/nla_nest_end")"
Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- fix multicast-via-unicast transmissions for AP isolation and gateway
extension, by Linus Luessing (2 patches)
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlq466kWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeobv5EACwz70r21xIUxwq6FWXysTfxxXt
H3qcXt2BzCaGYKmYt8xqz9z368v7wpgCsPfZsvW2eBYhcpCkGQm6PR2XNy7WMoBz
6aFO4SDDi5L/S+5O0iXg79niRiTe9lCbOl/r6uv/2FtY22rIfhocFwkBL8cjIC3E
hOWuFUbk3mAu8i+WMyhfDEUL33oy9CMvlJqaxqJuTHf3HxjOeGVjusOvSbQu+OGG
K+TCAbIjazQ02R9p6lXQoIAjfg+kfsYIdT84MTgjk91qCtp1ztRxr3MNAwTgPvA4
Vi4uZGLbLCMvaAF1wqVBSuOnFFwbUCc8IBIoUvvZSMQTYm3agcg96pFLMYqNzrhY
aHco1neJ+a+pGUmByijmYbsrJI2dYqK0V0OZlLG9WKwkrE3Y023LGGdXUbdl4RhS
LXqMLGVJ9eqV+m7MCuv/q/PTVOL89loAun0DuIU4IhJuDM2+5yEG5jBiI6ImIym+
KX3F9F9nyefr5aCYsd14izX6WHxTXkJQaVpjVNBP56P6eZMLMx81eozBD9eotFyv
A1EgQolLFEKWmtKU2KUK+qrGFNXaLlc9z8ZGkbizi/CTEjN0tr1UjW6B6toOOoZ+
fhF3HwlpgqVOdDodT+eDtxY8YboZBCpAnAOiE1LP3leVY4yab23Q02kskOPTPg1G
CfpZwDjXIyZburUvfA==
=j2w0
-----END PGP SIGNATURE-----
Merge tag 'batadv-net-for-davem-20180326' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- fix multicast-via-unicast transmissions for AP isolation and gateway
extension, by Linus Luessing (2 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
After the qdisc lock was dropped in pfifo_fast we allow multiple
enqueue threads and dequeue threads to run in parallel. On the
enqueue side the skb bit ooo_okay is used to ensure all related
skbs are enqueued in-order. On the dequeue side though there is
no similar logic. What we observe is with fewer queues than CPUs
it is possible to re-order packets when two instances of
__qdisc_run() are running in parallel. Each thread will dequeue
a skb and then whichever thread calls the ndo op first will
be sent on the wire. This doesn't typically happen because
qdisc_run() is usually triggered by the same core that did the
enqueue. However, drivers will trigger __netif_schedule()
when queues are transitioning from stopped to awake using the
netif_tx_wake_* APIs. When this happens netif_schedule() calls
qdisc_run() on the same CPU that did the netif_tx_wake_* which
is usually done in the interrupt completion context. This CPU
is selected with the irq affinity which is unrelated to the
enqueue operations.
To resolve this we add a RUNNING bit to the qdisc to ensure
only a single dequeue per qdisc is running. Enqueue and dequeue
operations can still run in parallel and also on multi queue
NICs we can still have a dequeue in-flight per qdisc, which
is typically per CPU.
Fixes: c5ad119fb6 ("net: sched: pfifo_fast use skb_array")
Reported-by: Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KMSAN reports use of uninitialized memory in the case when |alen| is
smaller than sizeof(struct sockaddr_nl), and therefore |nladdr| isn't
fully copied from the userspace.
Signed-off-by: Alexander Potapenko <glider@google.com>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the SMC experimental TCP option in a SYN packet is lost on
the server side when SYN Cookies are active. However, the corresponding
SYNACK sent back to the client contains the SMC option. This causes an
inconsistent view of the SMC capabilities on the client and server.
This patch disables the SMC option in the SYNACK when SYN Cookies are
active to avoid this issue.
Fixes: 60e2a77807 ("tcp: TCP experimental option for SMC")
Signed-off-by: Hans Wippel <hwippel@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Don't pick fixed hash implementation for NFT_SET_EVAL sets, otherwise
userspace hits EOPNOTSUPP with valid rules using the meter statement,
from Florian Westphal.
2) If you send a batch that flushes the existing ruleset (that contains
a NAT chain) and the new ruleset definition comes with a new NAT
chain, don't bogusly hit EBUSY. Also from Florian.
3) Missing netlink policy attribute validation, from Florian.
4) Detach conntrack template from skbuff if IP_NODEFRAG is set on,
from Paolo Abeni.
5) Cache device names in flowtable object, otherwise we may end up
walking over devices going aways given no rtnl_lock is held.
6) Fix incorrect net_device ingress with ingress hooks.
7) Fix crash when trying to read more data than available in UDP
packets from the nf_socket infrastructure, from Subash.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_header_pointer will copy data into a buffer if data is non linear,
otherwise it will return a pointer in the linear section of the data.
nf_sk_lookup_slow_v{4,6} always copies data of size udphdr but later
accesses memory within the size of tcphdr (th->doff) in case of TCP
packets. This causes a crash when running with KASAN with the following
call stack -
BUG: KASAN: stack-out-of-bounds in xt_socket_lookup_slow_v4+0x524/0x718
net/netfilter/xt_socket.c:178
Read of size 2 at addr ffffffe3d417a87c by task syz-executor/28971
CPU: 2 PID: 28971 Comm: syz-executor Tainted: G B W O 4.9.65+ #1
Call trace:
[<ffffff9467e8d390>] dump_backtrace+0x0/0x428 arch/arm64/kernel/traps.c:76
[<ffffff9467e8d7e0>] show_stack+0x28/0x38 arch/arm64/kernel/traps.c:226
[<ffffff946842d9b8>] __dump_stack lib/dump_stack.c:15 [inline]
[<ffffff946842d9b8>] dump_stack+0xd4/0x124 lib/dump_stack.c:51
[<ffffff946811d4b0>] print_address_description+0x68/0x258 mm/kasan/report.c:248
[<ffffff946811d8c8>] kasan_report_error mm/kasan/report.c:347 [inline]
[<ffffff946811d8c8>] kasan_report.part.2+0x228/0x2f0 mm/kasan/report.c:371
[<ffffff946811df44>] kasan_report+0x5c/0x70 mm/kasan/report.c:372
[<ffffff946811bebc>] check_memory_region_inline mm/kasan/kasan.c:308 [inline]
[<ffffff946811bebc>] __asan_load2+0x84/0x98 mm/kasan/kasan.c:739
[<ffffff94694d6f04>] __tcp_hdrlen include/linux/tcp.h:35 [inline]
[<ffffff94694d6f04>] xt_socket_lookup_slow_v4+0x524/0x718 net/netfilter/xt_socket.c:178
Fix this by copying data into appropriate size headers based on protocol.
Fixes: a583636a83 ("inet: refactor inet[6]_lookup functions to take skb")
Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
DHCP connectivity issues can currently occur if the following conditions
are met:
1) A DHCP packet from a client to a server
2) This packet has a multicast destination
3) This destination has a matching entry in the translation table
(FF:FF:FF:FF:FF:FF for IPv4, 33:33:00:01:00:02/33:33:00:01:00:03
for IPv6)
4) The orig-node determined by TT for the multicast destination
does not match the orig-node determined by best-gateway-selection
In this case the DHCP packet will be dropped.
The "gateway-out-of-range" check is supposed to only be applied to
unicasted DHCP packets to a specific DHCP server.
In that case dropping the the unicasted frame forces the client to
retry via a broadcasted one, but now directed to the new best
gateway.
A DHCP packet with broadcast/multicast destination is already ensured to
always be delivered to the best gateway. Dropping a multicasted
DHCP packet here will only prevent completing DHCP as there is no
other fallback.
So far, it seems the unicast check was implicitly performed by
expecting the batadv_transtable_search() to return NULL for multicast
destinations. However, a multicast address could have always ended up in
the translation table and in fact is now common.
To fix this potential loss of a DHCP client-to-server packet to a
multicast address this patch adds an explicit multicast destination
check to reliably bail out of the gateway-out-of-range check for such
destinations.
The issue and fix were tested in the following three node setup:
- Line topology, A-B-C
- A: gateway client, DHCP client
- B: gateway server, hop-penalty increased: 30->60, DHCP server
- C: gateway server, code modifications to announce FF:FF:FF:FF:FF:FF
Without this patch, A would never transmit its DHCP Discover packet
due to an always "out-of-range" condition. With this patch,
a full DHCP handshake between A and B was possible again.
Fixes: be7af5cf9c ("batman-adv: refactoring gateway handling code")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
For multicast frames AP isolation is only supposed to be checked on
the receiving nodes and never on the originating one.
Furthermore, the isolation or wifi flag bits should only be intepreted
as such for unicast and never multicast TT entries.
By injecting flags to the multicast TT entry claimed by a single
target node it was verified in tests that this multicast address
becomes unreachable, leading to packet loss.
Omitting the "src" parameter to the batadv_transtable_search() call
successfully skipped the AP isolation check and made the target
reachable again.
Fixes: 1d8ab8d3c1 ("batman-adv: Modified forwarding behaviour for multicast packets")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
For tunnels created with IFLA_MTU, MTU of the netdevice is set by
rtnl_create_link() (called from rtnl_newlink()) before the device is
registered. However without IFLA_MTU that's not done.
rtnl_newlink() proceeds by calling struct rtnl_link_ops.newlink, which
via ip_tunnel_newlink() calls register_netdevice(), and that emits
NETDEV_REGISTER. Thus any listeners that inspect the netdevice get the
MTU of 0.
After ip_tunnel_newlink() corrects the MTU after registering the
netdevice, but since there's no event, the listeners don't get to know
about the MTU until something else happens--such as a NETDEV_UP event.
That's not ideal.
So instead of setting the MTU directly, go through dev_set_mtu(), which
takes care of distributing the necessary NETDEV_PRECHANGEMTU and
NETDEV_CHANGEMTU events.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* ath9k_htc doesn't like QoS NDP frames, use regular ones
* hwsim: set up wmediumd for radios created later
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlqySjUACgkQB8qZga/f
l8RDig//bV/Fnwn7deyR7LJOi/g6HVyueCUi+cturTo9RQEQHnBtRVRu3c2Lnd+o
74f+2ofEWEFOYqTvZq5jUWjANOZ/ZGgA+fUw6tOfSmjkEPw9EkST8mQl7lKH0dSR
DYrRLKCHwPof9MQQgXHLq44e/26yFCxYP+ADSn9Q6yqo84e/cxP76nqSwYGLforx
By1zzkMXPKtfvXTZ41UZTfXRMml2LIxBbCoWTfScIZWvusQjCl653f3lBfR3fynj
qwzJJfcjt1TnOQSgCLzk03xi+ci5o157//GmvjT2hrRjRL3i22e+cmFtnXGNCjtg
avmc7HftV0sAY8gecqhCLOiWnCuxxjDz1KxZW3dccuJxh1/jfsGtby3H/6stkU4s
EU7AU/QhXL7HPHop+fyI+DapFmry0/h1tdKqoSGffONF/qMJmbkXFUvG0kAMqoWh
nb/LTlGaVqk8Bz7hNbzP6zkPgEep3i6dBC9iRdfK3gclcEALsh2Z0mx6+ae3FEX3
HHbfB9RTDYUT0Tk33cVFrjMsOpwdhcGsu1ncMJc/iuTOGI/AO6TVhgPEjEj/4xth
FO7SzpIpRUe47yKqM0Ykvoz5EwE2IWUt0/eXj+1X2hPNGpcYEZ4bvHQGk8D2h7tA
kwd0YqEXjiUw53cSnS4+A45ecOhErH/ZND/ULb65KMl/JDf5gOQ=
=fs06
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-davem-2018-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Two more fixes (in three patches):
* ath9k_htc doesn't like QoS NDP frames, use regular ones
* hwsim: set up wmediumd for radios created later
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
For multipath routes the ONLINK flag can be specified per nexthop in
rtnh_flags or globally in rtm_flags. Update ip6_route_multipath_add
to consider the ONLINK setting coming from rtnh_flags. Each loop over
nexthops the config for the sibling route is initialized to the global
config and then per nexthop settings overlayed. The flag is 'or'ed into
fib6_config to handle the ONLINK flag coming from either rtm_flags or
rtnh_flags.
Fixes: fc1e64e109 ("net/ipv6: Add support for onlink flag")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- fix possible IPv6 packet loss when multicast extension is used, by Linus Luessing
- fix SKB handling issues for TTVN and DAT, by Matthias Schiffer (two patches)
- fix include for eventpoll, by Sven Eckelmann
- fix skb checksum for ttvn reroutes, by Sven Eckelmann
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlqv5tcWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeob7PD/wPnjVmFl6uQlRHfOfUBzGvW9hU
ASakGylzfZTXzP2VviMJ+7JehXQgediM4caTr/8jWhJFeVxu5jmeYzv8nzOlwZJ/
PjNlX+ChuIAWv1wyxHI3YWwj7Y1Ox9Lp8Gs7m5UbmfyiZEtb+ybHjvPzZ/FMfhWo
hOgbndSpFKfwFuXFkXyitektBcKTiFUyjk9U22v91XCQZZzMb8KqougVBE+bdg0w
N8LfgKXVC448sUNKTEczhji7bFuJwR+Sogx1TwIIsyfT2Tp4JXY3A7M1LDwAL/Rc
nTu9L58vk4qUouRjIQJ7rhhoxdPSrD5p3oinzVnLvJBxgOS3pxegY3asX8sRMXsj
bgolIGCZ9vDfA21WXqa3RyjUiBEx9UId6W++h+22kVqVHt5tqIFK2nVQ6IInOXwB
kzd3UDrRBLgRdBpwlu/ii9rn68MOEdLNpL3Seo9DkViJESfLn1ODnFA1Y2rFoK6S
cVeYl1DVj4SQSXjNilqYhFZoTEo/kxN5KhgHRXzujTv6MCHVa2SnCZaXS5EXp8n6
u4gpPc363Isj17A+UVlmHmmzA6d8Jqe8veVvET0xSvQPHmE7eepMkJU9hIUNXfcv
fENmQnU10pS1keJqS6xuR6k2RxMoofYRUPpKEigCcH7z/0GytWQc6QVGmo19LD9b
GNcWAhKI453pc1IuUA==
=hsM8
-----END PGP SIGNATURE-----
Merge tag 'batadv-net-for-davem-20180319' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- fix possible IPv6 packet loss when multicast extension is used, by Linus Luessing
- fix SKB handling issues for TTVN and DAT, by Matthias Schiffer (two patches)
- fix include for eventpoll, by Sven Eckelmann
- fix skb checksum for ttvn reroutes, by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The netfilter netdevice event handler hold the nfnl_lock mutex, this
avoids races with a device going away while such device is being
attached to hooks from the netlink control plane. Therefore, either
control plane bails out with ENOENT or netdevice event path waits until
the hook that is attached to net_device is registered.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Devices going away have to grab the nfnl_lock from the netdev event path
to avoid races with control plane updates.
However, netlink dumps in netfilter do not hold nfnl_lock mutex. Cache
the device name into the objects to avoid an use-after-free situation
for a device that is going away.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
tcf_skbmod_init() can fail after the idr has been successfully reserved.
When this happens, every subsequent attempt to configure skbmod rules
using the same idr value will systematically fail with -ENOSPC, unless
the first attempt was done using the 'replace' keyword:
# tc action add action skbmod swap mac index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action skbmod swap mac index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action skbmod swap mac index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in tcf_skbmod_init(), ensuring that tcf_idr_release() is called
on the error path when the idr has been reserved, but not yet inserted.
Also, don't test 'ovr' in the error path, to avoid a 'replace' failure
implicitly become a 'delete' that leaks refcount in act_skbmod module:
# rmmod act_skbmod; modprobe act_skbmod
# tc action add action skbmod swap mac index 100
# tc action add action skbmod swap mac continue index 100
RTNETLINK answers: File exists
We have an error talking to the kernel
# tc action replace action skbmod swap mac continue index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action list action skbmod
#
# rmmod act_skbmod
rmmod: ERROR: Module act_skbmod is in use
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcf_vlan_init() can fail after the idr has been successfully reserved.
When this happens, every subsequent attempt to configure vlan rules using
the same idr value will systematically fail with -ENOSPC, unless the first
attempt was done using the 'replace' keyword.
# tc action add action vlan pop index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action vlan pop index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action vlan pop index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in tcf_vlan_init(), ensuring that tcf_idr_release() is called on
the error path when the idr has been reserved, but not yet inserted. Also,
don't test 'ovr' in the error path, to avoid a 'replace' failure implicitly
become a 'delete' that leaks refcount in act_vlan module:
# rmmod act_vlan; modprobe act_vlan
# tc action add action vlan push id 5 index 100
# tc action replace action vlan push id 7 index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action list action vlan
#
# rmmod act_vlan
rmmod: ERROR: Module act_vlan is in use
Fixes: 4c5b9d9642 ("act_vlan: VLAN action rewrite to use RCU lock/unlock and update")
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__tcf_ipt_init() can fail after the idr has been successfully reserved.
When this happens, subsequent attempts to configure xt/ipt rules using
the same idr value systematically fail with -ENOSPC:
# tc action add action xt -j LOG --log-prefix test1 index 100
tablename: mangle hook: NF_IP_POST_ROUTING
target: LOG level warning prefix "test1" index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
Command "(null)" is unknown, try "tc actions help".
# tc action add action xt -j LOG --log-prefix test1 index 100
tablename: mangle hook: NF_IP_POST_ROUTING
target: LOG level warning prefix "test1" index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
Command "(null)" is unknown, try "tc actions help".
# tc action add action xt -j LOG --log-prefix test1 index 100
tablename: mangle hook: NF_IP_POST_ROUTING
target: LOG level warning prefix "test1" index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in the error path of __tcf_ipt_init(), calling tcf_idr_release()
in place of tcf_idr_cleanup(). Since tcf_ipt_release() can now be called
when tcfi_t is NULL, we also need to protect calls to ipt_destroy_target()
to avoid NULL pointer dereference.
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcf_pedit_init() can fail to allocate 'keys' after the idr has been
successfully reserved. When this happens, subsequent attempts to configure
a pedit rule using the same idr value systematically fail with -ENOSPC:
# tc action add action pedit munge ip ttl set 63 index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action pedit munge ip ttl set 63 index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action pedit munge ip ttl set 63 index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in the error path of tcf_act_pedit_init(), calling
tcf_idr_release() in place of tcf_idr_cleanup().
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcf_act_police_init() can fail after the idr has been successfully
reserved (e.g., qdisc_get_rtab() may return NULL). When this happens,
subsequent attempts to configure a police rule using the same idr value
systematiclly fail with -ENOSPC:
# tc action add action police rate 1000 burst 1000 drop index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action police rate 1000 burst 1000 drop index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action police rate 1000 burst 1000 drop index 100
RTNETLINK answers: No space left on device
...
Fix this in the error path of tcf_act_police_init(), calling
tcf_idr_release() in place of tcf_idr_cleanup().
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
if the kernel fails to duplicate 'sdata', creation of a new action fails
with -ENOMEM. However, subsequent attempts to install the same action
using the same value of 'index' systematically fail with -ENOSPC, and
that value of 'index' will no more be usable by act_simple, until rmmod /
insmod of act_simple.ko is done:
# tc actions add action simple sdata hello index 100
# tc actions list action simple
action order 0: Simple <hello>
index 100 ref 1 bind 0
# tc actions flush action simple
# tc actions add action simple sdata hello index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc actions flush action simple
# tc actions add action simple sdata hello index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc actions add action simple sdata hello index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in the error path of tcf_simp_init(), calling tcf_idr_release()
in place of tcf_idr_cleanup().
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
when the following command sequence is entered
# tc action add action bpf bytecode '4,40 0 0 12,31 0 1 2048,6 0 0 262144,6 0 0 0' index 100
RTNETLINK answers: Invalid argument
We have an error talking to the kernel
# tc action add action bpf bytecode '4,40 0 0 12,21 0 1 2048,6 0 0 262144,6 0 0 0' index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
act_bpf correctly refuses to install the first TC rule, because 31 is not
a valid instruction. However, it refuses to install the second TC rule,
even if the BPF code is correct. Furthermore, it's no more possible to
install any other rule having the same value of 'index' until act_bpf
module is unloaded/inserted again. After the idr has been reserved, call
tcf_idr_release() instead of tcf_idr_cleanup(), to fix this issue.
Fixes: 65a206c01e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7b6ddeaf27 ("mac80211: use QoS NDP for AP probing") added an
argument qos_ok to ieee80211_nullfunc_get to support QoS NDP. Despite
the claim in the commit log "Change all the drivers to *not* allow
QoS NDP for now, even though it looks like most of them should be OK
with that", this commit enables QoS NDP in response to beacons (see
change to mlme.c:ieee80211_send_nullfunc), causing ath9k_htc to lose
IP connectivity. See:
https://patchwork.kernel.org/patch/10241109/https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=891060
Introduce a hardware flag to allow such buggy drivers to override the
correct default behaviour of mac80211 of sending QoS NDP packets.
Signed-off-by: Ben Caradoc-Davies <ben@transient.nz>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fixes: 2f987a76a9 ("net: ipv6: keep sk status consistent after datagram connect failure")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code performs unneeded free. Remove the redundant skb freeing
during the error path.
Fixes: 1555d204e7 ("devlink: Support for pipeline debug (dpipe)")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Trofimovich reported that restoring an nft ruleset doesn't work
anymore unless old rule content is flushed first.
The problem stems from a recent change designed to prevent multiple nat
hooks at the same hook point locations and nftables transaction model.
A 'flush ruleset' won't take effect until the entire transaction has
completed.
So, if one has a nft.rules file that contains a 'flush ruleset',
followed by a nat hook register request, then 'nft -f file' will work,
but running 'nft -f file' again will fail with -EBUSY.
Reason is that nftables will place the flush/removal requests in the
transaction list, but it will not act on the removal until after all new
rules are in place.
The netfilter core will therefore get request to register a new nat
hook before the old one is removed -- this now fails as the netfilter
core can't know the existing hook is staged for removal.
To fix this, we can search the transaction log when a hook collision
is detected. The collision is okay if
1. there is a delete request pending for the nat hook that is already
registered.
2. there is no second add request for a matching nat hook.
This is required to only apply the exception once.
Fixes: f92b40a8b2 ("netfilter: core: only allow one nat hook per hook point")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
in nftables, 'meter' can be used to instantiate a hash-table at run
time:
rule add filter forward iif "internal" meter hostacct { ip saddr counter}
nft list meter ip filter hostacct
table ip filter {
meter hostacct {
type ipv4_addr
elements = { 192.168.0.1 : counter packets 8 bytes 2672, ..
because elemets get added on the fly, the kernel must chose a set
backend type that implements the ->update() function, otherwise
rule insertion fails with EOPNOTSUPP.
Therefore, skip set types that lack ->update, and also
make sure we do not discard a (bad) candidate when we did yet
find any candidate at all. This could happen when userspace prefers
low memory footprint -- the set implementation currently checked might
not be a fit at all. Make sure we pick it anyway (!bops). In
case next candidate is a better fix, it will be chosen instead.
But in case nothing else is found we at least have a non-ideal
match rather than no match at all.
Fixes: 6c03ae210c ("netfilter: nft_set_hash: add non-resizable hashtable implementation")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
batadv_check_unicast_ttvn may redirect a packet to itself or another
originator. This involves rewriting the ttvn and the destination address in
the batadv unicast header. These field were not yet pulled (with skb rcsum
update) and thus any change to them also requires a change in the receive
checksum.
Reported-by: Matthias Schiffer <mschiffer@universe-factory.net>
Fixes: a73105b8d4 ("batman-adv: improved client announcement mechanism")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
syzbot reported one use-after-free in pfifo_fast_enqueue() [1]
Issue here is that we can not reuse skb after a successful skb_array_produce()
since another cpu might have consumed it already.
I believe a similar problem exists in try_bulk_dequeue_skb_slow()
in case we put an skb into qdisc_enqueue_skb_bad_txq() for lockless qdisc.
[1]
BUG: KASAN: use-after-free in qdisc_pkt_len include/net/sch_generic.h:610 [inline]
BUG: KASAN: use-after-free in qdisc_qstats_cpu_backlog_inc include/net/sch_generic.h:712 [inline]
BUG: KASAN: use-after-free in pfifo_fast_enqueue+0x4bc/0x5e0 net/sched/sch_generic.c:639
Read of size 4 at addr ffff8801cede37e8 by task syzkaller717588/5543
CPU: 1 PID: 5543 Comm: syzkaller717588 Not tainted 4.16.0-rc4+ #265
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report+0x23c/0x360 mm/kasan/report.c:412
__asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
qdisc_pkt_len include/net/sch_generic.h:610 [inline]
qdisc_qstats_cpu_backlog_inc include/net/sch_generic.h:712 [inline]
pfifo_fast_enqueue+0x4bc/0x5e0 net/sched/sch_generic.c:639
__dev_xmit_skb net/core/dev.c:3216 [inline]
Fixes: c5ad119fb6 ("net: sched: pfifo_fast use skb_array")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+ed43b6903ab968b16f54@syzkaller.appspotmail.com
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Checking for 0 is insufficient: when an SKB without a batadv header, but
with a VLAN header is received, hdr_size will be 4, making the following
code interpret the Ethernet header as a batadv header.
Fixes: be1db4f661 ("batman-adv: make the Distributed ARP Table vlan aware")
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
batadv_check_unicast_ttvn() calls skb_cow(), so pointers into the SKB data
must be (re)set after calling it. The ethhdr variable is dropped
altogether.
Fixes: 7cdcf6dddc ("batman-adv: add UNICAST_4ADDR packet type")
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
When errors are enqueued to the error queue via sock_queue_err_skb()
function, it is possible that the waiting application is not notified.
Calling 'sk->sk_data_ready()' would not notify applications that
selected only POLLERR events in poll() (for example).
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Randy E. Witt <randy.e.witt@intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nlmsg_multicast() consumes always the skb, thus the original skb must be
freed only when this function is called with a clone.
Fixes: cb9f7a9a5c ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()")
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Free memory by calling put_device(), if afiucv_iucv_init is not
successful.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>