The check in sctp_sockaddr_af is not robust enough to forbid binding a
v4mapped v6 addr on a v4 socket.
The worse thing is that v4 socket's bind_verify would not convert this
v4mapped v6 addr to a v4 addr. syzbot even reported a crash as the v4
socket bound a v6 addr.
This patch is to fix it by doing the common sa.sa_family check first,
then AF_INET check for v4mapped v6 addrs.
Fixes: 7dab83de50 ("sctp: Support ipv6only AF_INET6 sockets.")
Reported-by: syzbot+7b7b518b1228d2743963@syzkaller.appspotmail.com
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit cea0cc80a6 ("sctp: use the right sk after waking up from
wait_buf sleep"), it may change to lock another sk if the asoc has been
peeled off in sctp_wait_for_sndbuf.
However, the asoc's new sk could be already closed elsewhere, as it's in
the sendmsg context of the old sk that can't avoid the new sk's closing.
If the sk's last one refcnt is held by this asoc, later on after putting
this asoc, the new sk will be freed, while under it's own lock.
This patch is to revert that commit, but fix the old issue by returning
error under the old sk's lock.
Fixes: cea0cc80a6 ("sctp: use the right sk after waking up from wait_buf sleep")
Reported-by: syzbot+ac6ea7baa4432811eb50@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After introducing sctp_stream structure, sctp uses stream->outcnt as the
out stream nums instead of c.sinit_num_ostreams.
However when users use sinit in cmsg, it only updates c.sinit_num_ostreams
in sctp_sendmsg. At that moment, stream->outcnt is still using previous
value. If it's value is not updated, the sinit_num_ostreams of sinit could
not really work.
This patch is to fix it by updating stream->outcnt and reiniting stream
if stream outcnt has been change by sinit in sendmsg.
Fixes: a83863174a ("sctp: prepare asoc stream for stream reconf")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to architecture limitations, the IBM VNIC client driver is unable
to perform MAC address changes unless the device has "logged in" to
its backing device. Currently, pending MAC changes are handled before
login, resulting in an error and failure to change the MAC address.
Moving that chunk to the end of the ibmvnic_login function, when we are
sure that it was successful, fixes that.
The MAC address can be changed when the device is up or down, so
only check if the device is in a "PROBED" state before setting the
MAC address.
Fixes: c26eba03e4 ("ibmvnic: Update reset infrastructure to support tunable parameters")
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Reviewed-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NL_SET_ERR_MSG() and NL_SET_ERR_MSG_ATTR() lead to the following warning
in newer versions of gcc:
warning: array initialized from parenthesized string constant
Just remove the parentheses, they're not needed in this context since
anyway since there can be no operator precendence issues or similar.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jim Westfall says:
====================
ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
This used to be the previous behavior in older kernels but became broken in
a263b30936 (ipv4: Make neigh lookups directly in output packet path)
and then later removed because it was broken in 0bb4087cbe (ipv4: Fix neigh
lookup keying over loopback/point-to-point devices)
Not having this results in there being an arp entry for every remote ip
address that the device talks to. Given a fairly active device it can
cause the arp table to become huge and/or having to add/purge large number
of entires to keep within table size thresholds.
$ ip -4 neigh show nud noarp | grep tun | wc -l
55850
$ lnstat -k arp_cache:entries,arp_cache:allocs,arp_cache:destroys -c 10
arp_cach|arp_cach|arp_cach|
entries| allocs|destroys|
81493|620166816|620126069|
101867| 10186| 0|
113854| 5993| 0|
118773| 2459| 0|
27937| 18579| 63998|
39256| 5659| 0|
56231| 8487| 0|
65602| 4685| 0|
79697| 7047| 0|
90733| 5517| 0|
v2:
- fixes coding style issues
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Map all lookup neigh keys to INADDR_ANY for loopback/point-to-point devices
to avoid making an entry for every remote ip the device needs to talk to.
This used the be the old behavior but became broken in a263b30936
(ipv4: Make neigh lookups directly in output packet path) and later removed
in 0bb4087cbe (ipv4: Fix neigh lookup keying over loopback/point-to-point
devices) because it was broken.
Signed-off-by: Jim Westfall <jwestfall@surrealistic.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use n->primary_key instead of pkey to account for the possibility that a neigh
constructor function may have modified the primary_key value.
Signed-off-by: Jim Westfall <jwestfall@surrealistic.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
ARSTR is always located at the start of the TSU register region, thus
using add_reg() instead of add_tsu_reg() in __sh_eth_get_regs() to dump it
causes EDMR or EDSR (depending on the register layout) to be dumped instead
of ARSTR. Use the correct condition/macro there...
Fixes: 6b4b4fead3 ("sh_eth: Implement ethtool register dump operations")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit ceaa001a17.
The OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attr should be designed
as a nested attribute to support all ERSPAN v1 and v2's fields.
The current attr is a be32 supporting only one field. Thus, this
patch reverts it and later patch will redo it using nested attr.
Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Jiri Benc <jbenc@redhat.com>
Cc: Pravin Shelar <pshelar@ovn.org>
Acked-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
sendfile() calls can hang endless with using Kernel TLS if a socket error occurs.
Socket error codes must be inverted by Kernel TLS before returning because
they are stored with positive sign. If returned non-inverted they are
interpreted as number of bytes sent, causing endless looping of the
splice mechanic behind sendfile().
Signed-off-by: Robert Hering <r.hering@avm.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
In my last patch, I missed fact that cork.base.dst was not initialized
in ip6_make_skb() :
If ip6_setup_cork() returns an error, we might attempt a dst_release()
on some random pointer.
Fixes: 862c03ee1d ("ipv6: fix possible mem leaks in ipv6_make_skb()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These fall-through are expected.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 3765d35ed8 ("net: ipv4: Convert inet_rtm_getroute to rcu
versions of route lookup") broke "ip route get" in the presence
of rules that specify iif lo.
Host-originated traffic always has iif lo, because
ip_route_output_key_hash and ip6_route_output_flags set the flow
iif to LOOPBACK_IFINDEX. Thus, putting "iif lo" in an ip rule is a
convenient way to select only originated traffic and not forwarded
traffic.
inet_rtm_getroute used to match these rules correctly because
even though it sets the flow iif to 0, it called
ip_route_output_key which overwrites iif with LOOPBACK_IFINDEX.
But now that it calls ip_route_output_key_hash_rcu, the ifindex
will remain 0 and not match the iif lo in the rule. As a result,
"ip route get" will return ENETUNREACH.
Fixes: 3765d35ed8 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup")
Tested: https://android.googlesource.com/kernel/tests/+/master/net/test/multinetwork_test.py passes again
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzbot triggered the WARN_ON in netlink_ack testing the bad_attr value.
The problem is that netlink_rcv_skb loops over the skb repeatedly invoking
the callback and without resetting the extack leaving potentially stale
data. Initializing each time through avoids the WARN_ON.
Fixes: 2d4bc93368 ("netlink: extended ACK reporting")
Reported-by: syzbot+315fa6766d0f7c359327@syzkaller.appspotmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tipc_node_find_by_name() fails, the nlmsg is not
freed.
While on it, switch to a goto label to properly
free it.
Fixes: be9c086715c ("tipc: narrow down exposure of struct tipc_node")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver lacks a MODULE_LICENSE tag, leading to a Kbuild warning:
WARNING: modpost: missing MODULE_LICENSE() in drivers/net/ethernet/cirrus/cs89x0.o
This adds license, author, and description according to the
comment block at the start of the file.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
ppp_dev_uninit(), which is the .ndo_uninit() handler of PPP devices,
needs to lock pn->all_ppp_mutex. Therefore we mustn't call
register_netdevice() with pn->all_ppp_mutex already locked, or we'd
deadlock in case register_netdevice() fails and calls .ndo_uninit().
Fortunately, we can unlock pn->all_ppp_mutex before calling
register_netdevice(). This lock protects pn->units_idr, which isn't
used in the device registration process.
However, keeping pn->all_ppp_mutex locked during device registration
did ensure that no device in transient state would be published in
pn->units_idr. In practice, unlocking it before calling
register_netdevice() doesn't change this property: ppp_unit_register()
is called with 'ppp_mutex' locked and all searches done in
pn->units_idr hold this lock too.
Fixes: 8cb775bc0a ("ppp: fix device unregistration upon netns deletion")
Reported-and-tested-by: syzbot+367889b9c9e279219175@syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
This explains why is the net usage of __ptr_ring_peek
actually ok without locks.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 9P of Xen module is missing required license and module information.
See https://bugzilla.kernel.org/show_bug.cgi?id=198109
Reported-by: Alan Bartlett <ajb@elrepo.org>
Fixes: 868eb12273 ("xen/9pfs: introduce Xen 9pfs transport driver")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzbot reported a warning from rfkill_alloc(), and after a while
I think that the reason is that it was doing fault injection and
the dev_set_name() failed, leaving the name NULL, and we didn't
check the return value and got to rfkill_alloc() with a NULL name.
Since we really don't want a NULL name, we ought to check the
return value.
Fixes: fb28ad3590 ("net: struct device - replace bus_id with dev_name(), dev_set_name()")
Reported-by: syzbot+1ddfb3357e1d7bb5b5d3@syzkaller.appspotmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When creating a new radio on the fly, hwsim allows this
to be done with an arbitrary number of channels, but
cfg80211 only supports a limited number of simultaneous
channels, leading to a warning.
Fix this by validating the number - this requires moving
the define for the maximum out to a visible header file.
Reported-by: syzbot+8dd9051ff19940290931@syzkaller.appspotmail.com
Fixes: b59ec8dd43 ("mac80211_hwsim: fix number of channels in interface combinations")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When closing multiple wmediumd instances with many radios and try to
unload the mac80211_hwsim module, it may happen that the work items live
longer than the module. To wait especially for this deletion work items,
add a work queue, otherwise flush_scheduled_work would be necessary.
Signed-off-by: Benjamin Beichler <benjamin.beichler@uni-rostock.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
As ieee80211_bss_get_ie() derefences an RCU to return ssid_ie, both
the call to this function and any operation on this variable need
protection by the RCU read lock.
Fixes: 44905265bc ("nl80211: don't expose wdev->ssid for most interfaces")
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Paul reported that he got a report about undefined behaviour
that seems to me to originate in using uninitialized memory
when the channel structure here is used in the event code in
nl80211 later.
He never reported whether this fixed it, and I wasn't able
to trigger this so far, but we should do the right thing and
fully initialize the on-stack structure anyway.
Reported-by: Paul Menzel <pmenzel+linux-wireless@molgen.mpg.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Daniel Borkmann says:
====================
pull-request: bpf 2018-01-13
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Follow-up fix to the recent BPF out-of-bounds speculation
fix that prevents max_entries overflows and an undefined
behavior on 32 bit archs on index_mask calculation, from
Daniel.
2) Reject unsupported BPF_ARSH opcode in 32 bit ALU mode that
was otherwise throwing an unknown opcode warning in the
interpreter, from Daniel.
3) Typo fix in one of the user facing verbose() messages that
was added during the BPF out-of-bounds speculation fix,
from Colin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJaV/8zAAoJEEg/ir3gV/o+Ui0IAJWzQe1HHNJeWUykLNYMbjbQ
vxnnsbYd5Y2j/Q+tJdi6FJxEvC8kF78BJvDp37plV+lqZAXsvaGlLgyWcnY/XkFs
byVjueQVyil/JUyIw5ciK8DqOOI/tdo5v51ZJ05JZYRdyP0E1KVwy9jXHEpVJKmm
QlIfjrzbv5p6ydysKi6Z9NFKJ9CjSIyHh4Ew8VQKuzJ+AtoZ8L6XWGDHvoZl8DAb
jweJwRU9bhBqyoWjRRGtTqfcNmwvhYcJwuzER9vKK4l4JAvzhl8PGzeWsE8wPYBW
uKNRAPET7HQQsZblPU2l66oZVFMXhBsBnuY90jlPlmeHB/KLBLZp/m4EIzHEsoM=
=EiuX
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2018-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-01-11
The following series includes fixes to mlx5 core and netdev driver.
To highlight we have two critical fixes in this series:
1st patch from Eran to address a fix for Host2BMC Breakage.
2nd patch from Saeed to address the RDMA IRQ vector affinity settings query
issue, the patch provides the correct mlx5_core implementation for RDMA to
correctly query vector affinity.
I sent this patch privately to Sagi a week a go, so he could to test it
but I didn't hear from him.
All other patches are trivial misc fixes.
Please pull and let me know if there's any problem.
for -stable v4.14-y and later:
("net/mlx5: Fix get vector affinity helper function")
("{net,ib}/mlx5: Don't disable local loopback multicast traffic when needed")
Note: Merging this series with net-next will produce the following conflict:
<<<<<<< HEAD
u8 disable_local_lb[0x1];
u8 reserved_at_3e2[0x1];
u8 log_min_hairpin_wq_data_sz[0x5];
u8 reserved_at_3e8[0x3];
=======
u8 disable_local_lb_uc[0x1];
u8 disable_local_lb_mc[0x1];
u8 reserved_at_3e3[0x8];
>>>>>>> 359c96447ac2297fabe15ef30b60f3b4b71e7fd0
To resolve, use the following hunk:
i.e:
<<<<<<
u8 disable_local_lb_uc[0x1];
u8 disable_local_lb_mc[0x1];
u8 log_min_hairpin_wq_data_sz[0x5];
u8 reserved_at_3e8[0x3];
>>>>>>
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2018-01-11
1) Don't allow to change the encap type on state updates.
The encap type is set on state initialization and
should not change anymore. From Herbert Xu.
2) Skip dead policies when rehashing to fix a
slab-out-of-bounds bug in xfrm_hash_rebuild.
From Florian Westphal.
3) Two buffer overread fixes in pfkey.
From Eric Biggers.
4) Fix rcu usage in xfrm_get_type_offload,
request_module can sleep, so can't be used
under rcu_read_lock. From Sabrina Dubroca.
5) Fix an uninitialized lock in xfrm_trans_queue.
Use __skb_queue_tail instead of skb_queue_tail
in xfrm_trans_queue as we don't need the lock.
From Herbert Xu.
6) Currently it is possible to create an xfrm state with an
unknown encap type in ESP IPv4. Fix this by returning an
error on unknown encap types. Also from Herbert Xu.
7) Fix sleeping inside a spinlock in xfrm_policy_cache_flush.
From Florian Westphal.
8) Fix ESP GRO when the headers not fully in the linear part
of the skb. We need to pull before we can access them.
9) Fix a skb leak on error in key_notify_policy.
10) Fix a race in the xdst pcpu cache, we need to
run the resolver routines with bottom halfes
off like the old flowcache did.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJaV5ghAAoJEEp/3jgCEfOL/bEH/3UopVYhLpHPqIAGCQuONuvG
4imm/19uVIsbJMCHZ/bbksPPOaswqCKws5ScfIJKzg9vI8PQaQ28BnQh/vELHlmk
lUgdXgVPaCoM/3UlCquxmBn5IooLt0t7LdCvMO2MvF3mf+jThyfwcoc1H2xe1Yh2
k9dPQWcfNBY7jGTBGzqyuNwfg9DZSMyBRx4JmOsqlCI/GDcA8DsLHIykA4wRQRF+
gtSCYeUuZsfukCk4Z1ImoYyfbtK/tYH4s9UOc5pEsd0s0NUs8bn7hgQn2QKWmQt4
HsOzNTiAl/pXSRi8BMqPyFGVdZLT1cRkeo/FW1Lkxg5NqJmv005ebJ1Jx8YHW9A=
=JTlC
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-4.15-rc8' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"Two rbd fixes for 4.12 and 4.2 issues respectively, marked for
stable"
* tag 'ceph-for-4.15-rc8' of git://github.com/ceph/ceph-client:
rbd: set max_segments to USHRT_MAX
rbd: reacquire lock should update lock owner client id
introduced by yours truly.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJaVzO3AAoJEEEQszewGV1zATUQAJKL3aYxRbm+BzszDAivdZ9d
Pb4OEmMPyUMnYyl6o4GvCqJrWhIFCW3F5QY/4gsFt+ZYf2wsVZGvGGca3nYPFYG6
qCp+gIhpY32GlVW9lqVKVEplYkxYyLva+qCKzV+aiMar2kDV8YApEbv59X7R97ir
lsxujweIcYgTQYuEJI/Wr3vwqWGZL/hPVqydfI/gSKVYdqHca2hIvz5mZpYDVM/f
34hgX5V8z0yj8z4rUmgoShAicToK8de3sXChG7uBaFDGeOVX1LMgjUdFucQzHr0U
aPUAWYsrqVi5ZczK5ClhKFCr5y1cfYPv0I9bYsyqHsHFMJq3HMI1eRoFy25POFTi
RnelnThRQKPydPCVIGif3/xfDZOn/IVi9BN6AOxTjnAw/DA/gIavQUDcoh8j4EFK
BePwSiKyJKg4CT3q/beIz2Lxx7JrCzioB3Q1kNGQipIv/X5LeT7lNVEmWP3X1Dif
0od6eeOhFwTSK+sXO0pSHuTB+QhOatLBdoOaHbRnwWqkWMjwVxr2KAFrHzxBueXg
qBGD+GyytZjda9wxBJxWTJNYQA75xBZM1eZUvlv2Fu3hzlDvHUcSVJ8rC82qJ6k7
/xz2rJz86ic+MJ3p3qOdNlUOJ7pEskqiUcttK0/LWXeKhgw3+oTQEG/asGfrPeZg
GAp/6k4XgaXI8iALKsWi
=S8Fj
-----END PGP SIGNATURE-----
Merge tag 'gpio-v4.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fix from Linus Walleij:
"Fix a raw vs elaborate GPIO descriptor bug introduced by yours truly"
* tag 'gpio-v4.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: Add missing open drain/source handling to gpiod_set_value_cansleep()
To avoid configuration override, timestamp set call will
be moved from the netdevice open flow to the init flow.
By this, a close-open procedure will not override the timestamp
configuration.
In addition, the change will rename mlx5e_timestamp_set function
to be mlx5e_timestamp_init.
Fixes: ef9814deaf ("net/mlx5e: Add HW timestamping (TS) support")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
PPS event did not update ptp_clock_event fields, therefore,
timestamp value was not updated correctly. This fix updates the
event source and the timestamp value for each PPS event.
Fixes: 7c39afb394 ("net/mlx5: PTP code migration to driver core section")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Set features function sets dev->features in order to keep track of which
features were successfully changed and which weren't (in case the user
asks for more than one change in a single command).
This breaks the logic in __netdev_update_features which assumes that
dev->features is not changed on success and checks for diffs between
features and dev->features (diffs that might not exist at this point
because of the driver override).
The solution is to keep track of successful/failed feature changes and
assign them to dev->features in case of failure only.
Fixes: 0e405443e8 ("net/mlx5e: Improve set features ndo resiliency")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Should not do the following swap between TCs 0 and 1
when max num of TCs is 1:
tclass[prio=0]=1, tclass[prio=1]=0, tclass[prio=i]=i (for i>1)
Fixes: 08fb1dacdd ("net/mlx5e: Support DCBNL IEEE ETS")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
ethtool statistics should be updated even when the interface is down
since it shows more than just netdev counters, which might change while
the logical link is down.
One useful use case, for example, is when running RoCE traffic over the
interface (while the logical link is down, but physical link is up) and
examining rx_prioX_bytes.
Fixes: f62b8bb8f2 ("net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We didn't store the result of mlx5_init_once, due to that
mlx5_load_one returned success on error. Fix that.
Fixes: 59211bd3b6 ("net/mlx5: Split the load/unload flow into hardware and software flows")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Change mlx5_get_uars_page to return ERR_PTR in case of
allocation failure. Change all callers accordingly to
check the IS_ERR(ptr) instead of NULL.
Fixes: 59211bd3b6 ("net/mlx5: Split the load/unload flow into hardware and software flows")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Fix a memory leak where in case that pci_alloc_irq_vectors failed,
priv->irq_info was not released.
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
mlx5_get_vector_affinity used to call pci_irq_get_affinity and after
reverting the patch that sets the device affinity via PCI_IRQ_AFFINITY
API, calling pci_irq_get_affinity becomes useless and it breaks RDMA
mlx5 users. To fix this, this patch provides an alternative way to
retrieve IRQ vector affinity using legacy IRQ API, following
smp_affinity read procfs implementation.
Fixes: 231243c827 ("Revert mlx5: move affinity hints assignments to generic code")
Fixes: a435393aca ("mlx5: move affinity hints assignments to generic code")
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
There are systems platform information management interfaces (such as
HOST2BMC) for which we cannot disable local loopback multicast traffic.
Separate disable_local_lb_mc and disable_local_lb_uc capability bits so
driver will not disable multicast loopback traffic if not supported.
(It is expected that Firmware will not set disable_local_lb_mc if
HOST2BMC is running for example.)
Function mlx5_nic_vport_update_local_lb will do best effort to
disable/enable UC/MC loopback traffic and return success only in case it
succeeded to changed all allowed by Firmware.
Adapt mlx5_ib and mlx5e to support the new cap bits.
Fixes: 2c43c5a036 ("net/mlx5e: Enable local loopback in loopback selftest")
Fixes: c85023e153 ("IB/mlx5: Add raw ethernet local loopback support")
Fixes: bded747bb4 ("net/mlx5: Add raw ethernet local loopback firmware command")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Pull vfs regression fix from Al Viro/
Fix a leak in socket() introduced by commit 8e1611e235 ("make
sock_alloc_file() do sock_release() on failures").
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Fix a leak in socket(2) when we fail to allocate a file descriptor.
Pull networking fixes from David Miller:
1) BPF speculation prevention and BPF_JIT_ALWAYS_ON, from Alexei
Starovoitov.
2) Revert dev_get_random_name() changes as adjust the error code
returns seen by userspace definitely breaks stuff.
3) Fix TX DMA map/unmap on older iwlwifi devices, from Emmanuel
Grumbach.
4) From wrong AF family when requesting sock diag modules, from Andrii
Vladyka.
5) Don't add new ipv6 routes attached to the null_entry, from Wei Wang.
6) Some SCTP sockopt length fixes from Marcelo Ricardo Leitner.
7) Don't leak when removing VLAN ID 0, from Cong Wang.
8) Hey there's a potential leak in ipv6_make_skb() too, from Eric
Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
ipv6: sr: fix TLVs not being copied using setsockopt
ipv6: fix possible mem leaks in ipv6_make_skb()
mlxsw: spectrum_qdisc: Don't use variable array in mlxsw_sp_tclass_congestion_enable
mlxsw: pci: Wait after reset before accessing HW
nfp: always unmask aux interrupts at init
8021q: fix a memory leak for VLAN 0 device
of_mdio: avoid MDIO bus removal when a PHY is missing
caif_usb: use strlcpy() instead of strncpy()
doc: clarification about setting SO_ZEROCOPY
net: gianfar_ptp: move set_fipers() to spinlock protecting area
sctp: make use of pre-calculated len
sctp: add a ceiling to optlen in some sockopts
sctp: GFP_ATOMIC is not needed in sctp_setsockopt_events
bpf: introduce BPF_JIT_ALWAYS_ON config
bpf: avoid false sharing of map refcount with max_entries
ipv6: remove null_entry before adding default route
SolutionEngine771x: add Ether TSU resource
SolutionEngine771x: fix Ether platform data
docs-rst: networking: wire up msg_zerocopy
net: ipv4: emulate READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()
...
Got broken by "make sock_alloc_file() do sock_release() on failures" -
cleanup after sock_map_fd() failure got pulled all the way into
sock_alloc_file(), but it used to serve the case when sock_map_fd()
failed *before* getting to sock_alloc_file() as well, and that got
lost. Trivial to fix, fortunately.
Fixes: 8e1611e235 (make sock_alloc_file() do sock_release() on failures)
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
syzkaller tried to alloc a map with 0xfffffffd entries out of a userns,
and thus unprivileged. With the recently added logic in b2157399cc
("bpf: prevent out-of-bounds speculation") we round this up to the next
power of two value for max_entries for unprivileged such that we can
apply proper masking into potentially zeroed out map slots.
However, this will generate an index_mask of 0xffffffff, and therefore
a + 1 will let this overflow into new max_entries of 0. This will pass
allocation, etc, and later on map access we still enforce on the original
attr->max_entries value which was 0xfffffffd, therefore triggering GPF
all over the place. Thus bail out on overflow in such case.
Moreover, on 32 bit archs roundup_pow_of_two() can also not be used,
since fls_long(max_entries - 1) can result in 32 and 1UL << 32 in 32 bit
space is undefined. Therefore, do this by hand in a 64 bit variable.
This fixes all the issues triggered by syzkaller's reproducers.
Fixes: b2157399cc ("bpf: prevent out-of-bounds speculation")
Reported-by: syzbot+b0efb8e572d01bce1ae0@syzkaller.appspotmail.com
Reported-by: syzbot+6c15e9744f75f2364773@syzkaller.appspotmail.com
Reported-by: syzbot+d2f5524fb46fd3b312ee@syzkaller.appspotmail.com
Reported-by: syzbot+61d23c95395cc90dbc2b@syzkaller.appspotmail.com
Reported-by: syzbot+0d363c942452cca68c01@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The following snippet was throwing an 'unknown opcode cc' warning
in BPF interpreter:
0: (18) r0 = 0x0
2: (7b) *(u64 *)(r10 -16) = r0
3: (cc) (u32) r0 s>>= (u32) r0
4: (95) exit
Although a number of JITs do support BPF_ALU | BPF_ARSH | BPF_{K,X}
generation, not all of them do and interpreter does neither. We can
leave existing ones and implement it later in bpf-next for the
remaining ones, but reject this properly in verifier for the time
being.
Fixes: 17a5267067 ("bpf: verifier (add verifier core)")
Reported-by: syzbot+93c4904c5c70348a6890@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Trivial fix to spelling mistake in error message text.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Function ipv6_push_rthdr4 allows to add an IPv6 Segment Routing Header
to a socket through setsockopt, but the current implementation doesn't
copy possible TLVs at the end of the SRH received from userspace.
Therefore, the execution of the following branch if (sr_has_hmac(sr_phdr))
{ ... } will never complete since the len and type fields of a possible
HMAC TLV are not copied, hence seg6_get_tlv_hmac will return an error,
and the HMAC will not be computed.
This commit adds a memcpy in case TLVs have been appended to the SRH.
Fixes: a149e7c7ce ("ipv6: sr: add support for SRH injection through setsockopt")
Acked-by: David Lebrun <dlebrun@google.com>
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>