Commit Graph

737166 Commits

Author SHA1 Message Date
Prashant Sreedharan
506b0a395f tg3: APE heartbeat changes
In ungraceful host shutdown or driver crash case BMC connectivity is
lost. APE firmware is missing the driver state in this
case to keep the BMC connectivity alive.
This patch has below change to address this issue.

Heartbeat mechanism with APE firmware. This heartbeat mechanism
is needed to notify the APE firmware about driver state.

This patch also has the change in wait time for APE event from
1ms to 20ms as there can be some delay in getting response.

v2: Drop inline keyword as per David suggestion.

Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Satish Baddipadige <satish.baddipadige@broadcom.com>
Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 14:16:52 -05:00
Ido Schimmel
d1c95af366 mlxsw: spectrum_router: Do not unconditionally clear route offload indication
When mlxsw replaces (or deletes) a route it removes the offload
indication from the replaced route. This is problematic for IPv4 routes,
as the offload indication is stored in the fib_info which is usually
shared between multiple routes.

Instead of unconditionally clearing the offload indication, only clear
it if no other route is using the fib_info.

Fixes: 3984d1a89f ("mlxsw: spectrum_router: Provide offload indication using nexthop flags")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
Tested-by: Alexander Petrovskiy <alexpe@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 11:21:08 -05:00
David S. Miller
cae69256fe Merge branch 'qualcomm-rmnet-Fix-issues-with-CONFIG_DEBUG_PREEMPT-enabled'
Subash Abhinov Kasiviswanathan says:

====================
net: qualcomm: rmnet: Fix issues with CONFIG_DEBUG_PREEMPT enabled

Patch 1 and 2 fixes issues identified when CONFIG_DEBUG_PREEMPT was
enabled. These involve APIs which were called in invalid contexts.

Patch 3 is a null derefence fix identified by code inspection.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 11:17:34 -05:00
Subash Abhinov Kasiviswanathan
f57bbaae72 net: qualcomm: rmnet: Fix possible null dereference in command processing
If a command packet with invalid mux id is received, the packet would
not have a valid endpoint. This invalid endpoint maybe dereferenced
leading to a crash. Identified by manual code inspection.

Fixes: 3352e6c457 ("net: qualcomm: rmnet: Convert the muxed endpoint to hlist")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 11:17:34 -05:00
Subash Abhinov Kasiviswanathan
4dba8bbce9 net: qualcomm: rmnet: Fix warning seen with 64 bit stats
With CONFIG_DEBUG_PREEMPT enabled, a warning was seen on device
creation. This occurs due to the incorrect cpu API usage in
ndo_get_stats64 handler.

BUG: using smp_processor_id() in preemptible [00000000] code: rmnetcli/5743
caller is debug_smp_processor_id+0x1c/0x24
Call trace:
[<ffffff9d48c8967c>] dump_backtrace+0x0/0x2a8
[<ffffff9d48c89bbc>] show_stack+0x20/0x28
[<ffffff9d4901fff8>] dump_stack+0xa8/0xe0
[<ffffff9d490421e0>] check_preemption_disabled+0x104/0x108
[<ffffff9d49042200>] debug_smp_processor_id+0x1c/0x24
[<ffffff9d494a36b0>] rmnet_get_stats64+0x64/0x13c
[<ffffff9d49b014e0>] dev_get_stats+0x68/0xd8
[<ffffff9d49d58df8>] rtnl_fill_stats+0x54/0x140
[<ffffff9d49b1f0b8>] rtnl_fill_ifinfo+0x428/0x9cc
[<ffffff9d49b23834>] rtmsg_ifinfo_build_skb+0x80/0xf4
[<ffffff9d49b23930>] rtnetlink_event+0x88/0xb4
[<ffffff9d48cd21b4>] raw_notifier_call_chain+0x58/0x78
[<ffffff9d49b028a4>] call_netdevice_notifiers_info+0x48/0x78
[<ffffff9d49b08bf8>] __netdev_upper_dev_link+0x290/0x5e8
[<ffffff9d49b08fcc>] netdev_master_upper_dev_link+0x3c/0x48
[<ffffff9d494a2e74>] rmnet_newlink+0xf0/0x1c8
[<ffffff9d49b23360>] rtnl_newlink+0x57c/0x6c8
[<ffffff9d49b2355c>] rtnetlink_rcv_msg+0xb0/0x244
[<ffffff9d49b5230c>] netlink_rcv_skb+0xb4/0xdc
[<ffffff9d49b204f4>] rtnetlink_rcv+0x34/0x44
[<ffffff9d49b51af0>] netlink_unicast+0x1ec/0x294
[<ffffff9d49b51fdc>] netlink_sendmsg+0x320/0x390
[<ffffff9d49ae6858>] sock_sendmsg+0x54/0x60
[<ffffff9d49ae91bc>] SyS_sendto+0x1a0/0x1e4
[<ffffff9d48c83770>] el0_svc_naked+0x24/0x28

Fixes: 192c4b5d48 ("net: qualcomm: rmnet: Add support for 64 bit stats")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 11:17:34 -05:00
Subash Abhinov Kasiviswanathan
b37f78f234 net: qualcomm: rmnet: Fix crash on real dev unregistration
With CONFIG_DEBUG_PREEMPT enabled, a crash with the following call
stack was observed when removing a real dev which had rmnet devices
attached to it.
To fix this, remove the netdev_upper link APIs and instead use the
existing information in rmnet_port and rmnet_priv to get the
association between real and rmnet devs.

BUG: sleeping function called from invalid context
in_atomic(): 0, irqs_disabled(): 0, pid: 5762, name: ip
Preemption disabled at:
[<ffffff9d49043564>] debug_object_active_state+0xa4/0x16c
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
PC is at ___might_sleep+0x13c/0x180
LR is at ___might_sleep+0x17c/0x180
[<ffffff9d48ce0924>] ___might_sleep+0x13c/0x180
[<ffffff9d48ce09c0>] __might_sleep+0x58/0x8c
[<ffffff9d49d6253c>] mutex_lock+0x2c/0x48
[<ffffff9d48ed4840>] kernfs_remove_by_name_ns+0x48/0xa8
[<ffffff9d48ed6ec8>] sysfs_remove_link+0x30/0x58
[<ffffff9d49b05840>] __netdev_adjacent_dev_remove+0x14c/0x1e0
[<ffffff9d49b05914>] __netdev_adjacent_dev_unlink_lists+0x40/0x68
[<ffffff9d49b08820>] netdev_upper_dev_unlink+0xb4/0x1fc
[<ffffff9d494a29f0>] rmnet_dev_walk_unreg+0x6c/0xc8
[<ffffff9d49b00b40>] netdev_walk_all_lower_dev_rcu+0x58/0xb4
[<ffffff9d494a30fc>] rmnet_config_notify_cb+0xf4/0x134
[<ffffff9d48cd21b4>] raw_notifier_call_chain+0x58/0x78
[<ffffff9d49b028a4>] call_netdevice_notifiers_info+0x48/0x78
[<ffffff9d49b0b568>] rollback_registered_many+0x230/0x3c8
[<ffffff9d49b0b738>] unregister_netdevice_many+0x38/0x94
[<ffffff9d49b1e110>] rtnl_delete_link+0x58/0x88
[<ffffff9d49b201dc>] rtnl_dellink+0xbc/0x1cc
[<ffffff9d49b2355c>] rtnetlink_rcv_msg+0xb0/0x244
[<ffffff9d49b5230c>] netlink_rcv_skb+0xb4/0xdc
[<ffffff9d49b204f4>] rtnetlink_rcv+0x34/0x44
[<ffffff9d49b51af0>] netlink_unicast+0x1ec/0x294
[<ffffff9d49b51fdc>] netlink_sendmsg+0x320/0x390
[<ffffff9d49ae6858>] sock_sendmsg+0x54/0x60
[<ffffff9d49ae6f94>] ___sys_sendmsg+0x298/0x2b0
[<ffffff9d49ae98f8>] SyS_sendmsg+0xb4/0xf0
[<ffffff9d48c83770>] el0_svc_naked+0x24/0x28

Fixes: ceed73a2cf ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Fixes: 60d58f971c ("net: qualcomm: rmnet: Implement bridge mode")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-19 11:17:33 -05:00
Xin Long
9ab2323ca1 sctp: remove the left unnecessary check for chunk in sctp_renege_events
Commit fb23403536 ("sctp: remove the useless check in
sctp_renege_events") forgot to remove another check for
chunk in sctp_renege_events.

Dan found this when doing a static check.

This patch is to remove that check, and also to merge
two checks into one 'if statement'.

Fixes: fb23403536 ("sctp: remove the useless check in sctp_renege_events")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 16:32:37 -05:00
David Howells
a16b8d0cf2 rxrpc: Work around usercopy check
Due to a check recently added to copy_to_user(), it's now not permitted to
copy from slab-held data to userspace unless the slab is whitelisted.  This
affects rxrpc_recvmsg() when it attempts to place an RXRPC_USER_CALL_ID
control message in the userspace control message buffer.  A warning is
generated by usercopy_warn() because the source is the copy of the
user_call_ID retained in the rxrpc_call struct.

Work around the issue by copying the user_call_ID to a variable on the
stack and passing that to put_cmsg().

The warning generated looks like:

	Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLUB object 'dmaengine-unmap-128' (offset 680, size 8)!
	WARNING: CPU: 0 PID: 1401 at mm/usercopy.c:81 usercopy_warn+0x7e/0xa0
	...
	RIP: 0010:usercopy_warn+0x7e/0xa0
	...
	Call Trace:
	 __check_object_size+0x9c/0x1a0
	 put_cmsg+0x98/0x120
	 rxrpc_recvmsg+0x6fc/0x1010 [rxrpc]
	 ? finish_wait+0x80/0x80
	 ___sys_recvmsg+0xf8/0x240
	 ? __clear_rsb+0x25/0x3d
	 ? __clear_rsb+0x15/0x3d
	 ? __clear_rsb+0x25/0x3d
	 ? __clear_rsb+0x15/0x3d
	 ? __clear_rsb+0x25/0x3d
	 ? __clear_rsb+0x15/0x3d
	 ? __clear_rsb+0x25/0x3d
	 ? __clear_rsb+0x15/0x3d
	 ? finish_task_switch+0xa6/0x2b0
	 ? trace_hardirqs_on_caller+0xed/0x180
	 ? _raw_spin_unlock_irq+0x29/0x40
	 ? __sys_recvmsg+0x4e/0x90
	 __sys_recvmsg+0x4e/0x90
	 do_syscall_64+0x7a/0x220
	 entry_SYSCALL_64_after_hwframe+0x26/0x9b

Reported-by: Jonathan Billings <jsbillings@jsbillings.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Tested-by: Jonathan Billings <jsbillings@jsbillings.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 16:22:27 -05:00
Eric Dumazet
43a08e0f58 tun: fix tun_napi_alloc_frags() frag allocator
<Mark Rutland reported>
    While fuzzing arm64 v4.16-rc1 with Syzkaller, I've been hitting a
    misaligned atomic in __skb_clone:

        atomic_inc(&(skb_shinfo(skb)->dataref));

   where dataref doesn't have the required natural alignment, and the
   atomic operation faults. e.g. i often see it aligned to a single
   byte boundary rather than a four byte boundary.

   AFAICT, the skb_shared_info is misaligned at the instant it's
   allocated in __napi_alloc_skb()  __napi_alloc_skb()
</end of report>

Problem is caused by tun_napi_alloc_frags() using
napi_alloc_frag() with user provided seg sizes,
leading to other users of this API getting unaligned
page fragments.

Since we would like to not necessarily add paddings or alignments to
the frags that tun_napi_alloc_frags() attaches to the skb, switch to
another page frag allocator.

As a bonus skb_page_frag_refill() can use GFP_KERNEL allocations,
meaning that we can not deplete memory reserves as easily.

Fixes: 90e33d4594 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 16:20:46 -05:00
Alexey Kodanev
15f35d49c9 udplite: fix partial checksum initialization
Since UDP-Lite is always using checksum, the following path is
triggered when calculating pseudo header for it:

  udp4_csum_init() or udp6_csum_init()
    skb_checksum_init_zero_check()
      __skb_checksum_validate_complete()

The problem can appear if skb->len is less than CHECKSUM_BREAK. In
this particular case __skb_checksum_validate_complete() also invokes
__skb_checksum_complete(skb). If UDP-Lite is using partial checksum
that covers only part of a packet, the function will return bad
checksum and the packet will be dropped.

It can be fixed if we skip skb_checksum_init_zero_check() and only
set the required pseudo header checksum for UDP-Lite with partial
checksum before udp4_csum_init()/udp6_csum_init() functions return.

Fixes: ed70fcfcee ("net: Call skb_checksum_init in IPv4")
Fixes: e4f45b7f40 ("net: Call skb_checksum_init in IPv6")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 15:57:42 -05:00
David S. Miller
da27988766 skbuff: Fix comment mis-spelling.
'peform' --> 'perform'

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 15:52:42 -05:00
Paolo Abeni
dfec091439 dn_getsockoptdecnet: move nf_{get/set}sockopt outside sock lock
After commit 3f34cfae12 ("netfilter: on sockopt() acquire sock lock
only in the required scope"), the caller of nf_{get/set}sockopt() must
not hold any lock, but, in such changeset, I forgot to cope with DECnet.

This commit addresses the issue moving the nf call outside the lock,
in the dn_{get,set}sockopt() with the same schema currently used by
ipv4 and ipv6. Also moves the unhandled sockopts of the end of the main
switch statements, to improve code readability.

Reported-by: Petr Vandrovec <petr@vandrovec.name>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=198791#c2
Fixes: 3f34cfae12 ("netfilter: on sockopt() acquire sock lock only in the required scope")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 15:46:15 -05:00
Casey Leedom
7dcf688d4c PCI/cxgb4: Extend T3 PCI quirk to T4+ devices
We've run into a problem where our device is attached
to a Virtual Machine and the use of the new pci_set_vpd_size()
API doesn't help.  The VM kernel has been informed that
the accesses are okay, but all of the actual VPD Capability
Accesses are trapped down into the KVM Hypervisor where it
goes ahead and imposes the silent denials.

The right idea is to follow the kernel.org
commit 1c7de2b4ff ("PCI: Enable access to non-standard VPD for
Chelsio devices (cxgb3)") which Alexey Kardashevskiy authored
to establish a PCI Quirk for our T3-based adapters. This commit
extends that PCI Quirk to cover Chelsio T4 devices and later.

The advantage of this approach is that the VPD Size gets set early
in the Base OS/Hypervisor Boot and doesn't require that the cxgb4
driver even be available in the Base OS/Hypervisor.  Thus PF4 can
be exported to a Virtual Machine and everything should work.

Fixes: 67e658794c ("cxgb4: Set VPD size so we can read both VPD structures")
Cc: <stable@vger.kernel.org>  # v4.9+
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 15:41:53 -05:00
Rahul Lakkireddy
e6f02a4d57 cxgb4: fix trailing zero in CIM LA dump
Set correct size of the CIM LA dump for T6.

Fixes: 27887bc7cb ("cxgb4: collect hardware LA dumps")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 15:30:36 -05:00
Ganesh Goudar
c4e43e14cd cxgb4: free up resources of pf 0-3
free pf 0-3 resources, commit baf5086840 ("cxgb4:
restructure VF mgmt code") erroneously removed the
code which frees the pf 0-3 resources, causing the
probe of pf 0-3 to fail in case of driver reload.

Fixes: baf5086840 ("cxgb4: restructure VF mgmt code")
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 15:29:52 -05:00
Stefano Brivio
a8c6db1dfd fib_semantics: Don't match route with mismatching tclassid
In fib_nh_match(), if output interface or gateway are passed in
the FIB configuration, we don't have to check next hops of
multipath routes to conclude whether we have a match or not.

However, we might still have routes with different realms
matching the same output interface and gateway configuration,
and this needs to cause the match to fail. Otherwise the first
route inserted in the FIB will match, regardless of the realms:

 # ip route add 1.1.1.1 dev eth0 table 1234 realms 1/2
 # ip route append 1.1.1.1 dev eth0 table 1234 realms 3/4
 # ip route list table 1234
 1.1.1.1 dev eth0 scope link realms 1/2
 1.1.1.1 dev eth0 scope link realms 3/4
 # ip route del 1.1.1.1 dev ens3 table 1234 realms 3/4
 # ip route list table 1234
 1.1.1.1 dev ens3 scope link realms 3/4

whereas route with realms 3/4 should have been deleted instead.

Explicitly check for fc_flow passed in the FIB configuration
(this comes from RTA_FLOW extracted by rtm_to_fib_config()) and
fail matching if it differs from nh_tclassid.

The handling of RTA_FLOW for multipath routes later in
fib_nh_match() is still needed, as we can have multiple RTA_FLOW
attributes that need to be matched against the tclassid of each
next hop.

v2: Check that fc_flow is set before discarding the match, so
    that the user can still select the first matching rule by
    not specifying any realm, as suggested by David Ahern.

Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 15:19:54 -05:00
Kees Cook
fe9c842695 NFC: llcp: Limit size of SDP URI
The tlv_len is u8, so we need to limit the size of the SDP URI. Enforce
this both in the NLA policy and in the code that performs the allocation
and copy, to avoid writing past the end of the allocated buffer.

Fixes: d9b8d8e19b ("NFC: llcp: Service Name Lookup netlink interface")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 15:16:05 -05:00
Boris Pismenny
c410c1966f tls: getsockopt return record sequence number
Return the TLS record sequence number in getsockopt.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:05:19 -05:00
Boris Pismenny
257082e6ae tls: reset the crypto info if copy_from_user fails
copy_from_user could copy some partial information, as a result
TLS_CRYPTO_INFO_READY(crypto_info) could be true while crypto_info is
using uninitialzed data.

This patch resets crypto_info when copy_from_user fails.

fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:05:19 -05:00
Boris Pismenny
a1dfa6812b tls: retrun the correct IV in getsockopt
Current code returns four bytes of salt followed by four bytes of IV.
This patch returns all eight bytes of IV.

fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 15:05:19 -05:00
David S. Miller
8ace02073e Merge branch 'net-segmentation-offload-doc-fixes'
Daniel Axtens says:

====================
Updates to segmentation-offloads.txt

I've been trying to wrap my head around GSO for a while now. This is a
set of small changes to the docs that would probably have been helpful
when I was starting out.

I realise that GSO_DODGY is still a notable omission - I'm hesitant to
write too much on it just yet as I don't understand it well and I
think it's in the process of changing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:52:39 -05:00
Daniel Axtens
a677088922 docs: segmentation-offloads.txt: add SCTP info
Most of this is extracted from 90017accff ("sctp: Add GSO support"),
with some extra text about GSO_BY_FRAGS and the need to check for it.

Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:52:39 -05:00
Daniel Axtens
bc3c2431d4 docs: segmentation-offloads.txt: Fix ref to SKB_GSO_TUNNEL_REMCSUM
The doc originally called it SKB_GSO_REMCSUM. Fix it.

Fixes: f7a6272bf3 ("Documentation: Add documentation for TSO and GSO features")
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:52:39 -05:00
Daniel Axtens
a65820e695 docs: segmentation-offloads.txt: update for UFO depreciation
UFO is deprecated except for tuntap and packet per 0c19f846d5,
("net: accept UFO datagrams from tuntap and packet"). Update UFO
docs to reflect this.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:52:38 -05:00
David S. Miller
080fe7aa18 Merge branch 'tipc-locking-fixes'
Ying Xue says:

====================
tipc: Fix missing RTNL lock protection during setting link properties

At present it's unsafe to configure link properties through netlink
as the entire setting process is not under RTNL lock protection. Now
TIPC supports two different sets of netlink APIs at the same time, and
they share the same set of backend functions to configure bearer,
media and net properties. In order to solve the missing RTNL issue,
we have to make the whole __tipc_nl_compat_doit() protected by RTNL,
which means any function called within it cannot take RTNL any more.
So in the series we first introduce the following new functions which
doesn't hold RTNl lock:

 - __tipc_nl_bearer_disable()
 - __tipc_nl_bearer_enable()
 - __tipc_nl_bearer_set()
 - __tipc_nl_media_set()
 - __tipc_nl_net_set()

Meanwhile, __tipc_nl_compat_doit() has been reconstructed to minimize
the time of holding RTNL lock.

Changes in v4:
 - Per suggestion of Kirill Tkhai, divided original big one patch into
   seven small ones so that they can be easily reviewed.

Changes in v3:
 - Optimized return method of __tipc_nl_bearer_enable() regarding
   the comments from David M and Kirill Tkhai
 - Moved the allocations of memory in __tipc_nl_compat_doit() out
   of RTNL lock to minimize the time of holding RTNL lock according
   to the suggestion of Kirill Tkhai.

Changes in v2:
 - The whole operation of setting bearer/media properties has been
   protected under RTNL, as per feedback from David M.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:46:33 -05:00
Ying Xue
ed4ffdfec2 tipc: Fix missing RTNL lock protection during setting link properties
Currently when user changes link properties, TIPC first checks if
user's command message contains media name or bearer name through
tipc_media_find() or tipc_bearer_find() which is protected by RTNL
lock. But when tipc_nl_compat_link_set() conducts the checking with
the two functions, it doesn't hold RTNL lock at all, as a result,
the following complaints were reported:

audit: type=1400 audit(1514679888.244:9): avc:  denied  { write } for
pid=3194 comm="syzkaller021477" path="socket:[11143]" dev="sockfs"
ino=11143 scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
tclass=netlink_generic_socket permissive=1
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>

=============================
WARNING: suspicious RCU usage
4.15.0-rc5+ #152 Not tainted
-----------------------------
net/tipc/bearer.c:177 suspicious rcu_dereference_protected() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
2 locks held by syzkaller021477/3194:
  #0:  (cb_lock){++++}, at: [<00000000d20133ea>] genl_rcv+0x19/0x40
net/netlink/genetlink.c:634
  #1:  (genl_mutex){+.+.}, at: [<00000000fcc5d1bc>] genl_lock
net/netlink/genetlink.c:33 [inline]
  #1:  (genl_mutex){+.+.}, at: [<00000000fcc5d1bc>] genl_rcv_msg+0x115/0x140
net/netlink/genetlink.c:622

stack backtrace:
CPU: 1 PID: 3194 Comm: syzkaller021477 Not tainted 4.15.0-rc5+ #152
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:53
  lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585
  tipc_bearer_find+0x2b4/0x3b0 net/tipc/bearer.c:177
  tipc_nl_compat_link_set+0x329/0x9f0 net/tipc/netlink_compat.c:729
  __tipc_nl_compat_doit net/tipc/netlink_compat.c:288 [inline]
  tipc_nl_compat_doit+0x15b/0x660 net/tipc/netlink_compat.c:335
  tipc_nl_compat_handle net/tipc/netlink_compat.c:1119 [inline]
  tipc_nl_compat_recv+0x112f/0x18f0 net/tipc/netlink_compat.c:1201
  genl_family_rcv_msg+0x7b7/0xfb0 net/netlink/genetlink.c:599
  genl_rcv_msg+0xb2/0x140 net/netlink/genetlink.c:624
  netlink_rcv_skb+0x21e/0x460 net/netlink/af_netlink.c:2408
  genl_rcv+0x28/0x40 net/netlink/genetlink.c:635
  netlink_unicast_kernel net/netlink/af_netlink.c:1275 [inline]
  netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1301
  netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1864
  sock_sendmsg_nosec net/socket.c:636 [inline]
  sock_sendmsg+0xca/0x110 net/socket.c:646
  sock_write_iter+0x31a/0x5d0 net/socket.c:915
  call_write_iter include/linux/fs.h:1772 [inline]
  new_sync_write fs/read_write.c:469 [inline]
  __vfs_write+0x684/0x970 fs/read_write.c:482
  vfs_write+0x189/0x510 fs/read_write.c:544
  SYSC_write fs/read_write.c:589 [inline]
  SyS_write+0xef/0x220 fs/read_write.c:581
  do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
  do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
  entry_SYSENTER_compat+0x54/0x63 arch/x86/entry/entry_64_compat.S:129

In order to correct the mistake, __tipc_nl_compat_doit() has been
protected by RTNL lock, which means the whole operation of setting
bearer/media properties is under RTNL protection.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reported-by: syzbot <syzbot+6345fd433db009b29413@syzkaller.appspotmail.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:46:33 -05:00
Ying Xue
5631f65dec tipc: Introduce __tipc_nl_net_set
Introduce __tipc_nl_net_set() which doesn't hold RTNL lock.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:46:33 -05:00
Ying Xue
07ffb22357 tipc: Introduce __tipc_nl_media_set
Introduce __tipc_nl_media_set() which doesn't hold RTNL lock.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:46:32 -05:00
Ying Xue
93532bb1d4 tipc: Introduce __tipc_nl_bearer_set
Introduce __tipc_nl_bearer_set() which doesn't holding RTNL lock.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:46:32 -05:00
Ying Xue
45cf7edfbc tipc: Introduce __tipc_nl_bearer_enable
Introduce __tipc_nl_bearer_enable() which doesn't hold RTNL lock.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:46:32 -05:00
Ying Xue
d59d8b77ab tipc: Introduce __tipc_nl_bearer_disable
Introduce __tipc_nl_bearer_disable() which doesn't hold RTNL lock.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:46:32 -05:00
Ying Xue
e5d1a1eec0 tipc: Refactor __tipc_nl_compat_doit
As preparation for adding RTNL to make (*cmd->transcode)() and
(*cmd->transcode)() constantly protected by RTNL lock, we move out of
memory allocations existing between them as many as possible so that
the time of holding RTNL can be minimized in __tipc_nl_compat_doit().

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:46:32 -05:00
David S. Miller
361b123180 Merge branch 'ibmvnic-leaks'
Thomas Falcon says:

====================
ibmvnic: Fix memory leaks in the driver

This patch set is pretty self-explanatory. It includes
a number of patches that fix memory leaks found with
kmemleak in the ibmvnic driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:39:11 -05:00
Thomas Falcon
d0869c0071 ibmvnic: Clean RX pool buffers during device close
During device close or reset, there were some cases of outstanding
RX socket buffers not being freed. Include a function similar to the
one that already exists to clean TX socket buffers in this case.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:39:10 -05:00
Thomas Falcon
4b9b0f0135 ibmvnic: Free RX socket buffer in case of adapter error
If a RX buffer is returned to the client driver with an error, free the
corresponding socket buffer before continuing.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:39:10 -05:00
Thomas Falcon
6e4842ddfc ibmvnic: Fix NAPI structures memory leak
This memory is allocated during initialization but never freed,
so do that now.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:39:10 -05:00
Thomas Falcon
34f0f4e3f4 ibmvnic: Fix login buffer memory leaks
During device bringup, the driver exchanges login buffers with
firmware. These buffers contain information such number of TX
and RX queues alloted to the device, RX buffer size, etc. These
buffers weren't being properly freed on device reset or close.

We can free the buffer we send to firmware as soon as we get
a response. There is information in the response buffer that
the driver needs for normal operation so retain it until the
next reset or removal.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:39:09 -05:00
Thomas Falcon
cc85c02edf ibmvnic: Wait until reset is complete to set carrier on
Pushes back setting the carrier on until the end of the reset
code. This resolves a bug where a watchdog timer was detecting
that a TX queue had stalled before the adapter reset was complete.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:31:34 -05:00
Jesper Dangaard Brouer
e6dbe9397e Revert "net: thunderx: Add support for xdp redirect"
This reverts commit aa136d0c82.

As I previously[1] pointed out this implementation of XDP_REDIRECT is
wrong.  XDP_REDIRECT is a facility that must work between different
NIC drivers.  Another NIC driver can call ndo_xdp_xmit/nicvf_xdp_xmit,
but your driver patch assumes payload data (at top of page) will
contain a queue index and a DMA addr, this is not true and worse will
likely contain garbage.

Given you have not fixed this in due time (just reached v4.16-rc1),
the only option I see is a revert.

[1] http://lkml.kernel.org/r/20171211130902.482513d3@redhat.com

Cc: Sunil Goutham <sgoutham@cavium.com>
Cc: Christina Jacob <cjacob@caviumnetworks.com>
Cc: Aleksey Makarov <aleksey.makarov@cavium.com>
Fixes: aa136d0c82 ("net: thunderx: Add support for xdp redirect")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:23:39 -05:00
Xin Long
fae8b6f4a6 sctp: fix some copy-paste errors for file comments
This patch is to fix the file comments in stream.c and
stream_interleave.c

v1->v2:
  rephrase the comment for stream.c according to Neil's suggestion.

Fixes: a83863174a ("sctp: prepare asoc stream for stream reconf")
Fixes: 0c3f6f6554 ("sctp: implement make_datafrag for sctp_stream_interleave")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:18:32 -05:00
Jakub Kicinski
ac5b70198a net: fix race on decreasing number of TX queues
netif_set_real_num_tx_queues() can be called when netdev is up.
That usually happens when user requests change of number of
channels/rings with ethtool -L.  The procedure for changing
the number of queues involves resetting the qdiscs and setting
dev->num_tx_queues to the new value.  When the new value is
lower than the old one, extra care has to be taken to ensure
ordering of accesses to the number of queues vs qdisc reset.

Currently the queues are reset before new dev->num_tx_queues
is assigned, leaving a window of time where packets can be
enqueued onto the queues going down, leading to a likely
crash in the drivers, since most drivers don't check if TX
skbs are assigned to an active queue.

Fixes: e6484930d7 ("net: allocate tx queues in register_netdevice")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-14 14:12:55 -05:00
Sowmini Varadhan
d4014d8cc6 rds: do not call ->conn_alloc with GFP_KERNEL
Commit ebeeb1ad9b ("rds: tcp: use rds_destroy_pending() to synchronize
netns/module teardown and rds connection/workq management")
adds an rcu read critical section to __rd_conn_create. The
memory allocations in that critcal section need to use
GFP_ATOMIC to avoid sleeping.

This patch was verified with syzkaller reproducer.

Reported-by: syzbot+a0564419941aaae3fe3c@syzkaller.appspotmail.com
Fixes: ebeeb1ad9b ("rds: tcp: use rds_destroy_pending() to synchronize
       netns/module teardown and rds connection/workq management")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13 13:52:02 -05:00
David S. Miller
f7219bf311 Merge branch 'net-sched-couple-of-fixes'
Jiri Pirko says:

====================
net: sched: couple of fixes

This patchset contains couple of fixes following-up the shared block
patchsets.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13 12:29:03 -05:00
Jiri Pirko
339c21d7c4 net: sched: fix tc_u_common lookup
The offending commit wrongly assumes 1:1 mapping between block and q.
However, there are multiple blocks for a single q for classful qdiscs.
Since the obscure tc_u_common sharing mechanism expects it to be shared
among a qdisc, fix it by storing q pointer in case the block is not
shared.

Reported-by: Paweł Staszewski <pstaszewski@itcare.pl>
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Fixes: 7fa9d974f3 ("net: sched: cls_u32: use block instead of q in tc_u_common")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13 12:29:02 -05:00
Jiri Pirko
bb047ddd14 net: sched: don't set q pointer for shared blocks
It is pointless to set block->q for block which are shared among
multiple qdiscs. So remove the assignment in that case. Do a bit of code
reshuffle to make block->index initialized at that point so we can use
tcf_block_shared() helper.

Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Fixes: 4861738775 ("net: sched: introduce shared filter blocks infrastructure")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13 12:29:02 -05:00
Jiri Pirko
0f2d2b2736 mlxsw: spectrum_router: Fix error path in mlxsw_sp_vr_create
Since mlxsw_sp_fib_create() and mlxsw_sp_mr_table_create()
use ERR_PTR macro to propagate int err through return of a pointer,
the return value is not NULL in case of failure. So if one
of the calls fails, one of vr->fib4, vr->fib6 or vr->mr4_table
is not NULL and mlxsw_sp_vr_is_used wrongly assumes
that vr is in use which leads to crash like following one:

[ 1293.949291] BUG: unable to handle kernel NULL pointer dereference at 00000000000006c9
[ 1293.952729] IP: mlxsw_sp_mr_table_flush+0x15/0x70 [mlxsw_spectrum]

Fix this by using local variables to hold the pointers and set vr->*
only in case everything went fine.

Fixes: 76610ebbde ("mlxsw: spectrum_router: Refactor virtual router handling")
Fixes: a3d9bc506d ("mlxsw: spectrum_router: Extend virtual routers with IPv6 support")
Fixes: d42b0965b1 ("mlxsw: spectrum_router: Add multicast routes notification handling functionality")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13 12:22:29 -05:00
Tobias Klauser
d4e9a408ef net: af_unix: fix typo in UNIX_SKB_FRAGS_SZ comment
Change "minimun" to "minimum".

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13 12:21:45 -05:00
Hauke Mehrtens
da360299b6 uapi/if_ether.h: move __UAPI_DEF_ETHHDR libc define
This fixes a compile problem of some user space applications by not
including linux/libc-compat.h in uapi/if_ether.h.

linux/libc-compat.h checks which "features" the header files, included
from the libc, provide to make the Linux kernel uapi header files only
provide no conflicting structures and enums. If a user application mixes
kernel headers and libc headers it could happen that linux/libc-compat.h
gets included too early where not all other libc headers are included
yet. Then the linux/libc-compat.h would not prevent all the
redefinitions and we run into compile problems.
This patch removes the include of linux/libc-compat.h from
uapi/if_ether.h to fix the recently introduced case, but not all as this
is more or less impossible.

It is no problem to do the check directly in the if_ether.h file and not
in libc-compat.h as this does not need any fancy glibc header detection
as glibc never provided struct ethhdr and should define
__UAPI_DEF_ETHHDR by them self when they will provide this.

The following test program did not compile correctly any more:

#include <linux/if_ether.h>
#include <netinet/in.h>
#include <linux/in.h>

int main(void)
{
	return 0;
}

Fixes: 6926e041a8 ("uapi/if_ether.h: prevent redefinition of struct ethhdr")
Reported-by: Guillaume Nault <g.nault@alphalink.fr>
Cc: <stable@vger.kernel.org> # 4.15
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13 11:23:24 -05:00
Jan Glauber
07a2e1cf39 net: cavium: fix NULL pointer dereference in cavium_ptp_put
Prevent a kernel panic on reboot if ptp_clock is NULL by checking
the ptp pointer before using it.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
Fixes: 8c56df372b ("net: add support for Cavium PTP coprocessor")
Cc: Radoslaw Biernacki <rad@semihalf.com>
Cc: Aleksey Makarov <aleksey.makarov@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-12 14:38:37 -05:00
Mika Westerberg
027d351c54 net: thunderbolt: Run disconnect flow asynchronously when logout is received
The control channel calls registered callbacks when control messages
such as XDomain protocol messages are received. The control channel
handling is done in a worker running on system workqueue which means the
networking driver can't run tear down flow which includes sending
disconnect request and waiting for a reply in the same worker. Otherwise
reply is never received (as the work is already running) and the
operation times out.

To fix this run disconnect ThunderboltIP flow asynchronously once
ThunderboltIP logout message is received.

Fixes: e69b6c02b4 ("net: Add support for networking over Thunderbolt cable")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-12 12:03:04 -05:00