Commit Graph

828400 Commits

Author SHA1 Message Date
Linus Torvalds
600d725831 Power Supply Fixes for 5.1 cycle
* cpcap-battery: fix a division by zero
 * core: fix systemd issue due to log messages produced by uevent
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE72YNB0Y/i3JqeVQT2O7X88g7+poFAlzKDOMACgkQ2O7X88g7
 +prUnhAAlQoIMn2x7DnjYo1ZL6lj4+PHSzIC9qSVA3EGdRMq49pEHYcVuVN6kwCf
 amWMjq9Yt1787jSQM/QlvnA0XjKCPSQE7qFzLa6/VrHbNh4KXaZt5I6s9k0+/qKr
 SN30EeDfrtp9AYMu/AAZe+HtdSUCo29HuZsznb3QCCszGUG30PESuqHMqV9iCmul
 c4XIBMIsl0vxDxo2jJUzLtNdSlf9bmAQIiQ68GrkGCcPmP7bRg7U4i8UImB6D34v
 LTULG5U8coCbwZb+2n6Fc6Zek1GkWYteJODSomyxHpSt/6Y4z4s4XGcrkk4sYz8R
 QVUL/I5DXnZP9CZg/tJUuyno9PFK5AgS1695SGBNODQgEFcD11QViWlYZDt4/RZh
 QRiD2L5pSB39N+WSRcAmHns7JHLmrlc2kGChvF8ULNwLMuiziVrgv9uDfNHoebRG
 v2EGj2RInXlf06tME5cA1hQzcT2CrejkpHrSp84Hina1VUMig4VDi57QURSsnyyR
 e13jwF9k7Tnbn26YNAeGtzSjDHU8CjfQ9zlHOjnyr3F+eYlJrov+4hJWSBWHPJTj
 G55BIrVkOLAd4VhbhoGwGu9hLZC2hHVV4M25Ijp7zBSNGHxcb1t5ZcUx2PAwB7R4
 RZxqosnu+B6OQjHWy8vgIBaUUbMGM1/Ldrx/uFWGVtL+qBUJIiY=
 =Q783
 -----END PGP SIGNATURE-----

Merge tag 'for-v5.1-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply fixes from Sebastian Reichel:
 "Two more fixes for the 5.1 cycle.

  One division by zero fix in a specific driver and one core workaround
  for bad userspace behaviour from systemd regarding uevents. IMHO this
  can be considered to be a userspace bug, but the debug messages are
  useless anyways

   - cpcap-battery: fix a division by zero

   - core: fix systemd issue due to log messages produced by uevent"

* tag 'for-v5.1-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG
  power: supply: cpcap-battery: Fix division by zero
2019-05-01 14:57:23 -07:00
Martin KaFai Lau
886b7a5010 ipv6: A few fixes on dereferencing rt->from
It is a followup after the fix in
commit 9c69a13205 ("route: Avoid crash from dereferencing NULL rt->from")

rt6_do_redirect():
1. NULL checking is needed on rt->from because a parallel
   fib6_info delete could happen that sets rt->from to NULL.
   (e.g. rt6_remove_exception() and fib6_drop_pcpu_from()).

2. fib6_info_hold() is not enough.  Same reason as (1).
   Meaning, holding dst->__refcnt cannot ensure
   rt->from is not NULL or rt->from->fib6_ref is not 0.

   Instead of using fib6_info_hold_safe() which ip6_rt_cache_alloc()
   is already doing, this patch chooses to extend the rcu section
   to keep "from" dereference-able after checking for NULL.

inet6_rtm_getroute():
1. NULL checking is also needed on rt->from for a similar reason.
   Note that inet6_rtm_getroute() is using RTNL_FLAG_DOIT_UNLOCKED.

Fixes: a68886a691 ("net/ipv6: Make from in rt6_info rcu protected")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Wei Wang <weiwan@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 17:17:54 -04:00
Nicholas Mc Guire
f3505745c0 rds: ib: force endiannes annotation
While the endiannes is being handled correctly as indicated by the comment
above the offending line - sparse was unhappy with the missing annotation
as be64_to_cpu() expects a __be64 argument. To mitigate this annotation
all involved variables are changed to a consistent __le64 and the
 conversion to uint64_t delayed to the call to rds_cong_map_updated().

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 17:15:36 -04:00
Linus Torvalds
65beea4c3a ARC updates for 5.1 final
- regression in memset if line size !64
 
  - avoid panic if PAE and IOC
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJcycMVAAoJEGnX8d3iisJefY0P/R5jR6F0XDsnV/px0tQz2Dhk
 Lwl759QUoXeRzqeq8m2xDffIiEUmmPuUspauoVaGcHLMwBHOEOZ0WjTK3eSCF3XK
 zDT0cQ0oWcXyRZo46GLhW6hZYy17tFhXsE7fAURxIUrwu5sUaW5Z+YTMXrbRzWUb
 QmVJuf98sN4uheZ2Ayupg/5rOQhXh9+/IlMcybytmWQa+F4IuJkd505wNfQ3kWqQ
 ZeSNhzpKNAf4IofpfCowIGKE+9eyeeinMdYbppBkPCbX+ozpAj8ZcV3WRH5sFexP
 Ha3llkN4wBQZ0HglEO3FqRYnkU1qKgADXHcW3rz6nxNCb9WqGCk7sx1PuEARqAOX
 ZLvXT2c6+rUc74dORZAHjLdT4HEE1nSIp7KmuvYQY/Hg/AdcKcBtgJEAToJa4iNY
 lhZnjK5SXUV2ppPLI2duHphGeh3jzjR7Oa3ksLxKc2iQ2RBCJeyBj4G9g7NJYnL0
 sb5pUDadHkn5MkqOU0Zy6mQ+5Lm1CCfodc2F/3KMyxNxAkmC8mhDFsASgY/2Z3li
 llj9NX2Cnmw697shFQTg24hxKZ1yg49c1LhHnkIvnPdJ4DuDjWIHnpuzHnBWCbHS
 qRw4ZZ2Psr2hNTzauVwHYbpLALcHTMUXU6Ecx0TuEWNyGchehV4llmvCn1ZxWGmQ
 v+2cv+4Jt2l6ocRxENDY
 =Qm+2
 -----END PGP SIGNATURE-----

Merge tag 'arc-5.1-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
 "A few minor fixes for ARC.

   - regression in memset if line size !64

   - avoid panic if PAE and IOC"

* tag 'arc-5.1-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: memset: fix build with L1_CACHE_SHIFT != 6
  ARC: [hsdk] Make it easier to add PAE40 region to DTB
  ARC: PAE40: don't panic and instead turn off hw ioc
2019-05-01 13:40:30 -07:00
Alex Williamson
15d2aba7c6 PCI/portdrv: Use shared MSI/MSI-X vector for Bandwidth Management
The Interrupt Message Number in the PCIe Capabilities register (PCIe r4.0,
sec 7.5.3.2) indicates which MSI/MSI-X vector is shared by interrupts
related to the PCIe Capability, including Link Bandwidth Management and
Link Autonomous Bandwidth Interrupts (Link Control, 7.5.3.7), Command
Completed and Hot-Plug Interrupts (Slot Control, 7.5.3.10), and the PME
Interrupt (Root Control, 7.5.3.12).

pcie_message_numbers() checked whether we want to enable PME or Hot-Plug
interrupts but neglected to check for Link Bandwidth Management, so if we
only wanted the Bandwidth Management interrupts, it decided we didn't need
any vectors at all.  Then pcie_port_enable_irq_vec() tried to reallocate
zero vectors, which failed, resulting in fallback to INTx.

On some systems, e.g., an X79-based workstation, that INTx seems broken or
not handled correctly, so we got spurious IRQ16 interrupts for Bandwidth
Management events.

Change pcie_message_numbers() so that if we want Link Bandwidth Management
interrupts, we use the shared MSI/MSI-X vector from the PCIe Capabilities
register.

Fixes: e8303bb7a7 ("PCI/LINK: Report degraded links via link bandwidth notification")
Link: https://lore.kernel.org/lkml/155597243666.19387.1205950870601742062.stgit@gimli.home
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-05-01 15:34:01 -05:00
Linus Torvalds
fb0af61d3a ACPI fix for 5.1-rc8
Revert a recent ACPICA change that caused initialization to fail on
 systems with Thunderbolt docking stations connected at the init time.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAlzJV98SHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx0FIP/iBkMXnLk6snFKOjOfHRqYx/cLzix67Q
 hcVDzUPoLF8FuXyRoHPiqH9eVMznLp801X1r0ER1GnCkp5GJPS05beXEhZBqPUNr
 dz8UXGL6iJjT7A2L8DRAyDg/H8Ql7ddg/7qKVQbjr9V8Yu1KFRTCP/6n+d3vhDuB
 PRi1yII7sSkzFhHRyS6Yl8PuW+hxeBfgoFelfNWwUBtozTkAW9TyBmDXIfqzZjfS
 QBh9y4lp0yOYDjhZ2zwKEQKCzO6g1l2OC7o0hvYnkwByrV/fKIfNMvzrdU7JO0sP
 42qq6ap52+2TBbI8fqf9hWuXDCnFZ1DtHLKs4vecGV2FzXN20Qh8lPiTi4bUQmze
 8KRVqfpSyKKQ4U00YuV47S1GoBnrUZ1LplH7tnIVoa0DSkubcAlA2+uUvjDXOKUJ
 C2QkKOQ+wVZTw2GEoj2JS0iKArb6tW/6F8I1dtZomIZnAEvTzPBGdbiX4zbp9LAp
 0PTU41tTmV/lROs2cVnyf7tU2dM57r6O4pMn1ZMK/Lq4aLm9R2iEAk29vhalgvVw
 HvgbCoXHz6cark7b90098JPnT/bCvnGXJqoZ6G/2vwoujuvNPA4FHbt7HG3ijo7u
 bxABEhewIeEiJHcmbjEeqczA35vSoeIqeXyYAhtAnlHG3jooexKSDXAmiHYY1+jp
 AfuAQL38awi0
 =2XoK
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "Revert a recent ACPICA change that caused initialization to fail on
  systems with Thunderbolt docking stations connected at the init time"

* tag 'acpi-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPICA: Clear status of GPEs before enabling them"
2019-05-01 13:03:39 -07:00
Linus Torvalds
7e74e235bb gcc-9: don't warn about uninitialized btrfs extent_type variable
The 'extent_type' variable does seem to be reliably initialized, but
it's _very_ non-obvious, since there's a "goto next" case that jumps
over the normal initialization.  That will then always trigger the
"start >= extent_end" test, which will end up never falling through to
the use of that variable.

But the code is certainly not obvious, and the compiler warning looks
reasonable.  Make 'extent_type' an int, and initialize it to an invalid
negative value, which seems to be the common pattern in other places.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-01 12:19:20 -07:00
Hangbin Liu
f68d7c44e7 selftests: fib_rule_tests: print the result and return 1 if any tests failed
Fixes: 65b2b4939a ("selftests: net: initial fib rule tests")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 14:30:23 -04:00
Linus Torvalds
459e3a2153 gcc-9: properly declare the {pv,hv}clock_page storage
The pvlock_page and hvclock_page variables are (as the name implies)
addresses to pages, created by the linker script.

But we declared them as just "extern u8" variables, which _works_, but
now that gcc does some more bounds checking, it causes warnings like

    warning: array subscript 1 is outside array bounds of ‘u8[1]’

when we then access more than one byte from those variables.

Fix this by simply making the declaration of the variables match
reality, which makes the compiler happy too.

Signed-off-by: Linus Torvalds <torvalds@-linux-foundation.org>
2019-05-01 11:20:53 -07:00
Linus Torvalds
cf67690884 gcc-9: don't warn about uninitialized variable
I'm not sure what made gcc warn about this code now.  The 'ret' variable
does end up initialized in all cases, but it's definitely not obvious,
so the compiler is quite reasonable to warn about this.

So just add initialization to make it all much more obvious both to
compilers and to humans.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-01 11:07:40 -07:00
Linus Torvalds
6f303d6053 gcc-9: silence 'address-of-packed-member' warning
We already did this for clang, but now gcc has that warning too.  Yes,
yes, the address may be unaligned.  And that's kind of the point.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-01 11:05:41 -07:00
Shmulik Ladkani
d2f0c96114 ipv4: ip_do_fragment: Preserve skb_iif during fragmentation
Previously, during fragmentation after forwarding, skb->skb_iif isn't
preserved, i.e. 'ip_copy_metadata' does not copy skb_iif from given
'from' skb.

As a result, ip_do_fragment's creates fragments with zero skb_iif,
leading to inconsistent behavior.

Assume for example an eBPF program attached at tc egress (post
forwarding) that examines __sk_buff->ingress_ifindex:
 - the correct iif is observed if forwarding path does not involve
   fragmentation/refragmentation
 - a bogus iif is observed if forwarding path involves
   fragmentation/refragmentatiom

Fix, by preserving skb_iif during 'ip_copy_metadata'.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 13:28:34 -04:00
Mark Rutland
d4ef647510 io_uring: avoid page allocation warnings
In io_sqe_buffer_register() we allocate a number of arrays based on the
iov_len from the user-provided iov. While we limit iov_len to SZ_1G,
we can still attempt to allocate arrays exceeding MAX_ORDER.

On a 64-bit system with 4KiB pages, for an iov where iov_base = 0x10 and
iov_len = SZ_1G, we'll calculate that nr_pages = 262145. When we try to
allocate a corresponding array of (16-byte) bio_vecs, requiring 4194320
bytes, which is greater than 4MiB. This results in SLUB warning that
we're trying to allocate greater than MAX_ORDER, and failing the
allocation.

Avoid this by using kvmalloc() for allocations dependent on the
user-provided iov_len. At the same time, fix a leak of imu->bvec when
registration fails.

Full splat from before this patch:

WARNING: CPU: 1 PID: 2314 at mm/page_alloc.c:4595 __alloc_pages_nodemask+0x7ac/0x2938 mm/page_alloc.c:4595
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 2314 Comm: syz-executor326 Not tainted 5.1.0-rc7-dirty #4
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace+0x0/0x2f0 include/linux/compiler.h:193
 show_stack+0x20/0x30 arch/arm64/kernel/traps.c:158
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x110/0x190 lib/dump_stack.c:113
 panic+0x384/0x68c kernel/panic.c:214
 __warn+0x2bc/0x2c0 kernel/panic.c:571
 report_bug+0x228/0x2d8 lib/bug.c:186
 bug_handler+0xa0/0x1a0 arch/arm64/kernel/traps.c:956
 call_break_hook arch/arm64/kernel/debug-monitors.c:301 [inline]
 brk_handler+0x1d4/0x388 arch/arm64/kernel/debug-monitors.c:316
 do_debug_exception+0x1a0/0x468 arch/arm64/mm/fault.c:831
 el1_dbg+0x18/0x8c
 __alloc_pages_nodemask+0x7ac/0x2938 mm/page_alloc.c:4595
 alloc_pages_current+0x164/0x278 mm/mempolicy.c:2132
 alloc_pages include/linux/gfp.h:509 [inline]
 kmalloc_order+0x20/0x50 mm/slab_common.c:1231
 kmalloc_order_trace+0x30/0x2b0 mm/slab_common.c:1243
 kmalloc_large include/linux/slab.h:480 [inline]
 __kmalloc+0x3dc/0x4f0 mm/slub.c:3791
 kmalloc_array include/linux/slab.h:670 [inline]
 io_sqe_buffer_register fs/io_uring.c:2472 [inline]
 __io_uring_register fs/io_uring.c:2962 [inline]
 __do_sys_io_uring_register fs/io_uring.c:3008 [inline]
 __se_sys_io_uring_register fs/io_uring.c:2990 [inline]
 __arm64_sys_io_uring_register+0x9e0/0x1bc8 fs/io_uring.c:2990
 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
 invoke_syscall arch/arm64/kernel/syscall.c:47 [inline]
 el0_svc_common.constprop.0+0x148/0x2e0 arch/arm64/kernel/syscall.c:83
 el0_svc_handler+0xdc/0x100 arch/arm64/kernel/syscall.c:129
 el0_svc+0x8/0xc arch/arm64/kernel/entry.S:948
SMP: stopping secondary CPUs
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
CPU features: 0x002,23000438
Memory Limit: none
Rebooting in 1 seconds..

Fixes: edafccee56 ("io_uring: add support for pre-mapped user IO buffers")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 10:00:25 -06:00
Jakub Kicinski
2dcb003314 net/tls: avoid NULL pointer deref on nskb->sk in fallback
update_chksum() accesses nskb->sk before it has been set
by complete_skb(), move the init up.

Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 11:37:56 -04:00
David Ahern
15d55bae4e selftests: fib_rule_tests: Fix icmp proto with ipv6
A recent commit returns an error if icmp is used as the ip-proto for
IPv6 fib rules. Update fib_rule_tests to send ipv6-icmp instead of icmp.

Fixes: 5e1a99eae8 ("ipv4: Add ICMPv6 support when parse route ipproto")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 11:30:17 -04:00
Willem de Bruijn
486efdc8f6 packet: validate msg_namelen in send directly
Packet sockets in datagram mode take a destination address. Verify its
length before passing to dev_hard_header.

Prior to 2.6.14-rc3, the send code ignored sll_halen. This is
established behavior. Directly compare msg_namelen to dev->addr_len.

Change v1->v2: initialize addr in all paths

Fixes: 6b8d95f179 ("packet: validate address length if non-zero")
Suggested-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 11:28:35 -04:00
Willem de Bruijn
b2cf86e156 packet: in recvmsg msg_name return at least sizeof sockaddr_ll
Packet send checks that msg_name is at least sizeof sockaddr_ll.
Packet recv must return at least this length, so that its output
can be passed unmodified to packet send.

This ceased to be true since adding support for lladdr longer than
sll_addr. Since, the return value uses true address length.

Always return at least sizeof sockaddr_ll, even if address length
is shorter. Zero the padding bytes.

Change v1->v2: do not overwrite zeroed padding again. use copy_len.

Fixes: 0fb375fb9b ("[AF_PACKET]: Allow for > 8 byte hardware addresses.")
Suggested-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 11:27:55 -04:00
Will Deacon
24cf262da1 Merge branch 'for-next/timers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into for-next/core
Conflicts:
	arch/arm64/Kconfig
	arch/arm64/include/asm/arch_timer.h
2019-05-01 15:45:36 +01:00
Ming Lei
f5eb4d3b92 iov_iter: fix iov_iter_type
Commit 875f1d0769 ("iov_iter: add ITER_BVEC_FLAG_NO_REF flag")
introduces one extra flag of ITER_BVEC_FLAG_NO_REF, and this flag
is stored into iter->type.

However, iov_iter_type() doesn't consider the new added flag, fix
it by masking this flag in iov_iter_type().

Fixes: 875f1d0769 ("iov_iter: add ITER_BVEC_FLAG_NO_REF flag")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 08:38:47 -06:00
Ming Lei
60a27b906d block: fix handling for BIO_NO_PAGE_REF
Commit 399254aaf4 ("block: add BIO_NO_PAGE_REF flag") introduces
BIO_NO_PAGE_REF, and once this flag is set for one bio, all pages
in the bio won't be get/put during IO.

However, if one bio is submitted via __blkdev_direct_IO_simple(),
even though BIO_NO_PAGE_REF is set, pages still may be put.

Fixes this issue by avoiding to put pages if BIO_NO_PAGE_REF is
set.

Fixes: 399254aaf4 ("block: add BIO_NO_PAGE_REF flag")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 08:38:47 -06:00
Jens Axboe
817869d251 io_uring: drop req submit reference always in async punt
If we don't end up actually calling submit in io_sq_wq_submit_work(),
we still need to drop the submit reference to the request. If we
don't, then we can leak the request. This can happen if we race
with ring shutdown while flushing the workqueue for requests that
require use of the mm_struct.

Fixes: e65ef56db4 ("io_uring: use regular request ref counts")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 08:38:47 -06:00
Mark Rutland
52e04ef4c9 io_uring: free allocated io_memory once
If io_allocate_scq_urings() fails to allocate an sq_* region, it will
call io_mem_free() for any previously allocated regions, but leave
dangling pointers to these regions in the ctx. Any regions which have
not yet been allocated are left NULL. Note that when returning
-EOVERFLOW, the previously allocated sq_ring is not freed, which appears
to be an unintentional leak.

When io_allocate_scq_urings() fails, io_uring_create() will call
io_ring_ctx_wait_and_kill(), which calls io_mem_free() on all the sq_*
regions, assuming the pointers are valid and not NULL.

This can result in pages being freed multiple times, which has been
observed to corrupt the page state, leading to subsequent fun. This can
also result in virt_to_page() on NULL, resulting in the use of bogus
page addresses, and yet more subsequent fun. The latter can be detected
with CONFIG_DEBUG_VIRTUAL on arm64.

Adding a cleanup path to io_allocate_scq_urings() complicates the logic,
so let's leave it to io_ring_ctx_free() to consistently free these
pointers, and simplify the io_allocate_scq_urings() error paths.

Full splats from before this patch below. Note that the pointer logged
by the DEBUG_VIRTUAL "non-linear address" warning has been hashed, and
is actually NULL.

[   26.098129] page:ffff80000e949a00 count:0 mapcount:-128 mapping:0000000000000000 index:0x0
[   26.102976] flags: 0x63fffc000000()
[   26.104373] raw: 000063fffc000000 ffff80000e86c188 ffff80000ea3df08 0000000000000000
[   26.108917] raw: 0000000000000000 0000000000000001 00000000ffffff7f 0000000000000000
[   26.137235] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
[   26.143960] ------------[ cut here ]------------
[   26.146020] kernel BUG at include/linux/mm.h:547!
[   26.147586] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[   26.149163] Modules linked in:
[   26.150287] Process syz-executor.21 (pid: 20204, stack limit = 0x000000000e9cefeb)
[   26.153307] CPU: 2 PID: 20204 Comm: syz-executor.21 Not tainted 5.1.0-rc7-00004-g7d30b2ea43d6 #18
[   26.156566] Hardware name: linux,dummy-virt (DT)
[   26.158089] pstate: 40400005 (nZcv daif +PAN -UAO)
[   26.159869] pc : io_mem_free+0x9c/0xa8
[   26.161436] lr : io_mem_free+0x9c/0xa8
[   26.162720] sp : ffff000013003d60
[   26.164048] x29: ffff000013003d60 x28: ffff800025048040
[   26.165804] x27: 0000000000000000 x26: ffff800025048040
[   26.167352] x25: 00000000000000c0 x24: ffff0000112c2820
[   26.169682] x23: 0000000000000000 x22: 0000000020000080
[   26.171899] x21: ffff80002143b418 x20: ffff80002143b400
[   26.174236] x19: ffff80002143b280 x18: 0000000000000000
[   26.176607] x17: 0000000000000000 x16: 0000000000000000
[   26.178997] x15: 0000000000000000 x14: 0000000000000000
[   26.181508] x13: 00009178a5e077b2 x12: 0000000000000001
[   26.183863] x11: 0000000000000000 x10: 0000000000000980
[   26.186437] x9 : ffff000013003a80 x8 : ffff800025048a20
[   26.189006] x7 : ffff8000250481c0 x6 : ffff80002ffe9118
[   26.191359] x5 : ffff80002ffe9118 x4 : 0000000000000000
[   26.193863] x3 : ffff80002ffefe98 x2 : 44c06ddd107d1f00
[   26.196642] x1 : 0000000000000000 x0 : 000000000000003e
[   26.198892] Call trace:
[   26.199893]  io_mem_free+0x9c/0xa8
[   26.201155]  io_ring_ctx_wait_and_kill+0xec/0x180
[   26.202688]  io_uring_setup+0x6c4/0x6f0
[   26.204091]  __arm64_sys_io_uring_setup+0x18/0x20
[   26.205576]  el0_svc_common.constprop.0+0x7c/0xe8
[   26.207186]  el0_svc_handler+0x28/0x78
[   26.208389]  el0_svc+0x8/0xc
[   26.209408] Code: aa0203e0 d0006861 9133a021 97fcdc3c (d4210000)
[   26.211995] ---[ end trace bdb81cd43a21e50d ]---

[   81.770626] ------------[ cut here ]------------
[   81.825015] virt_to_phys used for non-linear address: 000000000d42f2c7 (          (null))
[   81.827860] WARNING: CPU: 1 PID: 30171 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0x48/0x68
[   81.831202] Modules linked in:
[   81.832212] CPU: 1 PID: 30171 Comm: syz-executor.20 Not tainted 5.1.0-rc7-00004-g7d30b2ea43d6 #19
[   81.835616] Hardware name: linux,dummy-virt (DT)
[   81.836863] pstate: 60400005 (nZCv daif +PAN -UAO)
[   81.838727] pc : __virt_to_phys+0x48/0x68
[   81.840572] lr : __virt_to_phys+0x48/0x68
[   81.842264] sp : ffff80002cf67c70
[   81.843858] x29: ffff80002cf67c70 x28: ffff800014358e18
[   81.846463] x27: 0000000000000000 x26: 0000000020000080
[   81.849148] x25: 0000000000000000 x24: ffff80001bb01f40
[   81.851986] x23: ffff200011db06c8 x22: ffff2000127e3c60
[   81.854351] x21: ffff800014358cc0 x20: ffff800014358d98
[   81.856711] x19: 0000000000000000 x18: 0000000000000000
[   81.859132] x17: 0000000000000000 x16: 0000000000000000
[   81.861586] x15: 0000000000000000 x14: 0000000000000000
[   81.863905] x13: 0000000000000000 x12: ffff1000037603e9
[   81.866226] x11: 1ffff000037603e8 x10: 0000000000000980
[   81.868776] x9 : ffff80002cf67840 x8 : ffff80001bb02920
[   81.873272] x7 : ffff1000037603e9 x6 : ffff80001bb01f47
[   81.875266] x5 : ffff1000037603e9 x4 : dfff200000000000
[   81.876875] x3 : ffff200010087528 x2 : ffff1000059ecf58
[   81.878751] x1 : 44c06ddd107d1f00 x0 : 0000000000000000
[   81.880453] Call trace:
[   81.881164]  __virt_to_phys+0x48/0x68
[   81.882919]  io_mem_free+0x18/0x110
[   81.886585]  io_ring_ctx_wait_and_kill+0x13c/0x1f0
[   81.891212]  io_uring_setup+0xa60/0xad0
[   81.892881]  __arm64_sys_io_uring_setup+0x2c/0x38
[   81.894398]  el0_svc_common.constprop.0+0xac/0x150
[   81.896306]  el0_svc_handler+0x34/0x88
[   81.897744]  el0_svc+0x8/0xc
[   81.898715] ---[ end trace b4a703802243cbba ]---

Fixes: 2b188cc1bb ("Add io_uring IO interface")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-block@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 08:38:47 -06:00
Mark Rutland
975554b03e io_uring: fix SQPOLL cpu validation
In io_sq_offload_start(), we call cpu_possible() on an unbounded cpu
value from userspace. On v5.1-rc7 on arm64 with
CONFIG_DEBUG_PER_CPU_MAPS, this results in a splat:

  WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpu_max_bits_warn include/linux/cpumask.h:121 [inline]

There was an attempt to fix this in commit:

  917257daa0 ("io_uring: only test SQPOLL cpu after we've verified it")

... by adding a check after the cpu value had been limited to NR_CPU_IDS
using array_index_nospec(). However, this left an unbound check at the
start of the function, for which the warning still fires.

Let's fix this correctly by checking that the cpu value is bound by
nr_cpu_ids before passing it to cpu_possible(). Note that only
nr_cpu_ids of a cpumask are guaranteed to exist at runtime, and
nr_cpu_ids can be significantly smaller than NR_CPUs. For example, an
arm64 defconfig has NR_CPUS=256, while my test VM has 4 vCPUs.

Following the intent from the commit message for 917257daa0, the
check is moved under the SQ_AFF branch, which is the only branch where
the cpu values is consumed. The check is performed before bounding the
value with array_index_nospec() so that we don't silently accept bogus
cpu values from userspace, where array_index_nospec() would force these
values to 0.

I suspect we can remove the array_index_nospec() call entirely, but I've
conservatively left that in place, updated to use nr_cpu_ids to match
the prior check.

Tested on arm64 with the Syzkaller reproducer:

  https://syzkaller.appspot.com/bug?extid=cd714a07c6de2bc34293
  https://syzkaller.appspot.com/x/repro.syz?x=15d8b397200000

Full splat from before this patch:

WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpu_max_bits_warn include/linux/cpumask.h:121 [inline]
WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpumask_check include/linux/cpumask.h:128 [inline]
WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpumask_test_cpu include/linux/cpumask.h:344 [inline]
WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 io_sq_offload_start fs/io_uring.c:2244 [inline]
WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 io_uring_create fs/io_uring.c:2864 [inline]
WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 io_uring_setup+0x1108/0x15a0 fs/io_uring.c:2916
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 27601 Comm: syz-executor.0 Not tainted 5.1.0-rc7 #3
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace+0x0/0x2f0 include/linux/compiler.h:193
 show_stack+0x20/0x30 arch/arm64/kernel/traps.c:158
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x110/0x190 lib/dump_stack.c:113
 panic+0x384/0x68c kernel/panic.c:214
 __warn+0x2bc/0x2c0 kernel/panic.c:571
 report_bug+0x228/0x2d8 lib/bug.c:186
 bug_handler+0xa0/0x1a0 arch/arm64/kernel/traps.c:956
 call_break_hook arch/arm64/kernel/debug-monitors.c:301 [inline]
 brk_handler+0x1d4/0x388 arch/arm64/kernel/debug-monitors.c:316
 do_debug_exception+0x1a0/0x468 arch/arm64/mm/fault.c:831
 el1_dbg+0x18/0x8c
 cpu_max_bits_warn include/linux/cpumask.h:121 [inline]
 cpumask_check include/linux/cpumask.h:128 [inline]
 cpumask_test_cpu include/linux/cpumask.h:344 [inline]
 io_sq_offload_start fs/io_uring.c:2244 [inline]
 io_uring_create fs/io_uring.c:2864 [inline]
 io_uring_setup+0x1108/0x15a0 fs/io_uring.c:2916
 __do_sys_io_uring_setup fs/io_uring.c:2929 [inline]
 __se_sys_io_uring_setup fs/io_uring.c:2926 [inline]
 __arm64_sys_io_uring_setup+0x50/0x70 fs/io_uring.c:2926
 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
 invoke_syscall arch/arm64/kernel/syscall.c:47 [inline]
 el0_svc_common.constprop.0+0x148/0x2e0 arch/arm64/kernel/syscall.c:83
 el0_svc_handler+0xdc/0x100 arch/arm64/kernel/syscall.c:129
 el0_svc+0x8/0xc arch/arm64/kernel/entry.S:948
SMP: stopping secondary CPUs
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
CPU features: 0x002,23000438
Memory Limit: none
Rebooting in 1 seconds..

Fixes: 917257daa0 ("io_uring: only test SQPOLL cpu after we've verified it")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-block@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Simplied the logic

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 08:38:37 -06:00
Will Deacon
50abbe1962 Merge branch 'for-next/mitigations' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into for-next/core 2019-05-01 15:34:56 +01:00
Will Deacon
9431ac2bf6 Merge branch 'for-next/futex' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into for-next/core 2019-05-01 15:34:17 +01:00
Josh Poimboeuf
4ad499c942 Documentation: Add ARM64 to kernel-parameters.rst
Add ARM64 to the legend of architectures.  It's already used in several
places in kernel-parameters.txt.

Suggested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-01 14:48:08 +01:00
Josh Poimboeuf
a111b7c0f2 arm64/speculation: Support 'mitigations=' cmdline option
Configure arm64 runtime CPU speculation bug mitigations in accordance
with the 'mitigations=' cmdline option.  This affects Meltdown, Spectre
v2, and Speculative Store Bypass.

The default behavior is unchanged.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
[will: reorder checks so KASLR implies KPTI and SSBS is affected by cmdline]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-01 14:48:07 +01:00
Will Deacon
eb337cdfcd arm64: ssbs: Don't treat CPUs with SSBS as unaffected by SSB
SSBS provides a relatively cheap mitigation for SSB, but it is still a
mitigation and its presence does not indicate that the CPU is unaffected
by the vulnerability.

Tweak the mitigation logic so that we report the correct string in sysfs.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-01 14:48:06 +01:00
Mian Yousaf Kaukab
61ae1321f0 arm64: enable generic CPU vulnerabilites support
Enable CPU vulnerabilty show functions for spectre_v1, spectre_v2,
meltdown and store-bypass.

Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-01 14:48:06 +01:00
Jeremy Linton
526e065dbc arm64: add sysfs vulnerability show for speculative store bypass
Return status based on ssbd_state and __ssb_safe. If the
mitigation is disabled, or the firmware isn't responding then
return the expected machine state based on a whitelist of known
good cores.

Given a heterogeneous machine, the overall machine vulnerability
defaults to safe but is reset to unsafe when we miss the whitelist
and the firmware doesn't explicitly tell us the core is safe.
In order to make that work we delay transitioning to vulnerable
until we know the firmware isn't responding to avoid a case
where we miss the whitelist, but the firmware goes ahead and
reports the core is not vulnerable. If all the cores in the
machine have SSBS, then __ssb_safe will remain true.

Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-01 14:47:55 +01:00
Arun KS
61cf61d81e arm64: Fix size of __early_cpu_boot_status
__early_cpu_boot_status is of type long. Use quad
assembler directive to allocate proper size.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Arun KS <arunks@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-01 14:39:26 +01:00
Xin Long
fbd019737d sctp: avoid running the sctp state machine recursively
Ying triggered a call trace when doing an asconf testing:

  BUG: scheduling while atomic: swapper/12/0/0x10000100
  Call Trace:
   <IRQ>  [<ffffffffa4375904>] dump_stack+0x19/0x1b
   [<ffffffffa436fcaf>] __schedule_bug+0x64/0x72
   [<ffffffffa437b93a>] __schedule+0x9ba/0xa00
   [<ffffffffa3cd5326>] __cond_resched+0x26/0x30
   [<ffffffffa437bc4a>] _cond_resched+0x3a/0x50
   [<ffffffffa3e22be8>] kmem_cache_alloc_node+0x38/0x200
   [<ffffffffa423512d>] __alloc_skb+0x5d/0x2d0
   [<ffffffffc0995320>] sctp_packet_transmit+0x610/0xa20 [sctp]
   [<ffffffffc098510e>] sctp_outq_flush+0x2ce/0xc00 [sctp]
   [<ffffffffc098646c>] sctp_outq_uncork+0x1c/0x20 [sctp]
   [<ffffffffc0977338>] sctp_cmd_interpreter.isra.22+0xc8/0x1460 [sctp]
   [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp]
   [<ffffffffc099443d>] sctp_primitive_ASCONF+0x3d/0x50 [sctp]
   [<ffffffffc0977384>] sctp_cmd_interpreter.isra.22+0x114/0x1460 [sctp]
   [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp]
   [<ffffffffc097b3a4>] sctp_assoc_bh_rcv+0xf4/0x1b0 [sctp]
   [<ffffffffc09840f1>] sctp_inq_push+0x51/0x70 [sctp]
   [<ffffffffc099732b>] sctp_rcv+0xa8b/0xbd0 [sctp]

As it shows, the first sctp_do_sm() running under atomic context (NET_RX
softirq) invoked sctp_primitive_ASCONF() that uses GFP_KERNEL flag later,
and this flag is supposed to be used in non-atomic context only. Besides,
sctp_do_sm() was called recursively, which is not expected.

Vlad tried to fix this recursive call in Commit c078669340 ("sctp: Fix
oops when sending queued ASCONF chunks") by introducing a new command
SCTP_CMD_SEND_NEXT_ASCONF. But it didn't work as this command is still
used in the first sctp_do_sm() call, and sctp_primitive_ASCONF() will
be called in this command again.

To avoid calling sctp_do_sm() recursively, we send the next queued ASCONF
not by sctp_primitive_ASCONF(), but by sctp_sf_do_prm_asconf() in the 1st
sctp_do_sm() directly.

Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 09:18:57 -04:00
Jan Kiszka
37e9c087c8 stmmac: pci: Fix typo in IOT2000 comment
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 09:15:52 -04:00
Randy Dunlap
799381e49b Documentation: fix netdev-FAQ.rst markup warning
Fix ReST underline warning:

./Documentation/networking/netdev-FAQ.rst:135: WARNING: Title underline too short.

Q: I made changes to only a few patches in a patch series should I resend only those changed?
--------------------------------------------------------------------------------------------

Fixes: ffa9125373 ("Documentation: networking: Update netdev-FAQ regarding patches")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01 09:12:51 -04:00
Jens Axboe
5c8b0b54db io_uring: have submission side sqe errors post a cqe
Currently we only post a cqe if we get an error OUTSIDE of submission.
For submission, we return the error directly through io_uring_enter().
This is a bit awkward for applications, and it makes more sense to
always post a cqe with an error, if the error happens on behalf of an
sqe.

This changes submission behavior a bit. io_uring_enter() returns -ERROR
for an error, and > 0 for number of sqes submitted. Before this change,
if you wanted to submit 8 entries and had an error on the 5th entry,
io_uring_enter() would return 4 (for number of entries successfully
submitted) and rewind the sqring. The application would then have to
peek at the sqring and figure out what was wrong with the head sqe, and
then skip it itself. With this change, we'll return 5 since we did
consume 5 sqes, and the last sqe (with the error) will result in a cqe
being posted with the error.

This makes the logic easier to handle in the application, and it cleans
up the submission part.

Suggested-by: Stefan Bühler <source@stbuehler.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01 06:37:55 -06:00
Pierre-Louis Bossart
23583f7795 ACPI / property: fix handling of data_nodes in acpi_get_next_subnode()
When the DSDT tables expose devices with subdevices and a set of
hierarchical _DSD properties, the data returned by
acpi_get_next_subnode() is incorrect, with the results suggesting a bad
pointer assignment. The parser works fine with device_nodes or
data_nodes, but not with a combination of the two.

The problem is traced to an invalid pointer used when jumping from
handling device_nodes to data nodes. The existing code looks for data
nodes below the last subdevice found instead of the common root. Fix
by forcing the acpi_device pointer to be derived from the same fwnode
for the two types of subnodes.

This same problem of handling device and data nodes was already fixed
in a similar way by 'commit bf4703fdd1 ("ACPI / property: fix data
node parsing in acpi_get_next_subnode()")' but broken later by 'commit
34055190b1 ("ACPI / property: Add fwnode_get_next_child_node()")', so
this should probably go to linux-stable all the way to 4.12

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-05-01 10:06:08 +02:00
Eric Dumazet
0e23387491 ipv6: fix races in ip6_dst_destroy()
We had many syzbot reports that seem to be caused by use-after-free
of struct fib6_info.

ip6_dst_destroy(), fib6_drop_pcpu_from() and rt6_remove_exception()
are writers vs rt->from, and use non consistent synchronization among
themselves.

Switching to xchg() will solve the issues with no possible
lockdep issues.

BUG: KASAN: user-memory-access in atomic_dec_and_test include/asm-generic/atomic-instrumented.h:747 [inline]
BUG: KASAN: user-memory-access in fib6_info_release include/net/ip6_fib.h:294 [inline]
BUG: KASAN: user-memory-access in fib6_info_release include/net/ip6_fib.h:292 [inline]
BUG: KASAN: user-memory-access in fib6_drop_pcpu_from net/ipv6/ip6_fib.c:927 [inline]
BUG: KASAN: user-memory-access in fib6_purge_rt+0x4f6/0x670 net/ipv6/ip6_fib.c:960
Write of size 4 at addr 0000000000ffffb4 by task syz-executor.1/7649

CPU: 0 PID: 7649 Comm: syz-executor.1 Not tainted 5.1.0-rc6+ #183
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 kasan_report.cold+0x5/0x40 mm/kasan/report.c:321
 check_memory_region_inline mm/kasan/generic.c:185 [inline]
 check_memory_region+0x123/0x190 mm/kasan/generic.c:191
 kasan_check_write+0x14/0x20 mm/kasan/common.c:108
 atomic_dec_and_test include/asm-generic/atomic-instrumented.h:747 [inline]
 fib6_info_release include/net/ip6_fib.h:294 [inline]
 fib6_info_release include/net/ip6_fib.h:292 [inline]
 fib6_drop_pcpu_from net/ipv6/ip6_fib.c:927 [inline]
 fib6_purge_rt+0x4f6/0x670 net/ipv6/ip6_fib.c:960
 fib6_del_route net/ipv6/ip6_fib.c:1813 [inline]
 fib6_del+0xac2/0x10a0 net/ipv6/ip6_fib.c:1844
 fib6_clean_node+0x3a8/0x590 net/ipv6/ip6_fib.c:2006
 fib6_walk_continue+0x495/0x900 net/ipv6/ip6_fib.c:1928
 fib6_walk+0x9d/0x100 net/ipv6/ip6_fib.c:1976
 fib6_clean_tree+0xe0/0x120 net/ipv6/ip6_fib.c:2055
 __fib6_clean_all+0x118/0x2a0 net/ipv6/ip6_fib.c:2071
 fib6_clean_all+0x2b/0x40 net/ipv6/ip6_fib.c:2082
 rt6_sync_down_dev+0x134/0x150 net/ipv6/route.c:4057
 rt6_disable_ip+0x27/0x5f0 net/ipv6/route.c:4062
 addrconf_ifdown+0xa2/0x1220 net/ipv6/addrconf.c:3705
 addrconf_notify+0x19a/0x2260 net/ipv6/addrconf.c:3630
 notifier_call_chain+0xc7/0x240 kernel/notifier.c:93
 __raw_notifier_call_chain kernel/notifier.c:394 [inline]
 raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:401
 call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1753
 call_netdevice_notifiers_extack net/core/dev.c:1765 [inline]
 call_netdevice_notifiers net/core/dev.c:1779 [inline]
 dev_close_many+0x33f/0x6f0 net/core/dev.c:1522
 rollback_registered_many+0x43b/0xfd0 net/core/dev.c:8177
 rollback_registered+0x109/0x1d0 net/core/dev.c:8242
 unregister_netdevice_queue net/core/dev.c:9289 [inline]
 unregister_netdevice_queue+0x1ee/0x2c0 net/core/dev.c:9282
 unregister_netdevice include/linux/netdevice.h:2658 [inline]
 __tun_detach+0xd5b/0x1000 drivers/net/tun.c:727
 tun_detach drivers/net/tun.c:744 [inline]
 tun_chr_close+0xe0/0x180 drivers/net/tun.c:3443
 __fput+0x2e5/0x8d0 fs/file_table.c:278
 ____fput+0x16/0x20 fs/file_table.c:309
 task_work_run+0x14a/0x1c0 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x90a/0x2fa0 kernel/exit.c:876
 do_group_exit+0x135/0x370 kernel/exit.c:980
 __do_sys_exit_group kernel/exit.c:991 [inline]
 __se_sys_exit_group kernel/exit.c:989 [inline]
 __x64_sys_exit_group+0x44/0x50 kernel/exit.c:989
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x458da9
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffeafc2a6a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 000000000000001c RCX: 0000000000458da9
RDX: 0000000000412a80 RSI: 0000000000a54ef0 RDI: 0000000000000043
RBP: 00000000004be552 R08: 000000000000000c R09: 000000000004c0d1
R10: 0000000002341940 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00007ffeafc2a7f0 R14: 000000000004c065 R15: 00007ffeafc2a800

Fixes: a68886a691 ("net/ipv6: Make from in rt6_info rcu protected")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: David Ahern <dsahern@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-30 23:38:38 -04:00
Jim Mattson
e8ab8d24b4 KVM: nVMX: Fix size checks in vmx_set_nested_state
The size checks in vmx_nested_state are wrong because the calculations
are made based on the size of a pointer to a struct kvm_nested_state
rather than the size of a struct kvm_nested_state.

Reported-by: Felix Wilhelm  <fwilhelm@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Drew Schmitt <dasch@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Fixes: 8fcc4b5923
Cc: stable@ver.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-01 00:43:44 +02:00
Linus Torvalds
f2bc9c908d Merge tag 'fsnotify_for_v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify fix from Jan Kara:
 "A fix of user trigerable NULL pointer dereference syzbot has recently
  spotted.

  The problem was introduced in this merge window so no CC stable is
  needed"

* tag 'fsnotify_for_v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: Fix NULL ptr deref in fanotify_get_fsid()
2019-04-30 15:03:00 -07:00
Paolo Bonzini
6245242d91 KVM/ARM fixes for 5.1, take #2:
- Don't try to emulate timers on userspace access
 - Fix unaligned huge mappings, again
 - Properly reset a vcpu that fails to reset(!)
 - Properly retire pending LPIs on reset
 - Fix computation of emulated CNTP_TVAL
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlzBuBEVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDM8MP/2bpQQCNTZNzEiVUDLk49ADKslHX
 TGo3QA8hCCeZLcjgqe+Qu4GsskaP11Ci2AU8qEzlj2nPFVGos1FV1FpVy9eGh8y/
 dEl0IN3gKBTSpNDM2rBPbdh3UBNU5YA6t59/GMzo1LHu5jU6FOFMKs3eGgpl4Nt6
 d47jIvMLEictoShbZe0s9S6m2MNfg+hP0d44XI6wssF8fSme2xHDXUI3sk96zQlj
 QmI+orDLBIcZLCGfielCEk3f//ZADmsQ2voIxetBWp4lsA8buZNcsMW1rTWdrd1S
 /UMvBBot57qITJV5xDHnU4wcgaAxOp46148jRRF8u3qd5VER9idDWRz/mNayfCsL
 0+KGY4eF8X8K600IuOrKjjKd6Wq1t2+qYbg7GKgxRaVJX6PQ+2T0IyehyvXCyXWB
 s4o2F1NDmJhnLpcep73gjNWa/WUvrMDkj8pnn+Bo3A9516Q7b0/KxPgMFHSLlje5
 dvUqqVCUFi8xqnSzA5fGwx0amUomYHD2YRDUZpX8MjaPaptVzQHeMT60S14Z3pP7
 dvRS5cNI6PLpM9/o/gsM4o3PVF+7z6HPJf93y5tU8yUSPyWp4e7VLPckt+wJyEbo
 EGIVsxEBpOyLo4ByXnrId2d0IE90cXVtA2W8j8H1lIlmmzN57O6k+nm0K1x4bh/i
 qCTV8pngeLtmpLVD
 =Orvj
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-for-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/ARM fixes for 5.1, take #2:

- Don't try to emulate timers on userspace access
- Fix unaligned huge mappings, again
- Properly reset a vcpu that fails to reset(!)
- Properly retire pending LPIs on reset
- Fix computation of emulated CNTP_TVAL
2019-04-30 21:23:06 +02:00
Vitaly Kuznetsov
eba3afde1c KVM: selftests: make hyperv_cpuid test pass on AMD
Enlightened VMCS is only supported on Intel CPUs but the test shouldn't
fail completely.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-30 21:22:16 +02:00
Sean Christopherson
b904cb8dff KVM: lapic: Check for in-kernel LAPIC before deferencing apic pointer
...to avoid dereferencing a null pointer when querying the per-vCPU
timer advance.

Fixes: 39497d7660 ("KVM: lapic: Track lapic timer advance per vCPU")
Reported-by: syzbot+f7e65445a40d3e0e4ebf@syzkaller.appspotmail.com
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-30 21:22:15 +02:00
Paolo Bonzini
76d58e0f07 KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size
If a memory slot's size is not a multiple of 64 pages (256K), then
the KVM_CLEAR_DIRTY_LOG API is unusable: clearing the final 64 pages
either requires the requested page range to go beyond memslot->npages,
or requires log->num_pages to be unaligned, and kvm_clear_dirty_log_protect
requires log->num_pages to be both in range and aligned.

To allow this case, allow log->num_pages not to be a multiple of 64 if
it ends exactly on the last page of the slot.

Reported-by: Peter Xu <peterx@redhat.com>
Fixes: 98938aa8ed ("KVM: validate userspace input in kvm_clear_dirty_log_protect()", 2019-01-02)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-30 21:22:15 +02:00
Vitaly Kuznetsov
0699c64a4b x86/kvm/mmu: reset MMU context when 32-bit guest switches PAE
Commit 47c42e6b41 ("KVM: x86: fix handling of role.cr4_pae and rename it
to 'gpte_size'") introduced a regression: 32-bit PAE guests stopped
working. The issue appears to be: when guest switches (enables) PAE we need
to re-initialize MMU context (set context->root_level, do
reset_rsvds_bits_mask(), ...) but init_kvm_tdp_mmu() doesn't do that
because we threw away is_pae(vcpu) flag from mmu role. Restore it to
kvm_mmu_extended_role (as we now don't need it in base role) to fix
the issue.

Fixes: 47c42e6b41 ("KVM: x86: fix handling of role.cr4_pae and rename it to 'gpte_size'")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-30 21:03:58 +02:00
Sean Christopherson
8764ed55c9 KVM: x86: Whitelist port 0x7e for pre-incrementing %rip
KVM's recent bug fix to update %rip after emulating I/O broke userspace
that relied on the previous behavior of incrementing %rip prior to
exiting to userspace.  When running a Windows XP guest on AMD hardware,
Qemu may patch "OUT 0x7E" instructions in reaction to the OUT itself.
Because KVM's old behavior was to increment %rip before exiting to
userspace to handle the I/O, Qemu manually adjusted %rip to account for
the OUT instruction.

Arguably this is a userspace bug as KVM requires userspace to re-enter
the kernel to complete instruction emulation before taking any other
actions.  That being said, this is a bit of a grey area and breaking
userspace that has worked for many years is bad.

Pre-increment %rip on OUT to port 0x7e before exiting to userspace to
hack around the issue.

Fixes: 45def77ebf ("KVM: x86: update %rip after emulating IO")
Reported-by: Simon Becherer <simon@becherer.de>
Reported-and-tested-by: Iakov Karpov <srid@rkmail.ru>
Reported-by: Gabriele Balducci <balducci@units.it>
Reported-by: Antti Antinoja <reader@fennosys.fi>
Cc: stable@vger.kernel.org
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-30 21:03:42 +02:00
Rafael J. Wysocki
2c2a2fb1e2 Revert "ACPICA: Clear status of GPEs before enabling them"
Revert commit c8b1917c89 ("ACPICA: Clear status of GPEs before
enabling them") that causes problems with Thunderbolt controllers
to occur if a dock device is connected at init time (the xhci_hcd
and thunderbolt modules crash which prevents peripherals connected
through them from working).

Commit c8b1917c89 effectively causes commit ecc1165b8b ("ACPICA:
Dispatch active GPEs at init time") to get undone, so the problem
addressed by commit ecc1165b8b appears again as a result of it.

Fixes: c8b1917c89 ("ACPICA: Clear status of GPEs before enabling them")
Link: https://lore.kernel.org/lkml/s5hy33siofw.wl-tiwai@suse.de/T/#u
Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1132943
Reported-by: Michael Hirmke <opensuse@mike.franken.de>
Reported-by: Takashi Iwai <tiwai@suse.de>
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-04-30 20:03:44 +02:00
Gary Hook
b51ce3744f x86/mm/mem_encrypt: Disable all instrumentation for early SME setup
Enablement of AMD's Secure Memory Encryption feature is determined very
early after start_kernel() is entered. Part of this procedure involves
scanning the command line for the parameter 'mem_encrypt'.

To determine intended state, the function sme_enable() uses library
functions cmdline_find_option() and strncmp(). Their use occurs early
enough such that it cannot be assumed that any instrumentation subsystem
is initialized.

For example, making calls to a KASAN-instrumented function before KASAN
is set up will result in the use of uninitialized memory and a boot
failure.

When AMD's SME support is enabled, conditionally disable instrumentation
of these dependent functions in lib/string.c and arch/x86/lib/cmdline.c.

 [ bp: Get rid of intermediary nostackp var and cleanup whitespace. ]

Fixes: aca20d5462 ("x86/mm: Add support to make use of Secure Memory Encryption")
Reported-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Boris Brezillon <bbrezillon@kernel.org>
Cc: Coly Li <colyli@suse.de>
Cc: "dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: "luto@kernel.org" <luto@kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "mingo@redhat.com" <mingo@redhat.com>
Cc: "peterz@infradead.org" <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/155657657552.7116.18363762932464011367.stgit@sosrh3.amd.com
2019-04-30 17:59:08 +02:00
David S. Miller
34259977f2 wireless-drivers fixes for 5.1
Third set of fixes for 5.1.
 
 iwlwifi
 
 * fix an oops when creating debugfs entries
 
 * fix bug when trying to capture debugging info while in rfkill
 
 * prevent potential uninitialized memory dumps into debugging logs
 
 * fix some initialization parameters for AX210 devices
 
 * fix an oops with non-MSIX devices
 
 * fix an oops when we receive a packet with bogus lengths
 
 * fix a bug that prevented 5350 devices from working
 
 * fix a small merge damage from the previous series
 
 mwifiex
 
 * fig regression with resume on SDIO
 
 ath10k
 
 * fix locking problem with crashdump
 
 * fix warnings during suspend and resume
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJcyGH/AAoJEG4XJFUm622b3AIH/R/Af8W3E69s1jLpbwVpcnKX
 jucK1WePDwgTttUI4cmtpCIlLImTiiMxgWd+ku7tpwz4kMJWY8KOjrvwN3ulNGUr
 R0Vbsyc/CXS41DtfZt8IpckF3ru1omtdmBTfMEermoPbZZNtx1j34Odkn0VtEcjW
 7Ml7u6YfZc/tKgWmIO6afhqMRgnChWGhK76ayCF7/9YM17uSoGw+rdLIM8o20+jE
 VwMbhvNuZILwTZ35mbi1Tnbz8dZr/1J7yMqrmfuZFkpLseWixcQ6zK53A8PTJ0Cf
 6B+fiu7+9mUzNaqzuemlYrZ2PsGXprdjrBNaHcnRrc3+GZlilKcDFQQCPWrpmz8=
 =WiMH
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-for-davem-2019-04-30' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for 5.1

Third set of fixes for 5.1.

iwlwifi

* fix an oops when creating debugfs entries

* fix bug when trying to capture debugging info while in rfkill

* prevent potential uninitialized memory dumps into debugging logs

* fix some initialization parameters for AX210 devices

* fix an oops with non-MSIX devices

* fix an oops when we receive a packet with bogus lengths

* fix a bug that prevented 5350 devices from working

* fix a small merge damage from the previous series

mwifiex

* fig regression with resume on SDIO

ath10k

* fix locking problem with crashdump

* fix warnings during suspend and resume

Also note that this pull conflicts with net-next. And I want to emphasie
that it's really net-next, so when you pull this to net tree it should
go without conflicts. Stephen reported the conflict here:

https://lkml.kernel.org/r/20190429115338.5decb50b@canb.auug.org.au

In iwlwifi oddly commit 154d4899e4 adds the IS_ERR_OR_NULL() in
wireless-drivers but commit c9af7528c3 removes the whole check in
wireless-drivers-next. The fix is easy, just drop the whole check for
mvmvif->dbgfs_dir in iwlwifi/mvm/debugfs-vif.c, it's unneeded anyway.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-30 11:52:17 -04:00
Linus Torvalds
bf3bd966df USB fixes for 5.1-rc8/final
Here are some small USB fixes for a bunch of warnings/errors that the
 syzbot has been finding with it's new-found ability to stress-test the
 USB layer.
 
 All of these are tiny, but fix real issues, and are marked for stable as
 well.  All of these have had lots of testing in linux-next as well.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXMhb1Q8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykHdQCeMK3TfwWjqSSCcgvbTaxhvx8Fwg4AoJRLspLh
 FgVo+vaKf6SFlZdAVSJ8
 =28RC
 -----END PGP SIGNATURE-----

Merge tag 'usb-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes for a bunch of warnings/errors that the
  syzbot has been finding with it's new-found ability to stress-test the
  USB layer.

  All of these are tiny, but fix real issues, and are marked for stable
  as well. All of these have had lots of testing in linux-next as well"

* tag 'usb-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: w1 ds2490: Fix bug caused by improper use of altsetting array
  USB: yurex: Fix protection fault after device removal
  usb: usbip: fix isoc packet num validation in get_pipe
  USB: core: Fix bug caused by duplicate interface PM usage counter
  USB: dummy-hcd: Fix failure to give back unlinked URBs
  USB: core: Fix unterminated string returned by usb_string()
2019-04-30 08:41:22 -07:00
Stefan Bühler
62977281a6 io_uring: remove unnecessary barrier after unsetting IORING_SQ_NEED_WAKEUP
There is no operation to order with afterwards, and removing the flag is
not critical in any way.

There will always be a "race condition" where the application will
trigger IORING_ENTER_SQ_WAKEUP when it isn't actually needed.

Signed-off-by: Stefan Bühler <source@stbuehler.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-30 09:40:02 -06:00