Commit Graph

6947 Commits

Author SHA1 Message Date
Arnd Bergmann
bd99f9a159 arm64: fix undefined reference to 'printk'
The printk symbol was intended as a generic address that is always
exported, however that turned out to be false with CONFIG_PRINTK=n:

ERROR: "printk" [arch/arm64/kernel/arm64-reloc-test.ko] undefined!

This changes the references to memstart_addr, which should be there
regardless of configuration.

Fixes: a257e02579 ("arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-19 18:14:25 +00:00
Dave Martin
af40ff687b arm64: signal: Ensure si_code is valid for all fault signals
Currently, as reported by Eric, an invalid si_code value 0 is
passed in many signals delivered to userspace in response to faults
and other kernel errors.  Typically 0 is passed when the fault is
insufficiently diagnosable or when there does not appear to be any
sensible alternative value to choose.

This appears to violate POSIX, and is intuitively wrong for at
least two reasons arising from the fact that 0 == SI_USER:

 1) si_code is a union selector, and SI_USER (and si_code <= 0 in
    general) implies the existence of a different set of fields
    (siginfo._kill) from that which exists for a fault signal
    (siginfo._sigfault).  However, the code raising the signal
    typically writes only the _sigfault fields, and the _kill
    fields make no sense in this case.

    Thus when userspace sees si_code == 0 (SI_USER) it may
    legitimately inspect fields in the inactive union member _kill
    and obtain garbage as a result.

    There appears to be software in the wild relying on this,
    albeit generally only for printing diagnostic messages.

 2) Software that wants to be robust against spurious signals may
    discard signals where si_code == SI_USER (or <= 0), or may
    filter such signals based on the si_uid and si_pid fields of
    siginfo._sigkill.  In the case of fault signals, this means
    that important (and usually fatal) error conditions may be
    silently ignored.

In practice, many of the faults for which arm64 passes si_code == 0
are undiagnosable conditions such as exceptions with syndrome
values in ESR_ELx to which the architecture does not yet assign any
meaning, or conditions indicative of a bug or error in the kernel
or system and thus that are unrecoverable and should never occur in
normal operation.

The approach taken in this patch is to translate all such
undiagnosable or "impossible" synchronous fault conditions to
SIGKILL, since these are at least probably localisable to a single
process.  Some of these conditions should really result in a kernel
panic, but due to the lack of diagnostic information it is
difficult to be certain: this patch does not add any calls to
panic(), but this could change later if justified.

Although si_code will not reach userspace in the case of SIGKILL,
it is still desirable to pass a nonzero value so that the common
siginfo handling code can detect incorrect use of si_code == 0
without false positives.  In this case the si_code dependent
siginfo fields will not be correctly initialised, but since they
are not passed to userspace I deem this not to matter.

A few faults can reasonably occur in realistic userspace scenarios,
and _should_ raise a regular, handleable (but perhaps not
ignorable/blockable) signal: for these, this patch attempts to
choose a suitable standard si_code value for the raised signal in
each case instead of 0.

arm64 was the only arch to define a BUS_FIXME code, so after this
patch nobody defines it.  This patch therefore also removes the
relevant code from siginfo_layout().

Cc: James Morse <james.morse@arm.com>
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-09 13:58:36 +00:00
Shanker Donthineni
6ae4b6e057 arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
The DCache clean & ICache invalidation requirements for instructions
to be data coherence are discoverable through new fields in CTR_EL0.
The following two control bits DIC and IDC were defined for this
purpose. No need to perform point of unification cache maintenance
operations from software on systems where CPU caches are transparent.

This patch optimize the three functions __flush_cache_user_range(),
clean_dcache_area_pou() and invalidate_icache_range() if the hardware
reports CTR_EL0.IDC and/or CTR_EL0.IDC. Basically it skips the two
instructions 'DC CVAU' and 'IC IVAU', and the associated loop logic
in order to avoid the unnecessary overhead.

CTR_EL0.DIC: Instruction cache invalidation requirements for
 instruction to data coherence. The meaning of this bit[29].
  0: Instruction cache invalidation to the point of unification
     is required for instruction to data coherence.
  1: Instruction cache cleaning to the point of unification is
      not required for instruction to data coherence.

CTR_EL0.IDC: Data cache clean requirements for instruction to data
 coherence. The meaning of this bit[28].
  0: Data cache clean to the point of unification is required for
     instruction to data coherence, unless CLIDR_EL1.LoC == 0b000
     or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000).
  1: Data cache clean to the point of unification is not required
     for instruction to data coherence.

Co-authored-by: Philip Elcan <pelcan@codeaurora.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-09 13:57:57 +00:00
Ard Biesheuvel
ca79acca27 arm64/kernel: enable A53 erratum #8434319 handling at runtime
Omit patching of ADRP instruction at module load time if the current
CPUs are not susceptible to the erratum.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: Drop duplicate initialisation of .def_scope field]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-09 13:23:09 +00:00
Ard Biesheuvel
e8002e02ab arm64/errata: add REVIDR handling to framework
In some cases, core variants that are affected by a certain erratum
also exist in versions that have the erratum fixed, and this fact is
recorded in a dedicated bit in system register REVIDR_EL1.

Since the architecture does not require that a certain bit retains
its meaning across different variants of the same model, each such
REVIDR bit is tightly coupled to a certain revision/variant value,
and so we need a list of revidr_mask/midr pairs to carry this
information.

So add the struct member and the associated macros and handling to
allow REVIDR fixes to be taken into account.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-09 13:23:08 +00:00
Ard Biesheuvel
a257e02579 arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419
Working around Cortex-A53 erratum #843419 involves special handling of
ADRP instructions that end up in the last two instruction slots of a
4k page, or whose output register gets overwritten without having been
read. (Note that the latter instruction sequence is never emitted by
a properly functioning compiler, which is why it is disregarded by the
handling of the same erratum in the bfd.ld linker which we rely on for
the core kernel)

Normally, this gets taken care of by the linker, which can spot such
sequences at final link time, and insert a veneer if the ADRP ends up
at a vulnerable offset. However, linux kernel modules are partially
linked ELF objects, and so there is no 'final link time' other than the
runtime loading of the module, at which time all the static relocations
are resolved.

For this reason, we have implemented the #843419 workaround for modules
by avoiding ADRP instructions altogether, by using the large C model,
and by passing -mpc-relative-literal-loads to recent versions of GCC
that may emit adrp/ldr pairs to perform literal loads. However, this
workaround forces us to keep literal data mixed with the instructions
in the executable .text segment, and literal data may inadvertently
turn into an exploitable speculative gadget depending on the relative
offsets of arbitrary symbols.

So let's reimplement this workaround in a way that allows us to switch
back to the small C model, and to drop the -mpc-relative-literal-loads
GCC switch, by patching affected ADRP instructions at runtime:
- ADRP instructions that do not appear at 4k relative offset 0xff8 or
  0xffc are ignored
- ADRP instructions that are within 1 MB of their target symbol are
  converted into ADR instructions
- remaining ADRP instructions are redirected via a veneer that performs
  the load using an unaffected movn/movk sequence.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: tidied up ADRP -> ADR instruction patching.]
[will: use ULL suffix for 64-bit immediate]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-09 13:21:53 +00:00
Ard Biesheuvel
f2b9ba871b arm64/kernel: kaslr: reduce module randomization range to 4 GB
We currently have to rely on the GCC large code model for KASLR for
two distinct but related reasons:
- if we enable full randomization, modules will be loaded very far away
  from the core kernel, where they are out of range for ADRP instructions,
- even without full randomization, the fact that the 128 MB module region
  is now no longer fully reserved for kernel modules means that there is
  a very low likelihood that the normal bottom-up allocation of other
  vmalloc regions may collide, and use up the range for other things.

Large model code is suboptimal, given that each symbol reference involves
a literal load that goes through the D-cache, reducing cache utilization.
But more importantly, literals are not instructions but part of .text
nonetheless, and hence mapped with executable permissions.

So let's get rid of our dependency on the large model for KASLR, by:
- reducing the full randomization range to 4 GB, thereby ensuring that
  ADRP references between modules and the kernel are always in range,
- reduce the spillover range to 4 GB as well, so that we fallback to a
  region that is still guaranteed to be in range
- move the randomization window of the core kernel to the middle of the
  VMALLOC space

Note that KASAN always uses the module region outside of the vmalloc space,
so keep the kernel close to that if KASAN is enabled.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-08 13:49:26 +00:00
Ard Biesheuvel
5e8307b9c6 arm64: module: don't BUG when exceeding preallocated PLT count
When PLTs are emitted at relocation time, we really should not exceed
the number that we counted when parsing the relocation tables, and so
currently, we BUG() on this condition. However, even though this is a
clear bug in this particular piece of code, we can easily recover by
failing to load the module.

So instead, return 0 from module_emit_plt_entry() if this condition
occurs, which is not a valid kernel address, and can hence serve as
a flag value that makes the relocation routine bail out.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-08 13:49:26 +00:00
Will Deacon
e03e61c317 arm64: kaslr: Set TCR_EL1.NFD1 when CONFIG_RANDOMIZE_BASE=y
TCR_EL1.NFD1 was allocated by SVE and ensures that fault-surpressing SVE
memory accesses (e.g. speculative accesses from a first-fault gather load)
which translate via TTBR1_EL1 result in a translation fault if they
miss in the TLB when executed from EL0. This mitigates some timing attacks
against KASLR, where the kernel address space could otherwise be probed
efficiently using the FFR in conjunction with suppressed faults on SVE
loads.

Cc: Dave Martin <Dave.Martin@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:34 +00:00
Douglas Anderson
24153c03d4 arm64/debug: Fix registers on sleeping tasks
This is the equivalent of commit 001bf455d2 ("ARM: 8428/1: kgdb: Fix
registers on sleeping tasks") but for arm64.  Nuff said.

...well, perhaps I could also add that task_pt_regs are userspace
registers and that's not what kgdb is supposed to be reporting.  We're
supposed to be reporting kernel registers.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:34 +00:00
Andrey Konovalov
9597e74396 kasan, arm64: clean up KASAN_SHADOW_SCALE_SHIFT usage
This is a follow up patch to the series I sent recently that cleans up
KASAN_SHADOW_SCALE_SHIFT usage (which value was hardcoded and scattered
all over the code). This fixes the one place that I forgot to fix.

The change is purely aesthetical, instead of hardcoding the value for
KASAN_SHADOW_SCALE_SHIFT in arch/arm64/Makefile, an appropriate variable
is declared and used.

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:33 +00:00
Catalin Marinas
1f85b42a69 arm64: Revert L1_CACHE_SHIFT back to 6 (64-byte cache line size)
Commit 9730348075 ("arm64: Increase the max granular size") increased
the cache line size to 128 to match Cavium ThunderX, apparently for some
performance benefit which could not be confirmed. This change, however,
has an impact on the network packets allocation in certain
circumstances, requiring slightly over a 4K page with a significant
performance degradation.

This patch reverts L1_CACHE_SHIFT back to 6 (64-byte cache line) while
keeping ARCH_DMA_MINALIGN at 128. The cache_line_size() function was
changed to default to ARCH_DMA_MINALIGN in the absence of a meaningful
CTR_EL0.CWG bit field.

In addition, if a system with ARCH_DMA_MINALIGN < CTR_EL0.CWG is
detected, the kernel will force swiotlb bounce buffering for all
non-coherent devices since DMA cache maintenance on sub-CWG ranges is
not safe, leading to data corruption.

Cc: Tirumalesh Chalamarla <tchalamarla@cavium.com>
Cc: Timur Tabi <timur@codeaurora.org>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:32 +00:00
Will Deacon
6b24442d68 arm64: lse: Pass -fomit-frame-pointer to out-of-line ll/sc atomics
In cases where x30 is used as a temporary in the out-of-line ll/sc atomics
(e.g. atomic_fetch_add), the compiler tends to put out a full stackframe,
which included pointing the x29 at the new frame.

Since these things aren't traceable anyway, we can pass -fomit-frame-pointer
to reduce the work when spilling. Since this is incompatible with -pg, we
also remove that from the CFLAGS for this file.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:32 +00:00
Will Deacon
4e829b6735 arm64: Use arm64_force_sig_info instead of force_sig_info
Using arm64_force_sig_info means that printing messages about unhandled
signals is dealt with for us, so use that in preference to force_sig_info
and remove any homebrew printing code.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:32 +00:00
Will Deacon
a26731d9d1 arm64: Move show_unhandled_signals_ratelimited into traps.c
show_unhandled_signals_ratelimited is only called in traps.c, so move it
out of its macro in the dreaded system_misc.h and into a static function
in traps.c

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:31 +00:00
Will Deacon
f71016a8a8 arm64: signal: Call arm64_notify_segfault when failing to deliver signal
If we fail to deliver a signal due to taking an unhandled fault on the
stackframe, we can call arm64_notify_segfault to deliver a SEGV can deal
with printing any unhandled signal messages for us, rather than roll our
own printing code.

A side-effect of this change is that we now deliver the frame address
in si_addr along with an si_code of SEGV_{ACC,MAP}ERR, rather than an
si_addr of 0 and an si_code of SI_KERNEL as before.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:25 +00:00
Will Deacon
92ff0674f5 arm64: mm: Rework unhandled user pagefaults to call arm64_force_sig_info
Reporting unhandled user pagefaults via arm64_force_sig_info means
that __do_user_fault can be drastically simplified, since it no longer
has to worry about printing the fault information and can consequently
just take the siginfo as a parameter.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:24 +00:00
Will Deacon
1049c30871 arm64: Pass user fault info to arm64_notify_die instead of printing it
There's no need for callers of arm64_notify_die to print information
about user faults. Instead, they can pass a string to arm64_notify_die
which will be printed subject to show_unhandled_signals.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:24 +00:00
Will Deacon
15b67321e7 arm64: signal: Don't print anything directly in force_signal_inject
arm64_notify_die deals with printing out information regarding unhandled
signals, so there's no need to roll our own code here.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:23 +00:00
Will Deacon
a1ece8216c arm64: Introduce arm64_force_sig_info and hook up in arm64_notify_die
In preparation for consolidating our handling of printing unhandled
signals, introduce a wrapper around force_sig_info which can act as
the canonical place for dealing with show_unhandled_signals.

Initially, we just hook this up to arm64_notify_die.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:23 +00:00
Will Deacon
a7e6f1ca90 arm64: signal: Force SIGKILL for unknown signals in force_signal_inject
For signals other than SIGKILL or those with siginfo_layout(signal, code)
== SIL_FAULT then force_signal_inject does not initialise the siginfo_t
properly. Since the signal number is determined solely by the caller,
simply WARN on unknown signals and force to SIGKILL.

Reported-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:23 +00:00
Will Deacon
2c9120f3a8 arm64: signal: Make force_signal_inject more robust
force_signal_inject is a little flakey:

  * It only knows about SIGILL and SIGSEGV, so can potentially deliver
    other signals based on a partially initialised siginfo_t

  * It sets si_addr to point at the PC for SIGSEGV

  * It always operates on current, so doesn't need the regs argument

This patch fixes these issues by always assigning the si_addr field to
the address parameter of the function and updates the callers (including
those that indirectly call via arm64_notify_segfault) accordingly.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 18:52:22 +00:00
Kees Cook
e0f6429dc1 arm64: cpufeature: Remove redundant "feature" in reports
The word "feature" is repeated in the CPU features reporting. This drops it
for improved readability.

Before (redundant "feature" word):

 SMP: Total of 4 processors activated.
 CPU features: detected feature: 32-bit EL0 Support
 CPU features: detected feature: Kernel page table isolation (KPTI)
 CPU features: emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching
 CPU: All CPU(s) started at EL2

After:

 SMP: Total of 4 processors activated.
 CPU features: detected: 32-bit EL0 Support
 CPU features: detected: Kernel page table isolation (KPTI)
 CPU features: emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching
 CPU: All CPU(s) started at EL2

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-05 12:06:44 +00:00
Kees Cook
2e6f549fe9 arm64: cpufeature: Relocate PAN emulation report
The PAN emulation notification was only happening for non-boot CPUs
if CPU capabilities had already been configured. This seems to be the
wrong place, as it's system-wide and isn't attached to capabilities,
so its reporting didn't normally happen. Instead, report it once from
the boot CPU.

Before (missing PAN emulation report):

 SMP: Total of 4 processors activated.
 CPU features: detected feature: 32-bit EL0 Support
 CPU features: detected feature: Kernel page table isolation (KPTI)
 CPU: All CPU(s) started at EL2

After:

 SMP: Total of 4 processors activated.
 CPU features: detected feature: 32-bit EL0 Support
 CPU features: detected feature: Kernel page table isolation (KPTI)
 CPU features: emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching
 CPU: All CPU(s) started at EL2

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-05 12:06:43 +00:00
Ard Biesheuvel
6141ac1c27 arm64/kernel: kaslr: drop special Image placement logic
Now that the early kernel mapping logic can tolerate placements of
Image that cross swapper table boundaries, we can remove the logic
that adjusts the offset if the dice roll produced an offset that
puts the kernel right on top of one.

Reviewed-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-05 12:06:43 +00:00
Michael Weiser
532826f371 arm64: Mirror arm for unimplemented compat syscalls
Mirror arm behaviour for unimplemented syscalls: Below 2048 return
-ENOSYS, above 2048 raise SIGILL.

Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
[will: Tweak die string to identify as compat syscall]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-05 12:06:43 +00:00
Linus Torvalds
297ea1b7f7 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull cleanup patchlet from Thomas Gleixner:
 "A single commit removing a bunch of bogus double semicolons all over
  the tree"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  treewide/trivial: Remove ';;$' typo noise
2018-02-25 16:27:51 -08:00
Linus Torvalds
9cb9c07d6b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix TTL offset calculation in mac80211 mesh code, from Peter Oh.

 2) Fix races with procfs in ipt_CLUSTERIP, from Cong Wang.

 3) Memory leak fix in lpm_trie BPF map code, from Yonghong Song.

 4) Need to use GFP_ATOMIC in BPF cpumap allocations, from Jason Wang.

 5) Fix potential deadlocks in netfilter getsockopt() code paths, from
    Paolo Abeni.

 6) Netfilter stackpointer size checks really are needed to validate
    user input, from Florian Westphal.

 7) Missing timer init in x_tables, from Paolo Abeni.

 8) Don't use WQ_MEM_RECLAIM in mac80211 hwsim, from Johannes Berg.

 9) When an ibmvnic device is brought down then back up again, it can be
    sent queue entries from a previous session, handle this properly
    instead of crashing. From Thomas Falcon.

10) Fix TCP checksum on LRO buffers in mlx5e, from Gal Pressman.

11) When we are dumping filters in cls_api, the output SKB is empty, and
    the filter we are dumping is too large for the space in the SKB, we
    should return -EMSGSIZE like other netlink dump operations do.
    Otherwise userland has no signal that is needs to increase the size
    of its read buffer. From Roman Kapl.

12) Several XDP fixes for virtio_net, from Jesper Dangaard Brouer.

13) Module refcount leak in netlink when a dump start fails, from Jason
    Donenfeld.

14) Handle sub-optimal GSO sizes better in TCP BBR congestion control,
    from Eric Dumazet.

15) Releasing bpf per-cpu arraymaps can take a long time, add a
    condtional scheduling point. From Eric Dumazet.

16) Implement retpolines for tail calls in x64 and arm64 bpf JITs. From
    Daniel Borkmann.

17) Fix page leak in gianfar driver, from Andy Spencer.

18) Missed clearing of estimator scratch buffer, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
  net_sched: gen_estimator: fix broken estimators based on percpu stats
  gianfar: simplify FCS handling and fix memory leak
  ipv6 sit: work around bogus gcc-8 -Wrestrict warning
  macvlan: fix use-after-free in macvlan_common_newlink()
  bpf, arm64: fix out of bounds access in tail call
  bpf, x64: implement retpoline for tail call
  rxrpc: Fix send in rxrpc_send_data_packet()
  net: aquantia: Fix error handling in aq_pci_probe()
  bpf: fix rcu lockdep warning for lpm_trie map_free callback
  bpf: add schedule points in percpu arrays management
  regulatory: add NUL to request alpha2
  ibmvnic: Fix early release of login buffer
  net/smc9194: Remove bogus CONFIG_MAC reference
  net: ipv4: Set addr_type in hash_keys for forwarded case
  tcp_bbr: better deal with suboptimal GSO
  smsc75xx: fix smsc75xx_set_features()
  netlink: put module reference if dump start fails
  selftests/bpf/test_maps: exit child process without error in ENOMEM case
  selftests/bpf: update gitignore with test_libbpf_open
  selftests/bpf: tcpbpf_kern: use in6_* macros from glibc
  ..
2018-02-23 15:14:17 -08:00
Pratyush Anand
9f416319f4 arm64: fix unwind_frame() for filtered out fn for function graph tracing
do_task_stat() calls get_wchan(), which further does unwind_frame().
unwind_frame() restores frame->pc to original value in case function
graph tracer has modified a return address (LR) in a stack frame to hook
a function return. However, if function graph tracer has hit a filtered
function, then we can't unwind it as ftrace_push_return_trace() has
biased the index(frame->graph) with a 'huge negative'
offset(-FTRACE_NOTRACE_DEPTH).

Moreover, arm64 stack walker defines index(frame->graph) as unsigned
int, which can not compare a -ve number.

Similar problem we can have with calling of walk_stackframe() from
save_stack_trace_tsk() or dump_backtrace().

This patch fixes unwind_frame() to test the index for -ve value and
restore index accordingly before we can restore frame->pc.

Reproducer:

cd /sys/kernel/debug/tracing/
echo schedule > set_graph_notrace
echo 1 > options/display-graph
echo wakeup > current_tracer
ps -ef | grep -i agent

Above commands result in:
Unable to handle kernel paging request at virtual address ffff801bd3d1e000
pgd = ffff8003cbe97c00
[ffff801bd3d1e000] *pgd=0000000000000000, *pud=0000000000000000
Internal error: Oops: 96000006 [#1] SMP
[...]
CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ #33
[...]
task: ffff8003c21ba000 task.stack: ffff8003cc6c0000
PC is at unwind_frame+0x12c/0x180
LR is at get_wchan+0xd4/0x134
pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145
sp : ffff8003cc6c3ab0
x29: ffff8003cc6c3ab0 x28: 0000000000000001
x27: 0000000000000026 x26: 0000000000000026
x25: 00000000000012d8 x24: 0000000000000000
x23: ffff8003c1c04000 x22: ffff000008c83000
x21: ffff8003c1c00000 x20: 000000000000000f
x19: ffff8003c1bc0000 x18: 0000fffffc593690
x17: 0000000000000000 x16: 0000000000000001
x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f
x13: 0000000000000001 x12: 0000000000000000
x11: 00000000e8f4883e x10: 0000000154f47ec8
x9 : 0000000070f367c0 x8 : 0000000000000000
x7 : 00008003f7290000 x6 : 0000000000000018
x5 : 0000000000000000 x4 : ffff8003c1c03cb0
x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000
x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000

Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000)
Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000)
[...]
[<ffff00000808892c>] unwind_frame+0x12c/0x180
[<ffff000008305008>] do_task_stat+0x864/0x870
[<ffff000008305c44>] proc_tgid_stat+0x3c/0x48
[<ffff0000082fde0c>] proc_single_show+0x5c/0xb8
[<ffff0000082b27e0>] seq_read+0x160/0x414
[<ffff000008289e6c>] __vfs_read+0x58/0x164
[<ffff00000828b164>] vfs_read+0x88/0x144
[<ffff00000828c2e8>] SyS_read+0x60/0xc0
[<ffff0000080834a0>] __sys_trace_return+0x0/0x4

Fixes: 20380bb390 (arm64: ftrace: fix a stack tracer's output under function graph tracer)
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
[catalin.marinas@arm.com: replace WARN_ON with WARN_ON_ONCE]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-23 13:46:38 +00:00
Daniel Borkmann
16338a9b3a bpf, arm64: fix out of bounds access in tail call
I recently noticed a crash on arm64 when feeding a bogus index
into BPF tail call helper. The crash would not occur when the
interpreter is used, but only in case of JIT. Output looks as
follows:

  [  347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510
  [...]
  [  347.043065] [fffb850e96492510] address between user and kernel address ranges
  [  347.050205] Internal error: Oops: 96000004 [#1] SMP
  [...]
  [  347.190829] x13: 0000000000000000 x12: 0000000000000000
  [  347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10
  [  347.201427] x9 : 0000000000000000 x8 : 0000000000000000
  [  347.206726] x7 : 0000000000000000 x6 : 001c991738000000
  [  347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a
  [  347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500
  [  347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500
  [  347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61)
  [  347.235221] Call trace:
  [  347.237656]  0xffff000002f3a4fc
  [  347.240784]  bpf_test_run+0x78/0xf8
  [  347.244260]  bpf_prog_test_run_skb+0x148/0x230
  [  347.248694]  SyS_bpf+0x77c/0x1110
  [  347.251999]  el0_svc_naked+0x30/0x34
  [  347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b)
  [...]

In this case the index used in BPF r3 is the same as in r1
at the time of the call, meaning we fed a pointer as index;
here, it had the value 0xffff808fd7cf0500 which sits in x2.

While I found tail calls to be working in general (also for
hitting the error cases), I noticed the following in the code
emission:

  # bpftool p d j i 988
  [...]
  38:   ldr     w10, [x1,x10]
  3c:   cmp     w2, w10
  40:   b.ge    0x000000000000007c              <-- signed cmp
  44:   mov     x10, #0x20                      // #32
  48:   cmp     x26, x10
  4c:   b.gt    0x000000000000007c
  50:   add     x26, x26, #0x1
  54:   mov     x10, #0x110                     // #272
  58:   add     x10, x1, x10
  5c:   lsl     x11, x2, #3
  60:   ldr     x11, [x10,x11]                  <-- faulting insn (f86b694b)
  64:   cbz     x11, 0x000000000000007c
  [...]

Meaning, the tests passed because commit ddb55992b0 ("arm64:
bpf: implement bpf_tail_call() helper") was using signed compares
instead of unsigned which as a result had the test wrongly passing.

Change this but also the tail call count test both into unsigned
and cap the index as u32. Latter we did as well in 90caccdd8c
("bpf: fix bpf_tail_call() x64 JIT") and is needed in addition here,
too. Tested on HiSilicon Hi1616.

Result after patch:

  # bpftool p d j i 268
  [...]
  38:	ldr	w10, [x1,x10]
  3c:	add	w2, w2, #0x0
  40:	cmp	w2, w10
  44:	b.cs	0x0000000000000080
  48:	mov	x10, #0x20                  	// #32
  4c:	cmp	x26, x10
  50:	b.hi	0x0000000000000080
  54:	add	x26, x26, #0x1
  58:	mov	x10, #0x110                 	// #272
  5c:	add	x10, x1, x10
  60:	lsl	x11, x2, #3
  64:	ldr	x11, [x10,x11]
  68:	cbz	x11, 0x0000000000000080
  [...]

Fixes: ddb55992b0 ("arm64: bpf: implement bpf_tail_call() helper")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-02-22 16:06:28 -08:00
Will Deacon
15122ee2c5 arm64: Enforce BBM for huge IO/VMAP mappings
ioremap_page_range doesn't honour break-before-make and attempts to put
down huge mappings (using p*d_set_huge) over the top of pre-existing
table entries. This leads to us leaking page table memory and also gives
rise to TLB conflicts and spurious aborts, which have been seen in
practice on Cortex-A75.

Until this has been resolved, refuse to put block mappings when the
existing entry is found to be present.

Fixes: 324420bf91 ("arm64: add support for ioremap() block mappings")
Reported-by: Hanjun Guo <hanjun.guo@linaro.org>
Reported-by: Lei Li <lious.lilei@hisilicon.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-22 11:25:53 +00:00
Ingo Molnar
ed7158bae4 treewide/trivial: Remove ';;$' typo noise
On lkml suggestions were made to split up such trivial typo fixes into per subsystem
patches:

  --- a/arch/x86/boot/compressed/eboot.c
  +++ b/arch/x86/boot/compressed/eboot.c
  @@ -439,7 +439,7 @@ setup_uga32(void **uga_handle, unsigned long size, u32 *width, u32 *height)
          struct efi_uga_draw_protocol *uga = NULL, *first_uga;
          efi_guid_t uga_proto = EFI_UGA_PROTOCOL_GUID;
          unsigned long nr_ugas;
  -       u32 *handles = (u32 *)uga_handle;;
  +       u32 *handles = (u32 *)uga_handle;
          efi_status_t status = EFI_INVALID_PARAMETER;
          int i;

This patch is the result of the following script:

  $ sed -i 's/;;$/;/g' $(git grep -E ';;$'  | grep "\.[ch]:"  | grep -vwE 'for|ia64' | cut -d: -f1 | sort | uniq)

... followed by manual review to make sure it's all good.

Splitting this up is just crazy talk, let's get over with this and just do it.

Reported-by: Pavel Machek <pavel@ucw.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-22 10:59:33 +01:00
Mark Rutland
0331365edb arm64: perf: correct PMUVer probing
The ID_AA64DFR0_EL1.PMUVer field doesn't follow the usual ID registers
scheme. While value 0xf indicates a non-architected PMU is implemented,
values 0x1 to 0xe indicate an increasingly featureful architected PMU,
as if the field were unsigned.

For more details, see ARM DDI 0487C.a, D10.1.4, "Alternative ID scheme
used for the Performance Monitors Extension version".

Currently, we treat the field as signed, and erroneously bail out for
values 0x8 to 0xe. Let's correct that.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-20 11:34:54 +00:00
Will Deacon
a06f818a70 arm64: __show_regs: Only resolve kernel symbols when running at EL1
__show_regs pretty prints PC and LR by attempting to map them to kernel
function names to improve the utility of crash reports. Unfortunately,
this mapping is applied even when the pt_regs corresponds to user mode,
resulting in a KASLR oracle.

Avoid this issue by only looking up the function symbols when the register
state indicates that we're actually running at EL1.

Cc: <stable@vger.kernel.org>
Reported-by: NCSC Security <security@ncsc.gov.uk>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-19 17:07:12 +00:00
Michael Weiser
1962682d2b arm64: Remove unimplemented syscall log message
Stop printing a (ratelimited) kernel message for each instance of an
unimplemented syscall being called. Userland making an unimplemented
syscall is not necessarily misbehaviour and to be expected with a
current userland running on an older kernel. Also, the current message
looks scary to users but does not actually indicate a real problem nor
help them narrow down the cause. Just rely on sys_ni_syscall() to return
-ENOSYS.

Cc: <stable@vger.kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-19 17:05:53 +00:00
Michael Weiser
5ee39a71fd arm64: Disable unhandled signal log messages by default
aarch64 unhandled signal kernel messages are very verbose, suggesting
them to be more of a debugging aid:

sigsegv[33]: unhandled level 2 translation fault (11) at 0x00000000, esr
0x92000046, in sigsegv[400000+71000]
CPU: 1 PID: 33 Comm: sigsegv Tainted: G        W        4.15.0-rc3+ #3
Hardware name: linux,dummy-virt (DT)
pstate: 60000000 (nZCv daif -PAN -UAO)
pc : 0x4003f4
lr : 0x4006bc
sp : 0000fffffe94a060
x29: 0000fffffe94a070 x28: 0000000000000000
x27: 0000000000000000 x26: 0000000000000000
x25: 0000000000000000 x24: 00000000004001b0
x23: 0000000000486ac8 x22: 00000000004001c8
x21: 0000000000000000 x20: 0000000000400be8
x19: 0000000000400b30 x18: 0000000000484728
x17: 000000000865ffc8 x16: 000000000000270f
x15: 00000000000000b0 x14: 0000000000000002
x13: 0000000000000001 x12: 0000000000000000
x11: 0000000000000000 x10: 0008000020008008
x9 : 000000000000000f x8 : ffffffffffffffff
x7 : 0004000000000000 x6 : ffffffffffffffff
x5 : 0000000000000000 x4 : 0000000000000000
x3 : 00000000004003e4 x2 : 0000fffffe94a1e8
x1 : 000000000000000a x0 : 0000000000000000

Disable them by default, so they can be enabled using
/proc/sys/debug/exception-trace.

Cc: <stable@vger.kernel.org>
Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-19 17:05:26 +00:00
Will Deacon
be68a8aaf9 arm64: cpufeature: Fix CTR_EL0 field definitions
Our field definitions for CTR_EL0 suffer from a number of problems:

  - The IDC and DIC fields are missing, which causes us to enable CTR
    trapping on CPUs with either of these returning non-zero values.

  - The ERG is FTR_LOWER_SAFE, whereas it should be treated like CWG as
    FTR_HIGHER_SAFE so that applications can use it to avoid false sharing.

  - [nit] A RES1 field is described as "RAO"

This patch updates the CTR_EL0 field definitions to fix these issues.

Cc: <stable@vger.kernel.org>
Cc: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-19 17:02:09 +00:00
Robin Murphy
9085b34d0e arm64: uaccess: Formalise types for access_ok()
In converting __range_ok() into a static inline, I inadvertently made
it more type-safe, but without considering the ordering of the relevant
conversions. This leads to quite a lot of Sparse noise about the fact
that we use __chk_user_ptr() after addr has already been converted from
a user pointer to an unsigned long.

Rather than just adding another cast for the sake of shutting Sparse up,
it seems reasonable to rework the types to make logical sense (although
the resulting codegen for __range_ok() remains identical). The only
callers this affects directly are our compat traps where the inferred
"user-pointer-ness" of a register value now warrants explicit casting.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-19 13:59:58 +00:00
Bhupesh Sharma
04c4927359 arm64: Fix compilation error while accessing MPIDR_HWID_BITMASK from .S files
Since commit e1a50de378 (arm64: cputype: Silence Sparse warnings),
compilation of arm64 architecture is broken with the following error
messages:

  AR      arch/arm64/kernel/built-in.o
  arch/arm64/kernel/head.S: Assembler messages:
  arch/arm64/kernel/head.S:677: Error: found 'L', expected: ')'
  arch/arm64/kernel/head.S:677: Error: found 'L', expected: ')'
  arch/arm64/kernel/head.S:677: Error: found 'L', expected: ')'
  arch/arm64/kernel/head.S:677: Error: junk at end of line, first
  unrecognized character is `L'
  arch/arm64/kernel/head.S:677: Error: unexpected characters following
  instruction at operand 2 -- `movz x1,:abs_g1_s:0xff00ffffffUL'
  arch/arm64/kernel/head.S:677: Error: unexpected characters following
  instruction at operand 2 -- `movk x1,:abs_g0_nc:0xff00ffffffUL'

This patch fixes the same by using the UL() macro correctly for
assigning the MPIDR_HWID_BITMASK macro value.

Fixes: e1a50de378 ("arm64: cputype: Silence Sparse warnings")
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-19 12:13:29 +00:00
Robin Murphy
e1a50de378 arm64: cputype: Silence Sparse warnings
Sparse makes a fair bit of noise about our MPIDR mask being implicitly
long - let's explicitly describe it as such rather than just relying on
the value forcing automatic promotion.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-17 08:37:05 +00:00
Will Deacon
20a004e7b0 arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables
In many cases, page tables can be accessed concurrently by either another
CPU (due to things like fast gup) or by the hardware page table walker
itself, which may set access/dirty bits. In such cases, it is important
to use READ_ONCE/WRITE_ONCE when accessing page table entries so that
entries cannot be torn, merged or subject to apparent loss of coherence
due to compiler transformations.

Whilst there are some scenarios where this cannot happen (e.g. pinned
kernel mappings for the linear region), the overhead of using READ_ONCE
/WRITE_ONCE everywhere is minimal and makes the code an awful lot easier
to reason about. This patch consistently uses these macros in the arch
code, as well as explicitly namespacing pointers to page table entries
from the entries themselves by using adopting a 'p' suffix for the former
(as is sometimes used elsewhere in the kernel source).

Tested-by: Yury Norov <ynorov@caviumnetworks.com>
Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-16 18:13:57 +00:00
Will Deacon
2ce77f6d8a arm64: proc: Set PTE_NG for table entries to avoid traversing them twice
When KASAN is enabled, the swapper page table contains many identical
mappings of the zero page, which can lead to a stall during boot whilst
the G -> nG code continually walks the same page table entries looking
for global mappings.

This patch sets the nG bit (bit 11, which is IGNORED) in table entries
after processing the subtree so we can easily skip them if we see them
a second time.

Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-14 18:58:20 +00:00
Shanker Donthineni
16e574d762 arm64: Add missing Falkor part number for branch predictor hardening
References to CPU part number MIDR_QCOM_FALKOR were dropped from the
mailing list patch due to mainline/arm64 branch dependency. So this
patch adds the missing part number.

Fixes: ec82b567a7 ("arm64: Implement branch predictor hardening for Falkor")
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-12 11:28:45 +00:00
Linus Torvalds
15303ba5d1 KVM changes for 4.16
ARM:
 - Include icache invalidation optimizations, improving VM startup time
 
 - Support for forwarded level-triggered interrupts, improving
   performance for timers and passthrough platform devices
 
 - A small fix for power-management notifiers, and some cosmetic changes
 
 PPC:
 - Add MMIO emulation for vector loads and stores
 
 - Allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
   requiring the complex thread synchronization of older CPU versions
 
 - Improve the handling of escalation interrupts with the XIVE interrupt
   controller
 
 - Support decrement register migration
 
 - Various cleanups and bugfixes.
 
 s390:
 - Cornelia Huck passed maintainership to Janosch Frank
 
 - Exitless interrupts for emulated devices
 
 - Cleanup of cpuflag handling
 
 - kvm_stat counter improvements
 
 - VSIE improvements
 
 - mm cleanup
 
 x86:
 - Hypervisor part of SEV
 
 - UMIP, RDPID, and MSR_SMI_COUNT emulation
 
 - Paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit
 
 - Allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more AVX512
   features
 
 - Show vcpu id in its anonymous inode name
 
 - Many fixes and cleanups
 
 - Per-VCPU MSR bitmaps (already merged through x86/pti branch)
 
 - Stable KVM clock when nesting on Hyper-V (merged through x86/hyperv)
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJafvMtAAoJEED/6hsPKofo6YcH/Rzf2RmshrWaC3q82yfIV0Qz
 Z8N8yJHSaSdc3Jo6cmiVj0zelwAxdQcyjwlT7vxt5SL2yML+/Q0st9Hc3EgGGXPm
 Il99eJEl+2MYpZgYZqV8ff3mHS5s5Jms+7BITAeh6Rgt+DyNbykEAvzt+MCHK9cP
 xtsIZQlvRF7HIrpOlaRzOPp3sK2/MDZJ1RBE7wYItK3CUAmsHim/LVYKzZkRTij3
 /9b4LP1yMMbziG+Yxt1o682EwJB5YIat6fmDG9uFeEVI5rWWN7WFubqs8gCjYy/p
 FX+BjpOdgTRnX+1m9GIj0Jlc/HKMXryDfSZS07Zy4FbGEwSiI5SfKECub4mDhuE=
 =C/uD
 -----END PGP SIGNATURE-----

Merge tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Radim Krčmář:
 "ARM:

   - icache invalidation optimizations, improving VM startup time

   - support for forwarded level-triggered interrupts, improving
     performance for timers and passthrough platform devices

   - a small fix for power-management notifiers, and some cosmetic
     changes

  PPC:

   - add MMIO emulation for vector loads and stores

   - allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
     requiring the complex thread synchronization of older CPU versions

   - improve the handling of escalation interrupts with the XIVE
     interrupt controller

   - support decrement register migration

   - various cleanups and bugfixes.

  s390:

   - Cornelia Huck passed maintainership to Janosch Frank

   - exitless interrupts for emulated devices

   - cleanup of cpuflag handling

   - kvm_stat counter improvements

   - VSIE improvements

   - mm cleanup

  x86:

   - hypervisor part of SEV

   - UMIP, RDPID, and MSR_SMI_COUNT emulation

   - paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit

   - allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more
     AVX512 features

   - show vcpu id in its anonymous inode name

   - many fixes and cleanups

   - per-VCPU MSR bitmaps (already merged through x86/pti branch)

   - stable KVM clock when nesting on Hyper-V (merged through
     x86/hyperv)"

* tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits)
  KVM: PPC: Book3S: Add MMIO emulation for VMX instructions
  KVM: PPC: Book3S HV: Branch inside feature section
  KVM: PPC: Book3S HV: Make HPT resizing work on POWER9
  KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code
  KVM: PPC: Book3S PR: Fix broken select due to misspelling
  KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs()
  KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled
  KVM: PPC: Book3S HV: Drop locks before reading guest memory
  kvm: x86: remove efer_reload entry in kvm_vcpu_stat
  KVM: x86: AMD Processor Topology Information
  x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested
  kvm: embed vcpu id to dentry of vcpu anon inode
  kvm: Map PFN-type memory regions as writable (if possible)
  x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n
  KVM: arm/arm64: Fixup userspace irqchip static key optimization
  KVM: arm/arm64: Fix userspace_irqchip_in_use counting
  KVM: arm/arm64: Fix incorrect timer_is_pending logic
  MAINTAINERS: update KVM/s390 maintainers
  MAINTAINERS: add Halil as additional vfio-ccw maintainer
  MAINTAINERS: add David as a reviewer for KVM/s390
  ...
2018-02-10 13:16:35 -08:00
Linus Torvalds
54ce685cae More ACPI updates for v4.16-rc1
- Update the ACPICA kernel code to upstream revision 20180105 including:
    * Assorted fixes (Jung-uk Kim).
    * Support for X32 ABI compilation (Anuj Mittal).
    * Update of ACPICA copyrights to 2018 (Bob Moore).
 
  - Prepare for future modifications to avoid executing the _STA control
    method too early (Hans de Goede).
 
  - Make the processor performance control library code ignore _PPC
    notifications if they cannot be handled and fix up the C1 idle
    state definition when it is used as a fallback state (Chen Yu,
    Yazen Ghannam).
 
  - Make it possible to use the SPCR table on x86 and to replace the
    original IORT table with a new one from initrd (Prarit Bhargava,
    Shunyong Yang).
 
  - Add battery-related quirks for Asus UX360UA and UX410UAK and add
    quirks for table parsing on Dell XPS 9570 and Precision M5530
    (Kai Heng Feng).
 
  - Address static checker warnings in the CPPC code (Gustavo Silva).
 
  - Avoid printing a raw pointer to the kernel log in the smart
    battery driver (Greg Kroah-Hartman).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJafGvJAAoJEILEb/54YlRxiusQAKUa+OM/oxTJkOEfGGRM8NlS
 Hq/PaL/TnAj3nCoZN9fM38mI4gkxqu3eVMv6kfiqRe8VYmUX9r9tRbQ9kxvEYa7n
 s6Dl+wdC9UND20QJkYVzPlaXbPuZyLFHt4Fkb1hp+HAGgNNYqc4e0lJvI82F2pdo
 im1UFI84jg9UQV4WpUJL6ny2c/RMNtpUV5fOKFD8lkvBvVe7mtZTZ+1nZDeqXGkV
 jzdrVTHLUEDhjS1o0TBmEsJGNeGOqnK/f+m8Rq4397guPAQQq18MYNC68SzhuGjP
 iqhvIvI9sF197i66l/qgsubBifOV4At8Wb0LA5cU8CQLLpEW8GDktz/kucVHyzJ4
 cVKuPXptBwwtPbNFHWO8reTUFMAnP7IpjtC31ntr6xWRQCiXv0/i2hRRN54g9T7e
 FAOBmmys5DKFOq50OB5WdD3/Qz5OUuVgdbrSxNFARIZpQFtUn7Np2/nmNpPgrrcl
 77hO8dpeXUTVvM4HpRQN1+r0KOTLfTAvWV7LYLAjCF9ivc0Vop/tYZQ2VEMSUEFD
 SGKC30mGC4pphAjxcSYV282JR7Jx7arQ71ZA5uYTRRuxnEQd/2MC71fNjrFmCgUW
 1Pumw0Pw6eZRjj1FZ/pj0X5lm7AlZj0dVzsJFgNb0FcJW0nOhN3czQrA4igoSVng
 B2sRv9U8YDnDtzHyTPrY
 =rVdp
 -----END PGP SIGNATURE-----

Merge tag 'acpi-part2-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more ACPI updates from Rafael Wysocki:
 "These are mostly fixes and cleanups, a few new quirks, a couple of
  updates related to the handling of ACPI tables and ACPICA copyrights
  refreshment.

  Specifics:

   - Update the ACPICA kernel code to upstream revision 20180105
     including:
       * Assorted fixes (Jung-uk Kim)
       * Support for X32 ABI compilation (Anuj Mittal)
       * Update of ACPICA copyrights to 2018 (Bob Moore)

   - Prepare for future modifications to avoid executing the _STA
     control method too early (Hans de Goede)

   - Make the processor performance control library code ignore _PPC
     notifications if they cannot be handled and fix up the C1 idle
     state definition when it is used as a fallback state (Chen Yu,
     Yazen Ghannam)

   - Make it possible to use the SPCR table on x86 and to replace the
     original IORT table with a new one from initrd (Prarit Bhargava,
     Shunyong Yang)

   - Add battery-related quirks for Asus UX360UA and UX410UAK and add
     quirks for table parsing on Dell XPS 9570 and Precision M5530 (Kai
     Heng Feng)

   - Address static checker warnings in the CPPC code (Gustavo Silva)

   - Avoid printing a raw pointer to the kernel log in the smart battery
     driver (Greg Kroah-Hartman)"

* tag 'acpi-part2-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: sbshc: remove raw pointer from printk() message
  ACPI: SPCR: Make SPCR available to x86
  ACPI / CPPC: Use 64-bit arithmetic instead of 32-bit
  ACPI / tables: Add IORT to injectable table list
  ACPI / bus: Parse tables as term_list for Dell XPS 9570 and Precision M5530
  ACPICA: Update version to 20180105
  ACPICA: All acpica: Update copyrights to 2018
  ACPI / processor: Set default C1 idle state description
  ACPI / battery: Add quirk for Asus UX360UA and UX410UAK
  ACPI: processor_perflib: Do not send _PPC change notification if not ready
  ACPI / scan: Use acpi_bus_get_status() to initialize ACPI_TYPE_DEVICE devs
  ACPI / bus: Do not call _STA on battery devices with unmet dependencies
  PCI: acpiphp_ibm: prepare for acpi_get_object_info() no longer returning status
  ACPI: export acpi_bus_get_status_handle()
  ACPICA: Add a missing pair of parentheses
  ACPICA: Prefer ACPI_TO_POINTER() over ACPI_ADD_PTR()
  ACPICA: Avoid NULL pointer arithmetic
  ACPICA: Linux: add support for X32 ABI compilation
  ACPI / video: Use true for boolean value
2018-02-09 09:44:25 -08:00
Linus Torvalds
c013632192 2nd set of arm64 updates for 4.16:
Spectre v1 mitigation:
 - back-end version of array_index_mask_nospec()
 - masking of the syscall number to restrict speculation through the
   syscall table
 - masking of __user pointers prior to deference in uaccess routines
 
 Spectre v2 mitigation update:
 - using the new firmware SMC calling convention specification update
 - removing the current PSCI GET_VERSION firmware call mitigation as
   vendors are deploying new SMCCC-capable firmware
 - additional branch predictor hardening for synchronous exceptions and
   interrupts while in user mode
 
 Meltdown v3 mitigation update for Cavium Thunder X: unaffected but
 hardware erratum gets in the way. The kernel now starts with the page
 tables mapped as global and switches to non-global if kpti needs to be
 enabled.
 
 Other:
 - Theoretical trylock bug fixed
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlp8lqcACgkQa9axLQDI
 XvH2lxAAnsYqthpGQ11MtDJB+/UiBAFkg9QWPDkwrBDvNhgpll+J0VQuCN1QJ2GX
 qQ8rkv8uV+y4Fqr8hORGJy5At+0aI63ZCJ72RGkZTzJAtbFbFGIDHP7RhAEIGJBS
 Lk9kDZ7k39wLEx30UXIFYTTVzyHar397TdI7vkTcngiTzZ8MdFATfN/hiKO906q3
 14pYnU9Um4aHUdcJ+FocL3dxvdgniuuMBWoNiYXyOCZXjmbQOnDNU2UrICroV8lS
 mB+IHNEhX1Gl35QzNBtC0ET+aySfHBMJmM5oln+uVUljIGx6En1WLj6mrHYcx8U2
 rIBm5qO/X/4iuzYPGkxwQtpjq3wPYxsSUnMdKJrsUZqAfy2QeIhFx6XUtJsZPB2J
 /lgls5xSXMOS7oiOQtmVjcDLBURDmYXGwljXR4n4jLm4CT1V9qSLcKHu1gdFU9Mq
 VuMUdPOnQub1vqKndi154IoYDTo21jAib2ktbcxpJfSJnDYoit4Gtnv7eWY+M3Pd
 Toaxi8htM2HSRwbvslHYGW8ZcVpI79Jit+ti7CsFg7m9Lvgs0zxcnNui4uPYDymT
 jh2JYxuirIJbX9aGGhnmkNhq9REaeZJg9LA2JM8S77FCHN3bnlSdaG6wy899J6EI
 lK4anCuPQKKKhUia/dc1MeKwrmmC18EfPyGUkOzywg/jGwGCmZM=
 =Y0TT
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull more arm64 updates from Catalin Marinas:
 "As I mentioned in the last pull request, there's a second batch of
  security updates for arm64 with mitigations for Spectre/v1 and an
  improved one for Spectre/v2 (via a newly defined firmware interface
  API).

  Spectre v1 mitigation:

   - back-end version of array_index_mask_nospec()

   - masking of the syscall number to restrict speculation through the
     syscall table

   - masking of __user pointers prior to deference in uaccess routines

  Spectre v2 mitigation update:

   - using the new firmware SMC calling convention specification update

   - removing the current PSCI GET_VERSION firmware call mitigation as
     vendors are deploying new SMCCC-capable firmware

   - additional branch predictor hardening for synchronous exceptions
     and interrupts while in user mode

  Meltdown v3 mitigation update:

    - Cavium Thunder X is unaffected but a hardware erratum gets in the
      way. The kernel now starts with the page tables mapped as global
      and switches to non-global if kpti needs to be enabled.

  Other:

   - Theoretical trylock bug fixed"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (38 commits)
  arm64: Kill PSCI_GET_VERSION as a variant-2 workaround
  arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support
  arm/arm64: smccc: Implement SMCCC v1.1 inline primitive
  arm/arm64: smccc: Make function identifiers an unsigned quantity
  firmware/psci: Expose SMCCC version through psci_ops
  firmware/psci: Expose PSCI conduit
  arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling
  arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support
  arm/arm64: KVM: Turn kvm_psci_version into a static inline
  arm/arm64: KVM: Advertise SMCCC v1.1
  arm/arm64: KVM: Implement PSCI 1.0 support
  arm/arm64: KVM: Add smccc accessors to PSCI code
  arm/arm64: KVM: Add PSCI_VERSION helper
  arm/arm64: KVM: Consolidate the PSCI include files
  arm64: KVM: Increment PC after handling an SMC trap
  arm: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
  arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls
  arm64: entry: Apply BP hardening for suspicious interrupts from EL0
  arm64: entry: Apply BP hardening for high-priority synchronous exceptions
  arm64: futex: Mask __user pointers prior to dereference
  ...
2018-02-08 10:44:25 -08:00
Prarit Bhargava
0231d00082 ACPI: SPCR: Make SPCR available to x86
SPCR is currently only enabled or ARM64 and x86 can use SPCR to setup
an early console.

General fixes include updating Documentation & Kconfig (for x86),
updating comments, and changing parse_spcr() to acpi_parse_spcr(),
and earlycon_init_is_deferred to earlycon_acpi_spcr_enable to be
more descriptive.

On x86, many systems have a valid SPCR table but the table version is
not 2 so the table version check must be a warning.

On ARM64 when the kernel parameter earlycon is used both the early console
and console are enabled.  On x86, only the earlycon should be enabled by
by default.  Modify acpi_parse_spcr() to allow options for initializing
the early console and console separately.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mark Salter <msalter@redhat.com>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-07 11:39:58 +01:00
Linus Torvalds
a2e5790d84 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - kasan updates

 - procfs

 - lib/bitmap updates

 - other lib/ updates

 - checkpatch tweaks

 - rapidio

 - ubsan

 - pipe fixes and cleanups

 - lots of other misc bits

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits)
  Documentation/sysctl/user.txt: fix typo
  MAINTAINERS: update ARM/QUALCOMM SUPPORT patterns
  MAINTAINERS: update various PALM patterns
  MAINTAINERS: update "ARM/OXNAS platform support" patterns
  MAINTAINERS: update Cortina/Gemini patterns
  MAINTAINERS: remove ARM/CLKDEV SUPPORT file pattern
  MAINTAINERS: remove ANDROID ION pattern
  mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors
  mm: docs: fix parameter names mismatch
  mm: docs: fixup punctuation
  pipe: read buffer limits atomically
  pipe: simplify round_pipe_size()
  pipe: reject F_SETPIPE_SZ with size over UINT_MAX
  pipe: fix off-by-one error when checking buffer limits
  pipe: actually allow root to exceed the pipe buffer limits
  pipe, sysctl: remove pipe_proc_fn()
  pipe, sysctl: drop 'min' parameter from pipe-max-size converter
  kasan: rework Kconfig settings
  crash_dump: is_kdump_kernel can be boolean
  kernel/mutex: mutex_is_locked can be boolean
  ...
2018-02-06 22:15:42 -08:00
Yury Norov
3aa56885e5 bitmap: replace bitmap_{from,to}_u32array
with bitmap_{from,to}_arr32 over the kernel. Additionally to it:
* __check_eq_bitmap() now takes single nbits argument.
* __check_eq_u32_array is not used in new test but may be used in
  future. So I don't remove it here, but annotate as __used.

Tested on arm64 and 32-bit BE mips.

[arnd@arndb.de: perf: arm_dsu_pmu: convert to bitmap_from_arr32]
  Link: http://lkml.kernel.org/r/20180201172508.5739-2-ynorov@caviumnetworks.com
[ynorov@caviumnetworks.com: fix net/core/ethtool.c]
  Link: http://lkml.kernel.org/r/20180205071747.4ekxtsbgxkj5b2fz@yury-thinkpad
Link: http://lkml.kernel.org/r/20171228150019.27953-2-ynorov@caviumnetworks.com
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: David Decotigny <decot@googlers.com>,
Cc: David S. Miller <davem@davemloft.net>,
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:44 -08:00
Andrey Konovalov
917538e212 kasan: clean up KASAN_SHADOW_SCALE_SHIFT usage
Right now the fact that KASAN uses a single shadow byte for 8 bytes of
memory is scattered all over the code.

This change defines KASAN_SHADOW_SCALE_SHIFT early in asm include files
and makes use of this constant where necessary.

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/34937ca3b90736eaad91b568edf5684091f662e3.1515775666.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:43 -08:00