Commit Graph

6476 Commits

Author SHA1 Message Date
Ursula Braun
41c80be24b s390/net: move pnet constants
There is no need to define these PNETID related constants in
the pnet.h file, since they are just used locally within pnet.c.
Just code cleanup, no functional change.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07 18:06:18 -08:00
Martin Schwidefsky
142c52d7bc s390: add alignment hints to vector load and store
The z14 introduced alignment hints to increase the performance of
vector loads and stores. The kernel uses an implicit alignmenet
of 8 bytes for the vector registers, set the alignment hint to 3.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 11:57:10 +01:00
YueHaibing
f8b11e089a s390: remove unused including <linux/version.h>
Remove including <linux/version.h> that don't need it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 11:57:09 +01:00
Julian Wiedmann
bdf117674e s390/qdio: make SBAL address array type-safe
There is no need to use void pointers, all drivers are in agreement
about the underlying data structure of the SBAL arrays.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 11:57:07 +01:00
Sebastian Ott
cfbb4a7ab6 s390/pci: map IOV resources
Map IOV resources such that pci common code recognizes the IOV
capability of PFs.

Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 11:57:06 +01:00
Sebastian Ott
e8e25a7718 s390/pci: improve bar check
Improve the bar check in pci_iomap_range to cover functions
for which we recognize more bars than what we can access due
to AR restrictions.

Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 11:57:04 +01:00
Martin Schwidefsky
a0308c1315 s390/mmap: take stack_guard_gap into account for mmap_base
The s390 version of the mmap_base function is ignorant of stack_guard_gap
which can lead to a placement of the stack vs. the mmap base that does not
leave enough space for the stack rlimit.

Add the stack_guard_gap to the calculation and while we are at it the
check for gap+pad overflows as well.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 11:57:01 +01:00
Gerald Schaefer
d4192437d7 s390: remove dead code
Remove some dead code from head64.S, which was left over since commit
da292bbe1f ("[S390] eliminate ipl_device from lowcore") removed
ipl_device from lowcore.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 11:56:57 +01:00
Gerald Schaefer
ea0ca93d6a s390/setup: remove obsolete #ifdef
The #ifdef CONFIG_DMA_API_DEBUG check in reserve_kernel() is no longer
needed, since commit ea535e418c ("dma-debug: switch check from _text
to _stext") changed the logic in lib/dma-debug.c, so remove it.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 11:56:55 +01:00
Sebastian Ott
614db26954 Revert "s390/pci: remove bit_lock usage in interrupt handler"
This reverts commit 9594ca6b87.

With the handle_simple_irq irq_flow_handler it must be ensured to
not call generic_handle_irq with the same IRQ number on 2 CPUs at
the same time (interrupts are floating on s390).
Contrary to my initial investigation the irq_desc's lock usage in
handle_simple_irq does not ensure this. Thus re-introduce the bit-
lock usage in s390's pci handler.

Reported-by: Ursula Braun <ubraun@linux.ibm.com>
Reported-by: Alexander Schmidt <alexs@linux.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 11:56:29 +01:00
Arnd Bergmann
48166e6ea4 y2038: add 64-bit time_t syscalls to all 32-bit architectures
This adds 21 new system calls on each ABI that has 32-bit time_t
today. All of these have the exact same semantics as their existing
counterparts, and the new ones all have macro names that end in 'time64'
for clarification.

This gets us to the point of being able to safely use a C library
that has 64-bit time_t in user space. There are still a couple of
loose ends to tie up in various areas of the code, but this is the
big one, and should be entirely uncontroversial at this point.

In particular, there are four system calls (getitimer, setitimer,
waitid, and getrusage) that don't have a 64-bit counterpart yet,
but these can all be safely implemented in the C library by wrapping
around the existing system calls because the 32-bit time_t they
pass only counts elapsed time, not time since the epoch. They
will be dealt with later.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-07 00:13:28 +01:00
Arnd Bergmann
d33c577ccc y2038: rename old time and utime syscalls
The time, stime, utime, utimes, and futimesat system calls are only
used on older architectures, and we do not provide y2038 safe variants
of them, as they are replaced by clock_gettime64, clock_settime64,
and utimensat_time64.

However, for consistency it seems better to have the 32-bit architectures
that still use them call the "time32" entry points (leaving the
traditional handlers for the 64-bit architectures), like we do for system
calls that now require two versions.

Note: We used to always define __ARCH_WANT_SYS_TIME and
__ARCH_WANT_SYS_UTIME and only set __ARCH_WANT_COMPAT_SYS_TIME and
__ARCH_WANT_SYS_UTIME32 for compat mode on 64-bit kernels. Now this is
reversed: only 64-bit architectures set __ARCH_WANT_SYS_TIME/UTIME, while
we need __ARCH_WANT_SYS_TIME32/UTIME32 for 32-bit architectures and compat
mode. The resulting asm/unistd.h changes look a bit counterintuitive.

This is only a cleanup patch and it should not change any behavior.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-02-07 00:13:28 +01:00
Arnd Bergmann
805089c2f7 syscalls: remove obsolete __IGNORE_ macros
These are all for ignoring the lack of obsolete system calls,
which have been marked the same way in scripts/checksyscall.sh,
so these can be removed.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-02-07 00:13:27 +01:00
Arnd Bergmann
8dabe7245b y2038: syscalls: rename y2038 compat syscalls
A lot of system calls that pass a time_t somewhere have an implementation
using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have
been reworked so that this implementation can now be used on 32-bit
architectures as well.

The missing step is to redefine them using the regular SYSCALL_DEFINEx()
to get them out of the compat namespace and make it possible to build them
on 32-bit architectures.

Any system call that ends in 'time' gets a '32' suffix on its name for
that version, while the others get a '_time32' suffix, to distinguish
them from the normal version, which takes a 64-bit time argument in the
future.

In this step, only 64-bit architectures are changed, doing this rename
first lets us avoid touching the 32-bit architectures twice.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-07 00:13:27 +01:00
Mathieu Poirier
840018668c perf/aux: Make perf_event accessible to setup_aux()
When pmu::setup_aux() is called the coresight PMU needs to know which
sink to use for the session by looking up the information in the
event's attr::config2 field.

As such simply replace the cpu information by the complete perf_event
structure and change all affected customers.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Link: http://lkml.kernel.org/r/20190131184714.20388-2-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:39 -03:00
Michael Mueller
b9fa6d6ee9 KVM: s390: fix possible null pointer dereference in pending_irqs()
Assure a GISA is in use before accessing the IPM to avoid a
null pointer dereference issue.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reported-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190131085247.13826-16-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:24 +01:00
Michael Mueller
b1d1e76ed9 KVM: s390: start using the GIB
By initializing the GIB, it will be used by the kvm host.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-15-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:24 +01:00
Michael Mueller
9f30f62163 KVM: s390: add gib_alert_irq_handler()
The patch implements a handler for GIB alert interruptions
on the host. Its task is to alert guests that interrupts are
pending for them.

A GIB alert interrupt statistic counter is added as well:

$ cat /proc/interrupts
          CPU0       CPU1
  ...
  GAL:      23         37   [I/O] GIB Alert
  ...

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <20190131085247.13826-14-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:23 +01:00
Michael Mueller
174dd4f888 KVM: s390: kvm_s390_gisa_clear() now clears the IPM only
Function kvm_s390_gisa_clear() now clears the Interruption
Pending Mask of the GISA asap. If the GISA is in the alert
list at this time it stays in the list but is removed by
process_gib_alert_list().

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <20190131085247.13826-13-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:23 +01:00
Michael Mueller
6cff2e1046 KVM: s390: add functions to (un)register GISC with GISA
Add the Interruption Alert Mask (IAM) to the architecture specific
kvm struct. This mask in the GISA is used to define for which ISC
a GIB alert will be issued.

The functions kvm_s390_gisc_register() and kvm_s390_gisc_unregister()
are used to (un)register a GISC (guest ISC) with a virtual machine and
its GISA.

Upon successful completion, kvm_s390_gisc_register() returns the
ISC to be used for GIB alert interruptions. A negative return code
indicates an error during registration.

Theses functions will be used by other adapter types like AP and PCI to
request pass-through interruption support.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190131085247.13826-12-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:23 +01:00
Michael Mueller
25c84dbaec KVM: s390: add kvm reference to struct sie_page2
Adding the kvm reference to struct sie_page2 will allow to
determine the kvm a given gisa belongs to:

  container_of(gisa, struct sie_page2, gisa)->kvm

This functionality will be required to process a gisa in
gib alert interruption context.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-11-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:23 +01:00
Michael Mueller
1282c21eb3 KVM: s390: add the GIB and its related life-cyle functions
The Guest Information Block (GIB) links the GISA of all guests
that have adapter interrupts pending. These interrupts cannot be
delivered because all vcpus of these guests are currently in WAIT
state or have masked the respective Interruption Sub Class (ISC).
If enabled, a GIB alert is issued on the host to schedule these
guests to run suitable vcpus to consume the pending interruptions.

This mechanism allows to process adapter interrupts for currently
not running guests.

The GIB is created during host initialization and associated with
the Adapter Interruption Facility in case an Adapter Interruption
Virtualization Facility is available.

The GIB initialization and thus the activation of the related code
will be done in an upcoming patch of this series.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-10-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:23 +01:00
Michael Mueller
3dec192217 s390/cio: add function chsc_sgib()
This patch implements the Set Guest Information Block operation
to request association or disassociation of a Guest Information
Block (GIB) with the Adapter Interruption Facility. The operation
is required to receive GIB alert interrupts for guest adapters
in conjunction with AIV and GISA.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20190131085247.13826-9-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:23 +01:00
Michael Mueller
982cff4259 KVM: s390: introduce struct kvm_s390_gisa_interrupt
Use this struct analog to the kvm interruption structs
for kvm emulated floating and local interruptions.

GIB handling will add further fields to this structure as
required.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-8-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:22 +01:00
Michael Mueller
bb2fb8cdcf KVM: s390: remove kvm_s390_ from gisa static inline functions
This will shorten the length of code lines. All GISA related
static inline functions are local to interrupt.c.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-7-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:22 +01:00
Michael Mueller
96723d323a KVM: s390: use pending_irqs_no_gisa() where appropriate
Interruption types that are not represented in GISA shall
use pending_irqs_no_gisa() to test pending interruptions.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-6-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:22 +01:00
Michael Mueller
672128bfee KVM: s390: coding style kvm_s390_gisa_init/clear()
The change helps to reduce line length and
increases code readability.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-5-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:22 +01:00
Michael Mueller
246b72183b KVM: s390: move bitmap idle_mask into arch struct top level
The vcpu idle_mask state is used by but not specific
to the emulated floating interruptions. The state is
relevant to gisa related interruptions as well.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-4-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:22 +01:00
Michael Mueller
689bdf9e9c KVM: s390: make bitmap declaration consistent
Use a consistent bitmap declaration throughout the code.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-3-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:21 +01:00
Michael Mueller
b7d4557129 KVM: s390: drop obsolete else path
The explicit else path specified in set_intercept_indicators_io
is not required as the function returns in case the first branch
is taken anyway.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-2-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:21 +01:00
Michael Mueller
8d43d57036 KVM: s390: clarify kvm related kernel message
As suggested by our ID dept. here are some kernel message
updates.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:28:35 +01:00
Heiko Carstens
ecc15f113c s390: bpf: fix JMP32 code-gen
Commit 626a5f66da ("s390: bpf: implement jitting of JMP32") added
JMP32 code-gen support for s390. However it triggers the warning below
due to some unusual gotos in the original s390 bpf jit code.

Add a couple of additional "is_jmp32" initializations to fix this.
Also fix the wrong opcode for the "llilf" instruction that was
introduced with the same commit.

arch/s390/net/bpf_jit_comp.c: In function 'bpf_jit_insn':
arch/s390/net/bpf_jit_comp.c:248:55: warning: 'is_jmp32' may be used uninitialized in this function [-Wmaybe-uninitialized]
  _EMIT6(op1 | reg(b1, b2) << 16 | (rel & 0xffff), op2 | mask); \
                                                       ^
arch/s390/net/bpf_jit_comp.c:1211:8: note: 'is_jmp32' was declared here
   bool is_jmp32 = BPF_CLASS(insn->code) == BPF_JMP32;

Fixes: 626a5f66da ("s390: bpf: implement jitting of JMP32")
Cc: Jiong Wang <jiong.wang@netronome.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Jiong Wang <jiong.wang@netronome.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-02-04 09:45:09 -08:00
Deepa Dinamani
2edfd8e061 arch: Use asm-generic/socket.h when possible
Many architectures maintain an arch specific copy of the
file even though there are no differences with the asm-generic
one. Allow these architectures to use the generic one instead.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Cc: chris@zankel.net
Cc: fenghua.yu@intel.com
Cc: tglx@linutronix.de
Cc: schwidefsky@de.ibm.com
Cc: linux-ia64@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-s390@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03 11:17:30 -08:00
David S. Miller
ec7146db15 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-01-29

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Teach verifier dead code removal, this also allows for optimizing /
   removing conditional branches around dead code and to shrink the
   resulting image. Code store constrained architectures like nfp would
   have hard time doing this at JIT level, from Jakub.

2) Add JMP32 instructions to BPF ISA in order to allow for optimizing
   code generation for 32-bit sub-registers. Evaluation shows that this
   can result in code reduction of ~5-20% compared to 64 bit-only code
   generation. Also add implementation for most JITs, from Jiong.

3) Add support for __int128 types in BTF which is also needed for
   vmlinux's BTF conversion to work, from Yonghong.

4) Add a new command to bpftool in order to dump a list of BPF-related
   parameters from the system or for a specific network device e.g. in
   terms of available prog/map types or helper functions, from Quentin.

5) Add AF_XDP sock_diag interface for querying sockets from user
   space which provides information about the RX/TX/fill/completion
   rings, umem, memory usage etc, from Björn.

6) Add skb context access for skb_shared_info->gso_segs field, from Eric.

7) Add support for testing flow dissector BPF programs by extending
   existing BPF_PROG_TEST_RUN infrastructure, from Stanislav.

8) Split BPF kselftest's test_verifier into various subgroups of tests
   in order better deal with merge conflicts in this area, from Jakub.

9) Add support for queue/stack manipulations in bpftool, from Stanislav.

10) Document BTF, from Yonghong.

11) Dump supported ELF section names in libbpf on program load
    failure, from Taeung.

12) Silence a false positive compiler warning in verifier's BTF
    handling, from Peter.

13) Fix help string in bpftool's feature probing, from Prashant.

14) Remove duplicate includes in BPF kselftests, from Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28 19:38:33 -08:00
Greg Kroah-Hartman
7dd541a3fb s390: no need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-28 15:58:55 +01:00
Greg Kroah-Hartman
d7f2f7c7fc s390: pci: no need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Sebastian Ott <sebott@linux.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-28 15:58:54 +01:00
Greg Kroah-Hartman
f36108c462 s390/hypfs: no need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-28 15:58:54 +01:00
Collin Walling
4ad78b8651 s390/setup: set control program code via diag 318
The s390x diagnose 318 instruction sets the control program name code (CPNC)
and control program version code (CPVC) to provide useful information
regarding the OS during debugging. The CPNC is explicitly set to 4 to
indicate a Linux/KVM environment.

The CPVC is a 7-byte value containing:

 - 3-byte Linux version code, currently set to 0
 - 3-byte unique value, currently set to 0
 - 1-byte trailing null

Signed-off-by: Collin Walling <walling@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <1544135405-22385-2-git-send-email-walling@linux.ibm.com>
[set version code to 0 until the structure is fully defined]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-28 15:58:53 +01:00
Martin Schwidefsky
634692ab70 s390/suspend: fix stack setup in swsusp_arch_suspend
The patch that added support for the virtually mapped kernel stacks changed
swsusp_arch_suspend to switch to the nodat-stack as the vmap stack is not
available while going in and out of suspend.

Unfortunately the switch to the nodat-stack is incorrect which breaks
suspend to disk.

Cc: stable@vger.kernel.org # v4.20
Fixes: ce3dc44749 ("s390: add support for virtually mapped kernel stacks")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-28 15:43:17 +01:00
Masahiro Yamada
afa974b771 kbuild: add real-prereqs shorthand for $(filter-out FORCE,$^)
In Kbuild, if_changed and friends must have FORCE as a prerequisite.

Hence, $(filter-out FORCE,$^) or $(filter-out $(PHONY),$^) is a common
idiom to get the names of all the prerequisites except phony targets.

Add real-prereqs as a shorthand.

Note:
We cannot replace $(filter %.o,$^) in cmd_link_multi-m because $^ may
include auto-generated dependencies from the .*.cmd file when a single
object module is changed into a multi object module. Refer to commit
69ea912fda ("kbuild: remove unneeded link_multi_deps"). I added some
comment to avoid accidental breakage.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Rob Herring <robh@kernel.org>
2019-01-28 09:11:17 +09:00
Masahiro Yamada
5d680056cb s390: make built-in.a not directly depend on *.o.chkbss files
When I was refactoring cmd_ar_builtin in scripts/Makefile.build,
I noticed the build breakage of s390.

Some Makefiles of s390 add extra dependencies to built-in.a;
built-in.a depends on timestamp files *.o.chkbss, but $(AR) does
not want to include them into built-in.a.

Insert a phony target 'chkbss' in between so that $(AR) can take
$(filter-out $(PHONY), $^) as input.

While I was here, I refactored Makefile.chkbss a little bit.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-01-28 09:11:17 +09:00
David S. Miller
1d68101367 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-01-27 10:43:17 -08:00
Jiong Wang
626a5f66da s390: bpf: implement jitting of JMP32
This patch implements code-gen for new JMP32 instructions on s390.

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-26 13:33:02 -08:00
Arnd Bergmann
b41c51c8e1 arch: add pkey and rseq syscall numbers everywhere
Most architectures define system call numbers for the rseq and pkey system
calls, even when they don't support the features, and perhaps never will.

Only a few architectures are missing these, so just define them anyway
for consistency. If we decide to add them later to one of these, the
system call numbers won't get out of sync then.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2019-01-25 17:22:50 +01:00
Arnd Bergmann
0d6040d468 arch: add split IPC system calls where needed
The IPC system call handling is highly inconsistent across architectures,
some use sys_ipc, some use separate calls, and some use both.  We also
have some architectures that require passing IPC_64 in the flags, and
others that set it implicitly.

For the addition of a y2038 safe semtimedop() system call, I chose to only
support the separate entry points, but that requires first supporting
the regular ones with their own syscall numbers.

The IPC_64 is now implied by the new semctl/shmctl/msgctl system
calls even on the architectures that require passing it with the ipc()
multiplexer.

I'm not adding the new semtimedop() or semop() on 32-bit architectures,
those will get implemented using the new semtimedop_time64() version
that gets added along with the other time64 calls.
Three 64-bit architectures (powerpc, s390 and sparc) get semtimedop().

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-01-25 17:22:50 +01:00
Eric Biggers
231baecdef crypto: clarify name of WEAK_KEY request flag
CRYPTO_TFM_REQ_WEAK_KEY confuses newcomers to the crypto API because it
sounds like it is requesting a weak key.  Actually, it is requesting
that weak keys be forbidden (for algorithms that have the notion of
"weak keys"; currently only DES and XTS do).

Also it is only one letter away from CRYPTO_TFM_RES_WEAK_KEY, with which
it can be easily confused.  (This in fact happened in the UX500 driver,
though just in some debugging messages.)

Therefore, make the intent clear by renaming it to
CRYPTO_TFM_REQ_FORBID_WEAK_KEYS.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-01-25 18:41:52 +08:00
Chandan Rajendra
643fa9612b fscrypt: remove filesystem specific build config option
In order to have a common code base for fscrypt "post read" processing
for all filesystems which support encryption, this commit removes
filesystem specific build config option (e.g. CONFIG_EXT4_FS_ENCRYPTION)
and replaces it with a build option (i.e. CONFIG_FS_ENCRYPTION) whose
value affects all the filesystems making use of fscrypt.

Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-01-23 23:56:43 -05:00
Martin Schwidefsky
58661489a8 Merge branch 'compat' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux into features
Pull tip branch with Arnds compat system call wrapper rework.
2019-01-23 12:39:35 +01:00
Heiko Carstens
fb8bfca06c s390: fix system call tracing
When converting to autogenerated compat syscall wrappers all system
call entry points got a different symbol name: they all got a __s390x_
prefix.

This caused breakage with system call tracing, since an appropriate
arch_syscall_match_sym_name() was not provided. Add this function, and
while at it also add code to avoid compat system call tracing. s390
has different system call tables for native 64 bit system calls and
compat system calls. This isn't really supported in the common
code. However there are hardly any compat binaries left, therefore
just ignore compat system calls, like x86 and arm64 also do for the
same reason.

Fixes: aa0d6e70d3 ("s390: autogenerate compat syscall wrappers")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-23 12:38:07 +01:00
Gustavo A. R. Silva
c6ac875446 s390/hypfs: Use struct_size() in kzalloc()
One of the more common cases of allocation size calculations is finding the
size of a structure that has a zero-sized array at the end, along with memory
for some number of elements for that array. For example:

struct foo {
    int stuff;
    void *entry[];
};

instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can now
use the new struct_size() helper:

instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:34:19 +01:00
Vasily Gorbik
7e0d92f002 s390/kasan: improve string/memory functions checks
Avoid using arch specific implementations of string/memory functions
with KASAN since gcc cannot instrument asm code memory accesses and
many bugs could be missed.

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:34:18 +01:00
Christoph Hellwig
32b77252f4 s390: remove the ptep_modify_prot_{start,commit} exports
These two functions are only used by core MM code, so no need to export
them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:34:17 +01:00
Arnd Bergmann
90856087da s390: remove compat_wrapper.c
Now that all these wrappers are automatically generated, we can
remove the entire file, and instead point to the regualar syscalls
like all other architectures do.

The 31-bit pointer extension is now handled in the __s390_sys_*()
wrappers.

Link: https://lore.kernel.org/lkml/20190116131527.2071570-6-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:33:20 +01:00
Arnd Bergmann
aa0d6e70d3 s390: autogenerate compat syscall wrappers
Any system call that takes a pointer argument on s390 requires
a wrapper function to do a 31-to-64 zero-extension, these are
currently generated in arch/s390/kernel/compat_wrapper.c.

On arm64 and x86, we already generate similar wrappers for all
system calls in the place of their definition, just for a different
purpose (they load the arguments from pt_regs).

We can do the same thing here, by adding an asm/syscall_wrapper.h
file with a copy of all the relevant macros to override the generic
version. Besides the addition of the compat entry point, these also
rename the entry points with a __s390_ or __s390x_ prefix, similar
to what we do on arm64 and x86. This in turn requires renaming
a few things, and adding a proper ni_syscall() entry point.

In order to still compile system call definitions that pass an
loff_t argument, the __SC_COMPAT_CAST() macro checks for that
and forces an -ENOSYS error, which was the best I could come up
with. Those functions must obviously not get called from user
space, but instead require hand-written compat_sys_*() handlers,
which fortunately already exist.

Link: https://lore.kernel.org/lkml/20190116131527.2071570-5-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[heiko.carstens@de.ibm.com: compile fix for !CONFIG_COMPAT]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:33:19 +01:00
Arnd Bergmann
fef747bab3 s390: use generic UID16 implementation
s390 has an almost identical copy of the code in kernel/uid16.c.

The problem here is that it requires calling the regular system calls,
which the generic implementation handles correctly, but the internal
interfaces are not declared in a global header for this.

The best way forward here seems to be to just use the generic code and
delete the s390 specific implementation.

I keep the changes to uapi/asm/posix_types.h inside of an #ifdef check
so user space does not observe any changes. As some of the system calls
pass pointers, we also need wrappers in compat_wrapper.c, which I add
for all calls with at least one argument. All those wrappers can be
removed in a later step.

Link: https://lore.kernel.org/lkml/20190116131527.2071570-4-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:33:18 +01:00
Arnd Bergmann
58fa4a410f ipc: introduce ksys_ipc()/compat_ksys_ipc() for s390
The sys_ipc() and compat_ksys_ipc() functions are meant to only
be used from the system call table, not called by another function.

Introduce ksys_*() interfaces for this purpose, as we have done
for many other system calls.

Link: https://lore.kernel.org/lkml/20190116131527.2071570-3-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[heiko.carstens@de.ibm.com: compile fix for !CONFIG_COMPAT]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:33:18 +01:00
Arnd Bergmann
1ecff5ef0a s390: open-code s390_personality syscall
Patch series "s390: rework compat wrapper generation".

As promised, I gave this a go and changed the SYSCALL_DEFINEx()
infrastructure to always include the wrappers for doing the
31-bit argument conversion on s390 compat mode.

This does three main things:

- The UID16 rework saved a lot of duplicated code, and would
  probably make sense by itself, but is also required as
  we can no longer call sys_*() functions directly after the
  last step.

- Removing the compat_wrapper.c file is of course the main
  goal here, in order to remove the need to maintain the
  compat_wrapper.c file when new system calls get added.
  Unfortunately, this requires adding some complexity in
  syscall_wrapper.h, and trades a small reduction in source
  code lines for a small increase in binary size for
  unused wrappers.

- As an added benefit, the use of syscall_wrapper.h now makes
  it easy to change the syscall wrappers so they no longer
  see all user space register contents, similar to changes
  done in commits fa697140f9 ("syscalls/x86: Use 'struct pt_regs'
  based syscall calling convention for 64-bit syscalls") and
  4378a7d4be ("arm64: implement syscall wrappers").
  I leave the actual implementation of this for you, if you
  want to do it later.

I did not test the changes at runtime, but I looked at the
generated object code, which seems fine here and includes
the same conversions as before.

This patch(of 5):

The sys_personality function is not meant to be called from other system
calls. We could introduce an intermediate ksys_personality function,
but it does almost nothing, so this just moves the implementation into
the caller.

Link: https://lore.kernel.org/lkml/20190116131527.2071570-1-arnd@arndb.de
Link: https://lore.kernel.org/lkml/20190116131527.2071570-2-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:33:17 +01:00
David Herrmann
f5dd3d0c96 net: introduce SO_BINDTOIFINDEX sockopt
This introduces a new generic SOL_SOCKET-level socket option called
SO_BINDTOIFINDEX. It behaves similar to SO_BINDTODEVICE, but takes a
network interface index as argument, rather than the network interface
name.

User-space often refers to network-interfaces via their index, but has
to temporarily resolve it to a name for a call into SO_BINDTODEVICE.
This might pose problems when the network-device is renamed
asynchronously by other parts of the system. When this happens, the
SO_BINDTODEVICE might either fail, or worse, it might bind to the wrong
device.

In most cases user-space only ever operates on devices which they
either manage themselves, or otherwise have a guarantee that the device
name will not change (e.g., devices that are UP cannot be renamed).
However, particularly in libraries this guarantee is non-obvious and it
would be nice if that race-condition would simply not exist. It would
make it easier for those libraries to operate even in situations where
the device-name might change under the hood.

A real use-case that we recently hit is trying to start the network
stack early in the initrd but make it survive into the real system.
Existing distributions rename network-interfaces during the transition
from initrd into the real system. This, obviously, cannot affect
devices that are up and running (unless you also consider moving them
between network-namespaces). However, the network manager now has to
make sure its management engine for dormant devices will not run in
parallel to these renames. Particularly, when you offload operations
like DHCP into separate processes, these might setup their sockets
early, and thus have to resolve the device-name possibly running into
this race-condition.

By avoiding a call to resolve the device-name, we no longer depend on
the name and can run network setup of dormant devices in parallel to
the transition off the initrd. The SO_BINDTOIFINDEX ioctl plugs this
race.

Reviewed-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-17 14:55:51 -08:00
David Hildenbrand
60f1bf29c0 s390/smp: Fix calling smp_call_ipl_cpu() from ipl CPU
When calling smp_call_ipl_cpu() from the IPL CPU, we will try to read
from pcpu_devices->lowcore. However, due to prefixing, that will result
in reading from absolute address 0 on that CPU. We have to go via the
actual lowcore instead.

This means that right now, we will read lc->nodat_stack == 0 and
therfore work on a very wrong stack.

This BUG essentially broke rebooting under QEMU TCG (which will report
a low address protection exception). And checking under KVM, it is
also broken under KVM. With 1 VCPU it can be easily triggered.

:/# echo 1 > /proc/sys/kernel/sysrq
:/# echo b > /proc/sysrq-trigger
[   28.476745] sysrq: SysRq : Resetting
[   28.476793] Kernel stack overflow.
[   28.476817] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13
[   28.476820] Hardware name: IBM 2964 NE1 716 (KVM/Linux)
[   28.476826] Krnl PSW : 0400c00180000000 0000000000115c0c (pcpu_delegate+0x12c/0x140)
[   28.476861]            R:0 T:1 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[   28.476863] Krnl GPRS: ffffffffffffffff 0000000000000000 000000000010dff8 0000000000000000
[   28.476864]            0000000000000000 0000000000000000 0000000000ab7090 000003e0006efbf0
[   28.476864]            000000000010dff8 0000000000000000 0000000000000000 0000000000000000
[   28.476865]            000000007fffc000 0000000000730408 000003e0006efc58 0000000000000000
[   28.476887] Krnl Code: 0000000000115bfe: 4170f000            la      %r7,0(%r15)
[   28.476887]            0000000000115c02: 41f0a000            la      %r15,0(%r10)
[   28.476887]           #0000000000115c06: e370f0980024        stg     %r7,152(%r15)
[   28.476887]           >0000000000115c0c: c0e5fffff86e        brasl   %r14,114ce8
[   28.476887]            0000000000115c12: 41f07000            la      %r15,0(%r7)
[   28.476887]            0000000000115c16: a7f4ffa8            brc     15,115b66
[   28.476887]            0000000000115c1a: 0707                bcr     0,%r7
[   28.476887]            0000000000115c1c: 0707                bcr     0,%r7
[   28.476901] Call Trace:
[   28.476902] Last Breaking-Event-Address:
[   28.476920]  [<0000000000a01c4a>] arch_call_rest_init+0x22/0x80
[   28.476927] Kernel panic - not syncing: Corrupt kernel stack, can't continue.
[   28.476930] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13
[   28.476932] Hardware name: IBM 2964 NE1 716 (KVM/Linux)
[   28.476932] Call Trace:

Fixes: 2f859d0dad ("s390/smp: reduce size of struct pcpu")
Cc: stable@vger.kernel.org # 4.0+
Reported-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-11 17:12:03 +01:00
Vasily Gorbik
190f056fba s390/vdso: correct vdso mapping for compat tasks
While "s390/vdso: avoid 64-bit vdso mapping for compat tasks" fixed
64-bit vdso mapping for compat tasks under gdb it introduced another
problem. "compat_mm" flag is not inherited during fork and when
31-bit process forks a child (but does not perform exec) it ends up
with 64-bit vdso. To address that, init_new_context (which is called
during fork and exec) now initialize compat_mm based on thread TIF_31BIT
flag. Later compat_mm is adjusted in arch_setup_additional_pages, which
is called during exec.

Fixes: d1befa6582 ("s390/vdso: avoid 64-bit vdso mapping for compat tasks")
Reported-by: Stefan Liebler <stli@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@vger.kernel.org> # v4.20+
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-11 17:12:02 +01:00
Gerald Schaefer
b7cb707c37 s390/smp: fix CPU hotplug deadlock with CPU rescan
smp_rescan_cpus() is called without the device_hotplug_lock, which can lead
to a dedlock when a new CPU is found and immediately set online by a udev
rule.

This was observed on an older kernel version, where the cpu_hotplug_begin()
loop was still present, and it resulted in hanging chcpu and systemd-udev
processes. This specific deadlock will not show on current kernels. However,
there may be other possible deadlocks, and since smp_rescan_cpus() can still
trigger a CPU hotplug operation, the device_hotplug_lock should be held.

For reference, this was the deadlock with the old cpu_hotplug_begin() loop:

        chcpu (rescan)                       systemd-udevd

 echo 1 > /sys/../rescan
 -> smp_rescan_cpus()
 -> (*) get_online_cpus()
    (increases refcount)
 -> smp_add_present_cpu()
    (new CPU found)
 -> register_cpu()
 -> device_add()
 -> udev "add" event triggered -----------> udev rule sets CPU online
                                         -> echo 1 > /sys/.../online
                                         -> lock_device_hotplug_sysfs()
                                            (this is missing in rescan path)
                                         -> device_online()
                                         -> (**) device_lock(new CPU dev)
                                         -> cpu_up()
                                         -> cpu_hotplug_begin()
                                            (loops until refcount == 0)
                                            -> deadlock with (*)
 -> bus_probe_device()
 -> device_attach()
 -> device_lock(new CPU dev)
    -> deadlock with (**)

Fix this by taking the device_hotplug_lock in the CPU rescan path.

Cc: <stable@vger.kernel.org>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-11 17:12:02 +01:00
Martin Schwidefsky
a38662084c s390/mm: always force a load of the primary ASCE on context switch
The ASCE of an mm_struct can be modified after a task has been created,
e.g. via crst_table_downgrade for a compat process. The active_mm logic
to avoid the switch_mm call if the next task is a kernel thread can
lead to a situation where switch_mm is called where 'prev == next' is
true but 'prev->context.asce == next->context.asce' is not.

This can lead to a situation where a CPU uses the outdated ASCE to run
a task. The result can be a crash, endless loops and really subtle
problem due to TLBs being created with an invalid ASCE.

Cc: stable@kernel.org # v3.15+
Fixes: 53e857f308 ("s390/mm,tlb: race of lazy TLB flush vs. recreation")
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-11 17:12:02 +01:00
Christian Borntraeger
03aa047ef2 s390/early: improve machine detection
Right now the early machine detection code check stsi 3.2.2 for "KVM"
and set MACHINE_IS_VM if this is different. As the console detection
uses diagnose 8 if MACHINE_IS_VM returns true this will crash Linux
early for any non z/VM system that sets a different value than KVM.
So instead of assuming z/VM, do not set any of MACHINE_IS_LPAR,
MACHINE_IS_VM, or MACHINE_IS_KVM.

CC: stable@vger.kernel.org
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-11 17:12:02 +01:00
Linus Torvalds
85e1ffbd42 Kbuild late updates for v4.21
- improve boolinit.cocci and use_after_iter.cocci semantic patches
 
 - fix alignment for kallsyms
 
 - move 'asm goto' compiler test to Kconfig and clean up jump_label
   CONFIG option
 
 - generate asm-generic wrappers automatically if arch does not implement
   mandatory UAPI headers
 
 - remove redundant generic-y defines
 
 - misc cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJcMV5GAAoJED2LAQed4NsGs9gQAI/oGg8wJgk9a7+dJCX245W5
 F4ReftnQd4AFptFCi9geJkr+sfViXNgwPLqlJxiXz8Qe8XP7z3LcArDw3FUzwvGn
 bMSBiN9ggwWkOFgF523XesYgUVtcLpkNch/Migzf1Ac0FHk0G9o7gjcdsvAWHkUu
 qFwtNcUB6PElRbhsHsh5qCY1/6HaAXgf/7O7wztnaKRe9myN6f2HzT4wANS9HHde
 1e1r0LcIQeGWfG+3va3fZl6SDxSI/ybl244OcDmDyYl6RA1skSDlHbIBIFgUPoS0
 cLyzoVj+GkfI1fRFEIfou+dj7lpukoAXHsggHo0M+ofqtbMF+VB2T3jvg4txanCP
 TXzDc+04QUguK5yVnBfcnyC64Htrhnbq0eGy43kd1VZWAEGApl+680P8CRsWU3ZV
 kOiFvZQ6RP/Ssw+a42yU3SHr31WD7feuQqHU65osQt4rdyL5wnrfU1vaUvJSkltF
 cyPr9Kz/Ism0kPodhpFkuKxwtlKOw6/uwdCQoQHtxAPkvkcydhYx93x3iE0nxObS
 CRMximiRyE12DOcv/3uv69n0JOPn6AsITcMNp8XryASYrR2/52txhGKGhvo3+Zoq
 5pwc063JsuxJ/5/dcOw/erQar5d1eBRaBJyEWnXroxUjbsLPAznE+UIN8tmvyVly
 SunlxNOXBdYeWN6t6S3H
 =I+r6
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v4.21-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull more Kbuild updates from Masahiro Yamada:

 - improve boolinit.cocci and use_after_iter.cocci semantic patches

 - fix alignment for kallsyms

 - move 'asm goto' compiler test to Kconfig and clean up jump_label
   CONFIG option

 - generate asm-generic wrappers automatically if arch does not
   implement mandatory UAPI headers

 - remove redundant generic-y defines

 - misc cleanups

* tag 'kbuild-v4.21-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kconfig: rename generated .*conf-cfg to *conf-cfg
  kbuild: remove unnecessary stubs for archheader and archscripts
  kbuild: use assignment instead of define ... endef for filechk_* rules
  arch: remove redundant UAPI generic-y defines
  kbuild: generate asm-generic wrappers if mandatory headers are missing
  arch: remove stale comments "UAPI Header export list"
  riscv: remove redundant kernel-space generic-y
  kbuild: change filechk to surround the given command with { }
  kbuild: remove redundant target cleaning on failure
  kbuild: clean up rule_dtc_dt_yaml
  kbuild: remove UIMAGE_IN and UIMAGE_OUT
  jump_label: move 'asm goto' support test to Kconfig
  kallsyms: lower alignment on ARM
  scripts: coccinelle: boolinit: drop warnings on named constants
  scripts: coccinelle: check for redeclaration
  kconfig: remove unused "file" field of yylval union
  nds32: remove redundant kernel-space generic-y
  nios2: remove unneeded HAS_DMA define
2019-01-06 16:33:10 -08:00
Linus Torvalds
926b02d3eb pci-v4.21-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlwtMCIUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwQUQ/+P5/VDpo4abjudGO2c7FU1bJOwvfN
 cfV5dvDCw0kpx0Em5SmnpAD7Punllxxvb/04K75lqarGyx/Txqaw+lbIF+qSj6my
 GsQ16Iy8T48x5hr+Pf6vTh1eE+NaAVZfOPDOt7CyTNAgwfzHeVNyfNvz7pfKTIIJ
 Mk/jRE4kkeWo60jsY5p3sFo3OVOxBOsRdN+2sruaQuWFXrKHLyNDR+7Z9ZPxubFk
 cCO/TYPhNXmmKhCAR4V/rGiqz9OL2wyFixGhGhmD3tnC9nAb/wTMzjARsyBopBPi
 b/KkR2eLFEyXN0HJrwqxiURo4J3nveAYEuNXH5KjRBQZnoBCGSCIlqFhlrp9vdBk
 B4KIdT8h/M6LsVGeVSEIxIEXCp67YE31kxraFrk4Vsggdh2TFQ0llh1sajj8IFJB
 XekutedAOlTSOaM1/jvVPUJYg04X90bp3uXn3IU45XlQ8nBOG3immFVITRLkvd3w
 ywH+SEdeZAhWl3RGy8SHhqdeCJ7nNQbcRaRJ5CoWJBDNJTBGF1X+zJD2Swi6H9vA
 nWGNRlb3CPPIMPF127nADnOE7Cj2FlpAEIEu52HpcpIrhEdrGvLkGeQfgdWBjbyU
 aHwC04bLWnvsA9SEFVnuMIBaFQmJ1RuaWAHdtscyyO2uoeCtN8Aa+BX6jXFbVZQN
 9eFzpiv0kUiXlAQ=
 =g1ia
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.21-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:

 - Remove unused lists from ASPM pcie_link_state (Frederick Lawler)

 - Fix Broadcom CNB20LE host bridge unintended sign extension (Colin Ian
   King)

 - Expand Kconfig "PF" acronyms (Randy Dunlap)

 - Update MAINTAINERS for arch/x86/kernel/early-quirks.c (Bjorn Helgaas)

 - Add missing include to drivers/pci.h (Alexandru Gagniuc)

 - Override Synopsys USB 3.x HAPS device class so dwc3-haps can claim it
   instead of xhci (Thinh Nguyen)

 - Clean up P2PDMA documentation (Randy Dunlap)

 - Allow runtime PM even if driver doesn't supply callbacks (Jarkko
   Nikula)

 - Remove status check after submitting Switchtec MRPC Firmware Download
   commands to avoid Completion Timeouts (Kelvin Cao)

 - Set Switchtec coherent DMA mask to allow 64-bit DMA (Boris Glimcher)

 - Fix Switchtec SWITCHTEC_IOCTL_EVENT_IDX_ALL flag overwrite issue
   (Joey Zhang)

 - Enable write combining for Switchtec MRPC Input buffers (Kelvin Cao)

 - Add Switchtec MRPC DMA mode support (Wesley Sheng)

 - Skip VF scanning on powerpc, which does this in firmware (Sebastian
   Ott)

 - Add Amlogic Meson PCIe controller driver and DT bindings (Yue Wang)

 - Constify histb dw_pcie_host_ops structure (Julia Lawall)

 - Support multiple power domains for imx6 (Leonard Crestez)

 - Constify layerscape driver data (Stefan Agner)

 - Update imx6 Kconfig to allow imx6 PCIe in imx7 kernel (Trent Piepho)

 - Support armada8k GPIO reset (Baruch Siach)

 - Support suspend/resume support on imx6 (Leonard Crestez)

 - Don't hard-code DesignWare DBI/ATU offst (Stephen Warren)

 - Skip i.MX6 PHY setup on i.MX7D (Andrey Smirnov)

 - Remove Jianguo Sun from HiSilicon STB maintainers (Lorenzo Pieralisi)

 - Mask DesignWare interrupts instead of disabling them to avoid lost
   interrupts (Marc Zyngier)

 - Add locking when acking DesignWare interrupts (Marc Zyngier)

 - Ack DesignWare interrupts in the proper callbacks (Marc Zyngier)

 - Use devm resource parser in mediatek (Honghui Zhang)

 - Remove unused mediatek "num-lanes" DT property (Honghui Zhang)

 - Add UniPhier PCIe controller driver and DT bindings (Kunihiko
   Hayashi)

 - Enable MSI for imx6 downstream components (Richard Zhu)

* tag 'pci-v4.21-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (40 commits)
  PCI: imx: Enable MSI from downstream components
  s390/pci: skip VF scanning
  PCI/IOV: Add flag so platforms can skip VF scanning
  PCI/IOV: Factor out sriov_add_vfs()
  PCI: uniphier: Add UniPhier PCIe host controller support
  dt-bindings: PCI: Add UniPhier PCIe host controller description
  PCI: amlogic: Add the Amlogic Meson PCIe controller driver
  dt-bindings: PCI: meson: add DT bindings for Amlogic Meson PCIe controller
  arm64: dts: mt7622: Remove un-used property for PCIe
  arm: dts: mt7623: Remove un-used property for PCIe
  dt-bindings: PCI: MediaTek: Remove un-used property
  PCI: mediatek: Remove un-used variant in struct mtk_pcie_port
  MAINTAINERS: Remove Jianguo Sun from HiSilicon STB DWC entry
  PCI: dwc: Don't hard-code DBI/ATU offset
  PCI: imx: Add imx6sx suspend/resume support
  PCI: armada8k: Add support for gpio controlled reset signal
  PCI: dwc: Adjust Kconfig to allow IMX6 PCIe host on IMX7
  PCI: dwc: layerscape: Constify driver data
  PCI: imx: Add multi-pd support
  PCI: Override Synopsys USB 3.x HAPS device class
  ...
2019-01-05 17:57:34 -08:00
Masahiro Yamada
ba97df4558 kbuild: use assignment instead of define ... endef for filechk_* rules
You do not have to use define ... endef for filechk_* rules.

For simple cases, the use of assignment looks cleaner, IMHO.

I updated the usage for scripts/Kbuild.include in case somebody
misunderstands the 'define ... endif' is the requirement.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-01-06 10:22:35 +09:00
Masahiro Yamada
d6e4b3e326 arch: remove redundant UAPI generic-y defines
Now that Kbuild automatically creates asm-generic wrappers for missing
mandatory headers, it is redundant to list the same headers in
generic-y and mandatory-y.

Suggested-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
2019-01-06 10:22:15 +09:00
Masahiro Yamada
d4ce5458ea arch: remove stale comments "UAPI Header export list"
These comments are leftovers of commit fcc8487d47 ("uapi: export all
headers under uapi directories").

Prior to that commit, exported headers must be explicitly added to
header-y. Now, all headers under the uapi/ directories are exported.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-01-06 09:46:51 +09:00
Masahiro Yamada
ad77408635 kbuild: change filechk to surround the given command with { }
filechk_* rules often consist of multiple 'echo' lines. They must be
surrounded with { } or ( ) to work correctly. Otherwise, only the
string from the last 'echo' would be written into the target.

Let's take care of that in the 'filechk' in scripts/Kbuild.include
to clean up filechk_* rules.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-01-06 09:46:51 +09:00
Masahiro Yamada
e9666d10a5 jump_label: move 'asm goto' support test to Kconfig
Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label".

The jump label is controlled by HAVE_JUMP_LABEL, which is defined
like this:

  #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
  # define HAVE_JUMP_LABEL
  #endif

We can improve this by testing 'asm goto' support in Kconfig, then
make JUMP_LABEL depend on CC_HAS_ASM_GOTO.

Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will
match to the real kernel capability.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
2019-01-06 09:46:51 +09:00
Linus Torvalds
a65981109f Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - procfs updates

 - various misc bits

 - lib/ updates

 - epoll updates

 - autofs

 - fatfs

 - a few more MM bits

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (58 commits)
  mm/page_io.c: fix polled swap page in
  checkpatch: add Co-developed-by to signature tags
  docs: fix Co-Developed-by docs
  drivers/base/platform.c: kmemleak ignore a known leak
  fs: don't open code lru_to_page()
  fs/: remove caller signal_pending branch predictions
  mm/: remove caller signal_pending branch predictions
  arch/arc/mm/fault.c: remove caller signal_pending_branch predictions
  kernel/sched/: remove caller signal_pending branch predictions
  kernel/locking/mutex.c: remove caller signal_pending branch predictions
  mm: select HAVE_MOVE_PMD on x86 for faster mremap
  mm: speed up mremap by 20x on large regions
  mm: treewide: remove unused address argument from pte_alloc functions
  initramfs: cleanup incomplete rootfs
  scripts/gdb: fix lx-version string output
  kernel/kcov.c: mark write_comp_data() as notrace
  kernel/sysctl: add panic_print into sysctl
  panic: add options to print system info when panic happens
  bfs: extra sanity checking and static inode bitmap
  exec: separate MM_ANONPAGES and RLIMIT_STACK accounting
  ...
2019-01-05 09:16:18 -08:00
Joel Fernandes (Google)
4cf5892495 mm: treewide: remove unused address argument from pte_alloc functions
Patch series "Add support for fast mremap".

This series speeds up the mremap(2) syscall by copying page tables at
the PMD level even for non-THP systems.  There is concern that the extra
'address' argument that mremap passes to pte_alloc may do something
subtle architecture related in the future that may make the scheme not
work.  Also we find that there is no point in passing the 'address' to
pte_alloc since its unused.  This patch therefore removes this argument
tree-wide resulting in a nice negative diff as well.  Also ensuring
along the way that the enabled architectures do not do anything funky
with the 'address' argument that goes unnoticed by the optimization.

Build and boot tested on x86-64.  Build tested on arm64.  The config
enablement patch for arm64 will be posted in the future after more
testing.

The changes were obtained by applying the following Coccinelle script.
(thanks Julia for answering all Coccinelle questions!).
Following fix ups were done manually:
* Removal of address argument from  pte_fragment_alloc
* Removal of pte_alloc_one_fast definitions from m68k and microblaze.

// Options: --include-headers --no-includes
// Note: I split the 'identifier fn' line, so if you are manually
// running it, please unsplit it so it runs for you.

virtual patch

@pte_alloc_func_def depends on patch exists@
identifier E2;
identifier fn =~
"^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
type T2;
@@

 fn(...
- , T2 E2
 )
 { ... }

@pte_alloc_func_proto_noarg depends on patch exists@
type T1, T2, T3, T4;
identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
@@

(
- T3 fn(T1, T2);
+ T3 fn(T1);
|
- T3 fn(T1, T2, T4);
+ T3 fn(T1, T2);
)

@pte_alloc_func_proto depends on patch exists@
identifier E1, E2, E4;
type T1, T2, T3, T4;
identifier fn =~
"^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
@@

(
- T3 fn(T1 E1, T2 E2);
+ T3 fn(T1 E1);
|
- T3 fn(T1 E1, T2 E2, T4 E4);
+ T3 fn(T1 E1, T2 E2);
)

@pte_alloc_func_call depends on patch exists@
expression E2;
identifier fn =~
"^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
@@

 fn(...
-,  E2
 )

@pte_alloc_macro depends on patch exists@
identifier fn =~
"^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
identifier a, b, c;
expression e;
position p;
@@

(
- #define fn(a, b, c) e
+ #define fn(a, b) e
|
- #define fn(a, b) e
+ #define fn(a) e
)

Link: http://lkml.kernel.org/r/20181108181201.88826-2-joelaf@google.com
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04 13:13:47 -08:00
Matthew Wilcox
3fc2579e6f fls: change parameter to unsigned int
When testing in userspace, UBSAN pointed out that shifting into the sign
bit is undefined behaviour.  It doesn't really make sense to ask for the
highest set bit of a negative value, so just turn the argument type into
an unsigned int.

Some architectures (eg ppc) already had it declared as an unsigned int,
so I don't expect too many problems.

Link: http://lkml.kernel.org/r/20181105221117.31828-1-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04 13:13:46 -08:00
Linus Torvalds
96d4f267e4 Remove 'type' argument from access_ok() function
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.

It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access.  But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.

A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model.  And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.

This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.

There were a couple of notable cases:

 - csky still had the old "verify_area()" name as an alias.

 - the iter_iov code had magical hardcoded knowledge of the actual
   values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
   really used it)

 - microblaze used the type argument for a debug printout

but other than those oddities this should be a total no-op patch.

I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something.  Any missed conversion should be trivially fixable, though.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-03 18:57:57 -08:00
Linus Torvalds
04a17edeca s390 updates for the 4.21 merge window
- A larger update for the zcrypt / AP bus code
    + Update two inline assemblies in the zcrypt driver to make gcc happy
    + Add a missing reply code for invalid special commands for zcrypt
    + Allow AP device reset to be triggered from user space
    + Split the AP scan function into smaller, more readable functions
 
  - Updates for vfio-ccw and vfio-ap
    + Add maintainers and reviewer for vfio-ccw
    + Include facility.h in vfio_ap_drv.c to avoid fragile include chain
    + Simplicy vfio-ccw state machine
 
  - Use the common code version of bust_spinlocks
 
  - Make use of the DEFINE_SHOW_ATTRIBUTE
 
  - Fix three incorrect file permissions in the DASD driver
 
  - Remove bit spin-lock from the PCI interrupt handler
 
  - Fix GFP_ATOMIC vs GFP_KERNEL in the PCI code
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJcLGoIAAoJEDjwexyKj9rgyN8IANaQvHbVBA3vz/Ssb6ZiR/K6
 rTBoXjJQqyJ/cf6RZeFi1b4Douv4QWJw3s06KXbrdmK/ONm5rypXVfXlAhY71pg5
 40BUb92MGXhJw6JFDQ50Udd6Z5r7r6RYR1puyg4tzHmBuNVL7FB5RqFm92UOkMOD
 ZI03G1sfA6/1XUKhNfCfNBB6Jt6V+iAAex8bgrp09wAeoGnAO20oFuis9u7pLlNm
 a5Cp9n7faXEN+qes1iBtVDr5o7opuhanwWKnhvsYTAbpOo7jGJ/47IPKT2Wfmurd
 wkMZBEC+Ntk/IfkaBzp7azeISZD5EbucTcgo/I9nzq/aWeflfXXeYl7My0aQB48=
 =Lqrh
 -----END PGP SIGNATURE-----

Merge tag 's390-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Martin Schwidefsky:

 - A larger update for the zcrypt / AP bus code:
    + Update two inline assemblies in the zcrypt driver to make gcc happy
    + Add a missing reply code for invalid special commands for zcrypt
    + Allow AP device reset to be triggered from user space
    + Split the AP scan function into smaller, more readable functions

 - Updates for vfio-ccw and vfio-ap
    + Add maintainers and reviewer for vfio-ccw
    + Include facility.h in vfio_ap_drv.c to avoid fragile include chain
    + Simplicy vfio-ccw state machine

 - Use the common code version of bust_spinlocks

 - Make use of the DEFINE_SHOW_ATTRIBUTE

 - Fix three incorrect file permissions in the DASD driver

 - Remove bit spin-lock from the PCI interrupt handler

 - Fix GFP_ATOMIC vs GFP_KERNEL in the PCI code

* tag 's390-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/zcrypt: rework ap scan bus code
  s390/zcrypt: make sysfs reset attribute trigger queue reset
  s390/pci: fix sleeping in atomic during hotplug
  s390/pci: remove bit_lock usage in interrupt handler
  s390/drivers: fix proc/debugfs file permissions
  s390: convert to DEFINE_SHOW_ATTRIBUTE
  MAINTAINERS/vfio-ccw: add Farhan and Eric, make Halil Reviewer
  vfio: ccw: Merge BUSY and BOXED states
  s390: use common bust_spinlocks()
  s390/zcrypt: improve special ap message cmd handling
  s390/ap: rework assembler functions to use unions for in/out register variables
  s390: vfio-ap: include <asm/facility> for test_facility()
2019-01-02 18:37:01 -08:00
Linus Torvalds
d9a7fa67b4 Merge branch 'next-seccomp' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull seccomp updates from James Morris:

 - Add SECCOMP_RET_USER_NOTIF

 - seccomp fixes for sparse warnings and s390 build (Tycho)

* 'next-seccomp' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  seccomp, s390: fix build for syscall type change
  seccomp: fix poor type promotion
  samples: add an example of seccomp user trap
  seccomp: add a return code to trap to userspace
  seccomp: switch system call argument type to void *
  seccomp: hoist struct seccomp_data recalculation higher
2019-01-02 09:48:13 -08:00
Sebastian Ott
7dc20ab1b9 s390/pci: skip VF scanning
Set the flag to skip scanning for VFs after SR-IOV enablement.  VF creation
will be triggered by the hotplug code.

Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-01-01 19:06:48 -06:00
Linus Torvalds
195303136f Kconfig file consolidation for v4.21
Consolidation of bus (PCI, PCMCIA, EISA, RapidIO) config entries
 by Christoph Hellwig.
 
 Currently, every architecture that wants to provide common peripheral
 busses needs to add some boilerplate code and include the right Kconfig
 files. This series instead just selects the presence (when needed) and
 then handles everything in the bus-specific Kconfig file under drivers/.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJcJilwAAoJED2LAQed4NsGt1YP/RMTEUqbCSwS/CnTLrE+aVTC
 O2aWwB80ZlVwpeBbHLW5/M88OvOev0UaCr+gyzgpFRl5ITzS7Jevb8VbpGzblbH7
 bFxIEyZFGQiy9oEWw3Lfu9JRSsLm3jNo7hkmdBSn2Rw3KkEd/YF7K3q9GuA7BpCS
 ZxAirebvEpr4KYEzkuc57NqCYx2Tc8G+JWr5D7pZCFaq9vxYt3TddGqw/c7iQVSQ
 1Og1809IdhGyCSlA/ExfaqaBMaJHMRAOHX5GgkqZw1EbFcizUFhAAsKCrGL5nBtX
 NiWF9jhgHR1M+L69jfctOstrmGQD2KicNgWQf1aS5RQkPfjuqIKGT/i9g6J1pVyX
 TaW1J36Hcl8PpsKoPBnnrixd1T41O3/PuqtEJRm7LCBYOQiwS9sEmLO09RDRjER8
 SPAAyvkhE8oq+0RHiTYN4tm8dyJc1djZ5wzgLnwFPAnU6SR+mbN02RzBMsYZXD+x
 RNbBSGBRJFQDBw6Rn+ktcIQvcKYmUqe1k1YNHMy6kG3QqvhBaDy+8PA/YjIKPQYQ
 B/NNUAMEJMys1OQrRL2UDXb2ysaCpzwMmlrBW2IwYsQrX5OwbPkNuQ5Mbe1Lr+mc
 4NXR+HubvojsHaAby+OhFbrUX2Jcz3wqYj7aannb9sMRmw0VJXV5dPYUqje3ZhPS
 P2AovKT8O9nWsEttqER5
 =WxId
 -----END PGP SIGNATURE-----

Merge tag 'kconfig-v4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kconfig file consolidation from Masahiro Yamada:
 "Consolidation of bus (PCI, PCMCIA, EISA, RapidIO) config entries by
  Christoph Hellwig.

  Currently, every architecture that wants to provide common peripheral
  busses needs to add some boilerplate code and include the right
  Kconfig files. This series instead just selects the presence (when
  needed) and then handles everything in the bus-specific Kconfig file
  under drivers/"

* tag 'kconfig-v4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  pcmcia: remove per-arch PCMCIA config entry
  eisa: consolidate EISA Kconfig entry in drivers/eisa
  rapidio: consolidate RAPIDIO config entry in drivers/rapidio
  pcmcia: allow PCMCIA support independent of the architecture
  PCI: consolidate the PCI_SYSCALL symbol
  PCI: consolidate the PCI_DOMAINS and PCI_DOMAINS_GENERIC config options
  PCI: consolidate PCI config entry in drivers/pci
  MIPS: remove the HT_PCI config option
2018-12-29 13:40:29 -08:00
Linus Torvalds
769e47094d Kconfig updates for v4.21
- support -y option for merge_config.sh to avoid downgrading =y to =m
 
  - remove S_OTHER symbol type, and touch include/config/*.h files correctly
 
  - fix file name and line number in lexer warnings
 
  - fix memory leak when EOF is encountered in quotation
 
  - resolve all shift/reduce conflicts of the parser
 
  - warn no new line at end of file
 
  - make 'source' statement more strict to take only string literal
 
  - rewrite the lexer and remove the keyword lookup table
 
  - convert to SPDX License Identifier
 
  - compile C files independently instead of including them from zconf.y
 
  - fix various warnings of gconfig
 
  - misc cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJcJieuAAoJED2LAQed4NsGHlIP/1s0fQ86XD9dIMyHzAO0gh2f
 7rylfe2kEXJgIzJ0DyZdLu4iZtwbkEUqTQrRS1abriNGVemPkfBAnZdM5d92lOQX
 3iREa700AJ2xo7V7gYZ6AbhZoG3p0S9U9Q2qE5S+tFTe8c2Gy4xtjnODF+Vel85r
 S0P8tF5sE1/d00lm+yfMI/CJVfDjyNaMm+aVEnL0kZTPiRkaktjWgo6Fc2p4z1L5
 HFmMMP6/iaXmRZ+tHJGPQ2AT70GFVZw5ePxPcl50EotUP25KHbuUdzs8wDpYm3U/
 rcESVsIFpgqHWmTsdBk6dZk0q8yFZNkMlkaP/aYukVZpUn/N6oAXgTFckYl8dmQL
 fQBkQi6DTfr9EBPVbj18BKm7xI3Y4DdQ2fzTfYkJ2XwNRGFA5r9N3sjd7ZTVGjxC
 aeeMHCwvGdSx1x8PeZAhZfsUHW8xVDMSQiT713+ljBY+6cwzA+2NF0kP7B6OAqwr
 ETFzd4Xu2/lZcL7gQRH8WU3L2S5iedmDG6RnZgJMXI0/9V4qAA+nlsWaCgnl1TgA
 mpxYlLUMrd6AUJevE34FlnyFdk8IMn9iKRFsvF0f3doO5C7QzTVGqFdJu5a0CuWO
 4NBJvZjFT8/4amoWLfnDlfApWXzTfwLbKG+r6V2F30fLuXpYg5LxWhBoGRPYLZSq
 oi4xN1Mpx3TvXz6WcKVZ
 =r3Fl
 -----END PGP SIGNATURE-----

Merge tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kconfig updates from Masahiro Yamada:

 - support -y option for merge_config.sh to avoid downgrading =y to =m

 - remove S_OTHER symbol type, and touch include/config/*.h files correctly

 - fix file name and line number in lexer warnings

 - fix memory leak when EOF is encountered in quotation

 - resolve all shift/reduce conflicts of the parser

 - warn no new line at end of file

 - make 'source' statement more strict to take only string literal

 - rewrite the lexer and remove the keyword lookup table

 - convert to SPDX License Identifier

 - compile C files independently instead of including them from zconf.y

 - fix various warnings of gconfig

 - misc cleanups

* tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
  kconfig: surround dbg_sym_flags with #ifdef DEBUG to fix gconf warning
  kconfig: split images.c out of qconf.cc/gconf.c to fix gconf warnings
  kconfig: add static qualifiers to fix gconf warnings
  kconfig: split the lexer out of zconf.y
  kconfig: split some C files out of zconf.y
  kconfig: convert to SPDX License Identifier
  kconfig: remove keyword lookup table entirely
  kconfig: update current_pos in the second lexer
  kconfig: switch to ASSIGN_VAL state in the second lexer
  kconfig: stop associating kconf_id with yylval
  kconfig: refactor end token rules
  kconfig: stop supporting '.' and '/' in unquoted words
  treewide: surround Kconfig file paths with double quotes
  microblaze: surround string default in Kconfig with double quotes
  kconfig: use T_WORD instead of T_VARIABLE for variables
  kconfig: use specific tokens instead of T_ASSIGN for assignments
  kconfig: refactor scanning and parsing "option" properties
  kconfig: use distinct tokens for type and default properties
  kconfig: remove redundant token defines
  kconfig: rename depends_list to comment_option_list
  ...
2018-12-29 13:03:29 -08:00
Linus Torvalds
f346b0becb Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - large KASAN update to use arm's "software tag-based mode"

 - a few misc things

 - sh updates

 - ocfs2 updates

 - just about all of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (167 commits)
  kernel/fork.c: mark 'stack_vm_area' with __maybe_unused
  memcg, oom: notify on oom killer invocation from the charge path
  mm, swap: fix swapoff with KSM pages
  include/linux/gfp.h: fix typo
  mm/hmm: fix memremap.h, move dev_page_fault_t callback to hmm
  hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race
  hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
  memory_hotplug: add missing newlines to debugging output
  mm: remove __hugepage_set_anon_rmap()
  include/linux/vmstat.h: remove unused page state adjustment macro
  mm/page_alloc.c: allow error injection
  mm: migrate: drop unused argument of migrate_page_move_mapping()
  blkdev: avoid migration stalls for blkdev pages
  mm: migrate: provide buffer_migrate_page_norefs()
  mm: migrate: move migrate_page_lock_buffers()
  mm: migrate: lock buffers before migrate_page_move_mapping()
  mm: migration: factor out code to compute expected number of page references
  mm, page_alloc: enable pcpu_drain with zone capability
  kmemleak: add config to select auto scan
  mm/page_alloc.c: don't call kasan_free_pages() at deferred mem init
  ...
2018-12-28 16:55:46 -08:00
Linus Torvalds
af7ddd8a62 DMA mapping updates for Linux 4.21
A huge update this time, but a lot of that is just consolidating or
 removing code:
 
  - provide a common DMA_MAPPING_ERROR definition and avoid indirect
    calls for dma_map_* error checking
  - use direct calls for the DMA direct mapping case, avoiding huge
    retpoline overhead for high performance workloads
  - merge the swiotlb dma_map_ops into dma-direct
  - provide a generic remapping DMA consistent allocator for architectures
    that have devices that perform DMA that is not cache coherent. Based
    on the existing arm64 implementation and also used for csky now.
  - improve the dma-debug infrastructure, including dynamic allocation
    of entries (Robin Murphy)
  - default to providing chaining scatterlist everywhere, with opt-outs
    for the few architectures (alpha, parisc, most arm32 variants) that
    can't cope with it
  - misc sparc32 dma-related cleanups
  - remove the dma_mark_clean arch hook used by swiotlb on ia64 and
    replace it with the generic noncoherent infrastructure
  - fix the return type of dma_set_max_seg_size (Niklas Söderlund)
  - move the dummy dma ops for not DMA capable devices from arm64 to
    common code (Robin Murphy)
  - ensure dma_alloc_coherent returns zeroed memory to avoid kernel data
    leaks through userspace.  We already did this for most common
    architectures, but this ensures we do it everywhere.
    dma_zalloc_coherent has been deprecated and can hopefully be
    removed after -rc1 with a coccinelle script.
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlwctQgLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYMxgQ//dBpAfS4/J76CdAbYry2zqgcOUU9hIrD6NHiEMWov
 ltJxyvEl3LsUmIdEj3aCrYL9jZN0qsnCzn5BVj2c3jDIVgD64fAr7HDf/PbEEfKb
 j6/GgEnVLPZV+sQMvhNA5jOzHrkseaqPa4/pNLFZ/l8jnuZ2d+btusDWJpMoVDer
 TXVwtIfgeIu0gTygYOShLYXd5qptWKWsZEpbTZOO2sE6+x+ZJX7yQYUxYDTlcOIj
 JWVO2l5QNHPc5T9o2at+6L5aNUvnZOxT79sWgyZLn0Kc+FagKAVwfLqUEl0v7foG
 8k/xca5/8p3afB1DfrIrtplJqis7cVgdyGxriwuuoO8X4F0nPyWwpGmxsBhrWwwl
 xTqC4UorEJ7QwoP6Azopk/vYI2QXIUBLjuCJCuFXZj9+2BGf4IfvBY1S2cLM9qLs
 HMcxQonuXJii044KEFS96ePEuiT+igVINweIFBKWcgNCEG0UQtyL6RQ1U5297ipF
 JiWZAqD+p9X52UdKS+oKfAiZEekMXn6Xyo97+YCiNpfOo0GP5eEcwhL+JpY4AiRq
 apPXtsRy2o1s8yfjdraUIM2Mc2n62vFKb35oUbGCd/QO9piPrFQHl6T0HHcHk4YR
 XrUXcHieFZBCYqh7ZVa4RL8Msq1wvGuTL4Dxl43mXdsMoUFRR6eSNWLoAV4IpOLZ
 WgA=
 =in72
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping

Pull DMA mapping updates from Christoph Hellwig:
 "A huge update this time, but a lot of that is just consolidating or
  removing code:

   - provide a common DMA_MAPPING_ERROR definition and avoid indirect
     calls for dma_map_* error checking

   - use direct calls for the DMA direct mapping case, avoiding huge
     retpoline overhead for high performance workloads

   - merge the swiotlb dma_map_ops into dma-direct

   - provide a generic remapping DMA consistent allocator for
     architectures that have devices that perform DMA that is not cache
     coherent. Based on the existing arm64 implementation and also used
     for csky now.

   - improve the dma-debug infrastructure, including dynamic allocation
     of entries (Robin Murphy)

   - default to providing chaining scatterlist everywhere, with opt-outs
     for the few architectures (alpha, parisc, most arm32 variants) that
     can't cope with it

   - misc sparc32 dma-related cleanups

   - remove the dma_mark_clean arch hook used by swiotlb on ia64 and
     replace it with the generic noncoherent infrastructure

   - fix the return type of dma_set_max_seg_size (Niklas Söderlund)

   - move the dummy dma ops for not DMA capable devices from arm64 to
     common code (Robin Murphy)

   - ensure dma_alloc_coherent returns zeroed memory to avoid kernel
     data leaks through userspace. We already did this for most common
     architectures, but this ensures we do it everywhere.
     dma_zalloc_coherent has been deprecated and can hopefully be
     removed after -rc1 with a coccinelle script"

* tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping: (73 commits)
  dma-mapping: fix inverted logic in dma_supported
  dma-mapping: deprecate dma_zalloc_coherent
  dma-mapping: zero memory returned from dma_alloc_*
  sparc/iommu: fix ->map_sg return value
  sparc/io-unit: fix ->map_sg return value
  arm64: default to the direct mapping in get_arch_dma_ops
  PCI: Remove unused attr variable in pci_dma_configure
  ia64: only select ARCH_HAS_DMA_COHERENT_TO_PFN if swiotlb is enabled
  dma-mapping: bypass indirect calls for dma-direct
  vmd: use the proper dma_* APIs instead of direct methods calls
  dma-direct: merge swiotlb_dma_ops into the dma_direct code
  dma-direct: use dma_direct_map_page to implement dma_direct_map_sg
  dma-direct: improve addressability error reporting
  swiotlb: remove dma_mark_clean
  swiotlb: remove SWIOTLB_MAP_ERROR
  ACPI / scan: Refactor _CCA enforcement
  dma-mapping: factor out dummy DMA ops
  dma-mapping: always build the direct mapping code
  dma-mapping: move dma_cache_sync out of line
  dma-mapping: move various slow path functions out of line
  ...
2018-12-28 14:12:21 -08:00
Oscar Salvador
2c2a5af6fe mm, memory_hotplug: add nid parameter to arch_remove_memory
Patch series "Do not touch pages in hot-remove path", v2.

This patchset aims for two things:

 1) A better definition about offline and hot-remove stage
 2) Solving bugs where we can access non-initialized pages
    during hot-remove operations [2] [3].

This is achieved by moving all page/zone handling to the offline
stage, so we do not need to access pages when hot-removing memory.

[1] https://patchwork.kernel.org/cover/10691415/
[2] https://patchwork.kernel.org/patch/10547445/
[3] https://www.spinics.net/lists/linux-mm/msg161316.html

This patch (of 5):

This is a preparation for the following-up patches.  The idea of passing
the nid is that it will allow us to get rid of the zone parameter
afterwards.

Link: http://lkml.kernel.org/r/20181127162005.15833-2-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:49 -08:00
Arun KS
ca79b0c211 mm: convert totalram_pages and totalhigh_pages variables to atomic
totalram_pages and totalhigh_pages are made static inline function.

Main motivation was that managed_page_count_lock handling was complicating
things.  It was discussed in length here,
https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seemes
better to remove the lock and convert variables to atomic, with preventing
poteintial store-to-read tearing as a bonus.

[akpm@linux-foundation.org: coding style fixes]
Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.org
Signed-off-by: Arun KS <arunks@codeaurora.org>
Suggested-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:47 -08:00
Andrey Konovalov
9577dd7486 kasan: rename kasan_zero_page to kasan_early_shadow_page
With tag based KASAN mode the early shadow value is 0xff and not 0x00, so
this patch renames kasan_zero_(page|pte|pmd|pud|p4d) to
kasan_early_shadow_(page|pte|pmd|pud|p4d) to avoid confusion.

Link: http://lkml.kernel.org/r/3fed313280ebf4f88645f5b89ccbc066d320e177.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:43 -08:00
Linus Torvalds
b71acb0e37 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Add 1472-byte test to tcrypt for IPsec
   - Reintroduced crypto stats interface with numerous changes
   - Support incremental algorithm dumps

  Algorithms:
   - Add xchacha12/20
   - Add nhpoly1305
   - Add adiantum
   - Add streebog hash
   - Mark cts(cbc(aes)) as FIPS allowed

  Drivers:
   - Improve performance of arm64/chacha20
   - Improve performance of x86/chacha20
   - Add NEON-accelerated nhpoly1305
   - Add SSE2 accelerated nhpoly1305
   - Add AVX2 accelerated nhpoly1305
   - Add support for 192/256-bit keys in gcmaes AVX
   - Add SG support in gcmaes AVX
   - ESN for inline IPsec tx in chcr
   - Add support for CryptoCell 703 in ccree
   - Add support for CryptoCell 713 in ccree
   - Add SM4 support in ccree
   - Add SM3 support in ccree
   - Add support for chacha20 in caam/qi2
   - Add support for chacha20 + poly1305 in caam/jr
   - Add support for chacha20 + poly1305 in caam/qi2
   - Add AEAD cipher support in cavium/nitrox"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (130 commits)
  crypto: skcipher - remove remnants of internal IV generators
  crypto: cavium/nitrox - Fix build with !CONFIG_DEBUG_FS
  crypto: salsa20-generic - don't unnecessarily use atomic walk
  crypto: skcipher - add might_sleep() to skcipher_walk_virt()
  crypto: x86/chacha - avoid sleeping under kernel_fpu_begin()
  crypto: cavium/nitrox - Added AEAD cipher support
  crypto: mxc-scc - fix build warnings on ARM64
  crypto: api - document missing stats member
  crypto: user - remove unused dump functions
  crypto: chelsio - Fix wrong error counter increments
  crypto: chelsio - Reset counters on cxgb4 Detach
  crypto: chelsio - Handle PCI shutdown event
  crypto: chelsio - cleanup:send addr as value in function argument
  crypto: chelsio - Use same value for both channel in single WR
  crypto: chelsio - Swap location of AAD and IV sent in WR
  crypto: chelsio - remove set but not used variable 'kctx_len'
  crypto: ux500 - Use proper enum in hash_set_dma_transfer
  crypto: ux500 - Use proper enum in cryp_set_dma_transfer
  crypto: aesni - Add scatter/gather avx stubs, and use them in C
  crypto: aesni - Introduce partial block macro
  ..
2018-12-27 13:53:32 -08:00
Linus Torvalds
e0c38a4d1f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) New ipset extensions for matching on destination MAC addresses, from
    Stefano Brivio.

 2) Add ipv4 ttl and tos, plus ipv6 flow label and hop limit offloads to
    nfp driver. From Stefano Brivio.

 3) Implement GRO for plain UDP sockets, from Paolo Abeni.

 4) Lots of work from Michał Mirosław to eliminate the VLAN_TAG_PRESENT
    bit so that we could support the entire vlan_tci value.

 5) Rework the IPSEC policy lookups to better optimize more usecases,
    from Florian Westphal.

 6) Infrastructure changes eliminating direct manipulation of SKB lists
    wherever possible, and to always use the appropriate SKB list
    helpers. This work is still ongoing...

 7) Lots of PHY driver and state machine improvements and
    simplifications, from Heiner Kallweit.

 8) Various TSO deferral refinements, from Eric Dumazet.

 9) Add ntuple filter support to aquantia driver, from Dmitry Bogdanov.

10) Batch dropping of XDP packets in tuntap, from Jason Wang.

11) Lots of cleanups and improvements to the r8169 driver from Heiner
    Kallweit, including support for ->xmit_more. This driver has been
    getting some much needed love since he started working on it.

12) Lots of new forwarding selftests from Petr Machata.

13) Enable VXLAN learning in mlxsw driver, from Ido Schimmel.

14) Packed ring support for virtio, from Tiwei Bie.

15) Add new Aquantia AQtion USB driver, from Dmitry Bezrukov.

16) Add XDP support to dpaa2-eth driver, from Ioana Ciocoi Radulescu.

17) Implement coalescing on TCP backlog queue, from Eric Dumazet.

18) Implement carrier change in tun driver, from Nicolas Dichtel.

19) Support msg_zerocopy in UDP, from Willem de Bruijn.

20) Significantly improve garbage collection of neighbor objects when
    the table has many PERMANENT entries, from David Ahern.

21) Remove egdev usage from nfp and mlx5, and remove the facility
    completely from the tree as it no longer has any users. From Oz
    Shlomo and others.

22) Add a NETDEV_PRE_CHANGEADDR so that drivers can veto the change and
    therefore abort the operation before the commit phase (which is the
    NETDEV_CHANGEADDR event). From Petr Machata.

23) Add indirect call wrappers to avoid retpoline overhead, and use them
    in the GRO code paths. From Paolo Abeni.

24) Add support for netlink FDB get operations, from Roopa Prabhu.

25) Support bloom filter in mlxsw driver, from Nir Dotan.

26) Add SKB extension infrastructure. This consolidates the handling of
    the auxiliary SKB data used by IPSEC and bridge netfilter, and is
    designed to support the needs to MPTCP which could be integrated in
    the future.

27) Lots of XDP TX optimizations in mlx5 from Tariq Toukan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1845 commits)
  net: dccp: fix kernel crash on module load
  drivers/net: appletalk/cops: remove redundant if statement and mask
  bnx2x: Fix NULL pointer dereference in bnx2x_del_all_vlans() on some hw
  net/net_namespace: Check the return value of register_pernet_subsys()
  net/netlink_compat: Fix a missing check of nla_parse_nested
  ieee802154: lowpan_header_create check must check daddr
  net/mlx4_core: drop useless LIST_HEAD
  mlxsw: spectrum: drop useless LIST_HEAD
  net/mlx5e: drop useless LIST_HEAD
  iptunnel: Set tun_flags in the iptunnel_metadata_reply from src
  net/mlx5e: fix semicolon.cocci warnings
  staging: octeon: fix build failure with XFRM enabled
  net: Revert recent Spectre-v1 patches.
  can: af_can: Fix Spectre v1 vulnerability
  packet: validate address length if non-zero
  nfc: af_nfc: Fix Spectre v1 vulnerability
  phonet: af_phonet: Fix Spectre v1 vulnerability
  net: core: Fix Spectre v1 vulnerability
  net: minor cleanup in skb_ext_add()
  net: drop the unused helper skb_ext_get()
  ...
2018-12-27 13:04:52 -08:00
Linus Torvalds
792bf4d871 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The biggest RCU changes in this cycle were:

   - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar.

   - Replace calls of RCU-bh and RCU-sched update-side functions to
     their vanilla RCU counterparts. This series is a step towards
     complete removal of the RCU-bh and RCU-sched update-side functions.

     ( Note that some of these conversions are going upstream via their
       respective maintainers. )

   - Documentation updates, including a number of flavor-consolidation
     updates from Joel Fernandes.

   - Miscellaneous fixes.

   - Automate generation of the initrd filesystem used for rcutorture
     testing.

   - Convert spin_is_locked() assertions to instead use lockdep.

     ( Note that some of these conversions are going upstream via their
       respective maintainers. )

   - SRCU updates, especially including a fix from Dennis Krein for a
     bag-on-head-class bug.

   - RCU torture-test updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits)
  rcutorture: Don't do busted forward-progress testing
  rcutorture: Use 100ms buckets for forward-progress callback histograms
  rcutorture: Recover from OOM during forward-progress tests
  rcutorture: Print forward-progress test age upon failure
  rcutorture: Print time since GP end upon forward-progress failure
  rcutorture: Print histogram of CB invocation at OOM time
  rcutorture: Print GP age upon forward-progress failure
  rcu: Print per-CPU callback counts for forward-progress failures
  rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings
  rcutorture: Dump grace-period diagnostics upon forward-progress OOM
  rcutorture: Prepare for asynchronous access to rcu_fwd_startat
  torture: Remove unnecessary "ret" variables
  rcutorture: Affinity forward-progress test to avoid housekeeping CPUs
  rcutorture: Break up too-long rcu_torture_fwd_prog() function
  rcutorture: Remove cbflood facility
  torture: Bring any extra CPUs online during kernel startup
  rcutorture: Add call_rcu() flooding forward-progress tests
  rcutorture/formal: Replace synchronize_sched() with synchronize_rcu()
  tools/kernel.h: Replace synchronize_sched() with synchronize_rcu()
  net/decnet: Replace rcu_barrier_bh() with rcu_barrier()
  ...
2018-12-26 13:07:19 -08:00
Linus Torvalds
42b00f122c * ARM: selftests improvements, large PUD support for HugeTLB,
single-stepping fixes, improved tracing, various timer and vGIC
 fixes
 
 * x86: Processor Tracing virtualization, STIBP support, some correctness fixes,
 refactorings and splitting of vmx.c, use the Hyper-V range TLB flush hypercall,
 reduce order of vcpu struct, WBNOINVD support, do not use -ftrace for __noclone
 functions, nested guest support for PAUSE filtering on AMD, more Hyper-V
 enlightenments (direct mode for synthetic timers)
 
 * PPC: nested VFIO
 
 * s390: bugfixes only this time
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJcH0vFAAoJEL/70l94x66Dw/wH/2FZp1YOM5OgiJzgqnXyDbyf
 dNEfWo472MtNiLsuf+ZAfJojVIu9cv7wtBfXNzW+75XZDfh/J88geHWNSiZDm3Fe
 aM4MOnGG0yF3hQrRQyEHe4IFhGFNERax8Ccv+OL44md9CjYrIrsGkRD08qwb+gNh
 P8T/3wJEKwUcVHA/1VHEIM8MlirxNENc78p6JKd/C7zb0emjGavdIpWFUMr3SNfs
 CemabhJUuwOYtwjRInyx1y34FzYwW3Ejuc9a9UoZ+COahUfkuxHE8u+EQS7vLVF6
 2VGVu5SA0PqgmLlGhHthxLqVgQYo+dB22cRnsLtXlUChtVAq8q9uu5sKzvqEzuE=
 =b4Jx
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:
   - selftests improvements
   - large PUD support for HugeTLB
   - single-stepping fixes
   - improved tracing
   - various timer and vGIC fixes

  x86:
   - Processor Tracing virtualization
   - STIBP support
   - some correctness fixes
   - refactorings and splitting of vmx.c
   - use the Hyper-V range TLB flush hypercall
   - reduce order of vcpu struct
   - WBNOINVD support
   - do not use -ftrace for __noclone functions
   - nested guest support for PAUSE filtering on AMD
   - more Hyper-V enlightenments (direct mode for synthetic timers)

  PPC:
   -  nested VFIO

  s390:
   - bugfixes only this time"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits)
  KVM: x86: Add CPUID support for new instruction WBNOINVD
  kvm: selftests: ucall: fix exit mmio address guessing
  Revert "compiler-gcc: disable -ftracer for __noclone functions"
  KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines
  KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobs
  KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixup
  MAINTAINERS: Add arch/x86/kvm sub-directories to existing KVM/x86 entry
  KVM/x86: Use SVM assembly instruction mnemonics instead of .byte streams
  KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()
  KVM/MMU: Flush tlb directly in kvm_set_pte_rmapp()
  KVM/MMU: Move tlb flush in kvm_set_pte_rmapp() to kvm_mmu_notifier_change_pte()
  KVM: Make kvm_set_spte_hva() return int
  KVM: Replace old tlb flush function with new one to flush a specified range.
  KVM/MMU: Add tlb flush with range helper function
  KVM/VMX: Add hv tlb range flush support
  x86/hyper-v: Add HvFlushGuestAddressList hypercall support
  KVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops
  KVM: x86: Disable Intel PT when VMXON in L1 guest
  KVM: x86: Set intercept for Intel PT MSRs read/write
  KVM: x86: Implement Intel PT MSRs read/write emulation
  ...
2018-12-26 11:46:28 -08:00
Linus Torvalds
5694cecdb0 arm64 festive updates for 4.21
In the end, we ended up with quite a lot more than I expected:
 
 - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and
   kernel-side support to come later)
 
 - Support for per-thread stack canaries, pending an update to GCC that
   is currently undergoing review
 
 - Support for kexec_file_load(), which permits secure boot of a kexec
   payload but also happens to improve the performance of kexec
   dramatically because we can avoid the sucky purgatory code from
   userspace. Kdump will come later (requires updates to libfdt).
 
 - Optimisation of our dynamic CPU feature framework, so that all
   detected features are enabled via a single stop_machine() invocation
 
 - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that
   they can benefit from global TLB entries when KASLR is not in use
 
 - 52-bit virtual addressing for userspace (kernel remains 48-bit)
 
 - Patch in LSE atomics for per-cpu atomic operations
 
 - Custom preempt.h implementation to avoid unconditional calls to
   preempt_schedule() from preempt_enable()
 
 - Support for the new 'SB' Speculation Barrier instruction
 
 - Vectorised implementation of XOR checksumming and CRC32 optimisations
 
 - Workaround for Cortex-A76 erratum #1165522
 
 - Improved compatibility with Clang/LLD
 
 - Support for TX2 system PMUS for profiling the L3 cache and DMC
 
 - Reflect read-only permissions in the linear map by default
 
 - Ensure MMIO reads are ordered with subsequent calls to Xdelay()
 
 - Initial support for memory hotplug
 
 - Tweak the threshold when we invalidate the TLB by-ASID, so that
   mremap() performance is improved for ranges spanning multiple PMDs.
 
 - Minor refactoring and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJcE4TmAAoJELescNyEwWM0Nr0H/iaU7/wQSzHyNXtZoImyKTul
 Blu2ga4/EqUrTU7AVVfmkl/3NBILWlgQVpY6tH6EfXQuvnxqD7CizbHyLdyO+z0S
 B5PsFUH2GLMNAi48AUNqGqkgb2knFbg+T+9IimijDBkKg1G/KhQnRg6bXX32mLJv
 Une8oshUPBVJMsHN1AcQknzKariuoE3u0SgJ+eOZ9yA2ZwKxP4yy1SkDt3xQrtI0
 lojeRjxcyjTP1oGRNZC+BWUtGOT35p7y6cGTnBd/4TlqBGz5wVAJUcdoxnZ6JYVR
 O8+ob9zU+4I0+SKt80s7pTLqQiL9rxkKZ5joWK1pr1g9e0s5N5yoETXKFHgJYP8=
 =sYdt
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 festive updates from Will Deacon:
 "In the end, we ended up with quite a lot more than I expected:

   - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and
     kernel-side support to come later)

   - Support for per-thread stack canaries, pending an update to GCC
     that is currently undergoing review

   - Support for kexec_file_load(), which permits secure boot of a kexec
     payload but also happens to improve the performance of kexec
     dramatically because we can avoid the sucky purgatory code from
     userspace. Kdump will come later (requires updates to libfdt).

   - Optimisation of our dynamic CPU feature framework, so that all
     detected features are enabled via a single stop_machine()
     invocation

   - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that
     they can benefit from global TLB entries when KASLR is not in use

   - 52-bit virtual addressing for userspace (kernel remains 48-bit)

   - Patch in LSE atomics for per-cpu atomic operations

   - Custom preempt.h implementation to avoid unconditional calls to
     preempt_schedule() from preempt_enable()

   - Support for the new 'SB' Speculation Barrier instruction

   - Vectorised implementation of XOR checksumming and CRC32
     optimisations

   - Workaround for Cortex-A76 erratum #1165522

   - Improved compatibility with Clang/LLD

   - Support for TX2 system PMUS for profiling the L3 cache and DMC

   - Reflect read-only permissions in the linear map by default

   - Ensure MMIO reads are ordered with subsequent calls to Xdelay()

   - Initial support for memory hotplug

   - Tweak the threshold when we invalidate the TLB by-ASID, so that
     mremap() performance is improved for ranges spanning multiple PMDs.

   - Minor refactoring and cleanups"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (125 commits)
  arm64: kaslr: print PHYS_OFFSET in dump_kernel_offset()
  arm64: sysreg: Use _BITUL() when defining register bits
  arm64: cpufeature: Rework ptr auth hwcaps using multi_entry_cap_matches
  arm64: cpufeature: Reduce number of pointer auth CPU caps from 6 to 4
  arm64: docs: document pointer authentication
  arm64: ptr auth: Move per-thread keys from thread_info to thread_struct
  arm64: enable pointer authentication
  arm64: add prctl control for resetting ptrauth keys
  arm64: perf: strip PAC when unwinding userspace
  arm64: expose user PAC bit positions via ptrace
  arm64: add basic pointer authentication support
  arm64/cpufeature: detect pointer authentication
  arm64: Don't trap host pointer auth use to EL2
  arm64/kvm: hide ptrauth from guests
  arm64/kvm: consistently handle host HCR_EL2 flags
  arm64: add pointer authentication register bits
  arm64: add comments about EC exception levels
  arm64: perf: Treat EXCLUDE_EL* bit definitions as unsigned
  arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field
  arm64: enable per-task stack canaries
  ...
2018-12-25 17:41:56 -08:00
Masahiro Yamada
8636a1f967 treewide: surround Kconfig file paths with double quotes
The Kconfig lexer supports special characters such as '.' and '/' in
the parameter context. In my understanding, the reason is just to
support bare file paths in the source statement.

I do not see a good reason to complicate Kconfig for the room of
ambiguity.

The majority of code already surrounds file paths with double quotes,
and it makes sense since file paths are constant string literals.

Make it treewide consistent now.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
2018-12-22 00:25:54 +09:00
Christoph Hellwig
518a2f1925 dma-mapping: zero memory returned from dma_alloc_*
If we want to map memory from the DMA allocator to userspace it must be
zeroed at allocation time to prevent stale data leaks.   We already do
this on most common architectures, but some architectures don't do this
yet, fix them up, either by passing GFP_ZERO when we use the normal page
allocator or doing a manual memset otherwise.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Sam Ravnborg <sam@ravnborg.org> [sparc]
2018-12-20 08:13:52 +01:00
Paolo Bonzini
e9f2e05a5f KVM: s390: Fixes for 4.21
Just two small fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJcGgnhAAoJEBF7vIC1phx8kfgP/iiXHJo94IK/rqXbYFGnw259
 ehRaWmzXJAdU6G7RgaAqyNkEudOjIoPx9QDe0WRl/vRkAiQ2iejoovGFfa/wDkV4
 N4uCKdKJ8U39ixonC7/b90798p+Fgc1MfNHtrsvgjj9d4kjzx6L0Qq9G+8t9EcU+
 BJZNuK6L2+AY/o/yysVTCp5yI/Pqf0vtKrtglsGe7Eg1FES8MWR3A0OIeOar5Bcq
 uGFIUhEy2tHDNFYSdrmKCF4DGkJ+RmBgAEq/Lp2RqChD00CfVE/pHNZfQHGXmPFA
 MuWvUohuhhF7Ly3OrQKNdILqxQkqUov3pNeWSzTb4Awy/GY3F1j9K4ysF4/uQLFr
 97kjySVUpK1qhDVVS2lGZp1gOAmjByVfw9j7/Jq+MPDsHmNRISTfbCjdkzyhHxcd
 joPS9/StC1r/kFN9pyfDr+S+8KgG4jx5Jk6Jjwt+BOUi2pummP9UcrAfxQd+6QKZ
 3s2qrgAbkaJfYXpTqEw0WkxncYsNC+WVL3tmL7IQdBo6C+rPtUPpiSgT4Mbwy9Tk
 s7KGX9u33mDuw4vvz3LFZcgcXdM+hItzsHsE/l8PFOea5jqKIvyyuaK9zjGS25b1
 VTP/2RckdopTHEy+iFz3tmRzHB2n36U3cEeOCow3/wDzEbJKy2qK7SeIUTuQloyG
 ZZChydpdoc3I/5m6ecCc
 =GVCX
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fixes for 4.21

Just two small fixes.
2018-12-19 22:17:09 +01:00
Michael Mueller
7aedd9d48f KVM: s390: fix kmsg component kvm-s390
Relocate #define statement for kvm related kernel messages
before the include of printk to become effective.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-12-18 10:18:27 +01:00
Michael Mueller
308c3e6673 KVM: s390: unregister debug feature on failing arch init
Make sure the debug feature and its allocated resources get
released upon unsuccessful architecture initialization.

A related indication of the issue will be reported as kernel
message.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181130143215.69496-2-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-12-18 10:18:27 +01:00
Paolo Bonzini
e5d83c74a5 kvm: make KVM_CAP_ENABLE_CAP_VM architecture agnostic
The first such capability to be handled in virt/kvm/ will be manual
dirty page reprotection.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-14 12:34:18 +01:00
Tycho Andersen
4fc96ee908 seccomp, s390: fix build for syscall type change
A recent patch landed in the security tree [1] that changed the type of the
seccomp syscall. Unfortunately, I didn't quite get every instance of the
forward declarations, and thus there is a build failure. Here's the last
one that I could find, for s390. It should go through the security tree,
although hopefully some s390 people can check and make sure it looks
reasonable?

The only oddity is the trailing semicolon; some lines around this patch
have it, and some lines don't. I've left this one as-is.

[1]: https://lore.kernel.org/lkml/20181212231630.GA31584@beast/T/#u

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Fixes: 6a21cc50f0 ("seccomp: add a return code to trap to userspace")
Signed-off-by: Kees Cook <keescook@chromium.org>
2018-12-13 16:51:01 -08:00
Christoph Hellwig
3731c3d477 dma-mapping: always build the direct mapping code
All architectures except for sparc64 use the dma-direct code in some
form, and even for sparc64 we had the discussion of a direct mapping
mode a while ago.  In preparation for directly calling the direct
mapping code don't bother having it optionally but always build the
code in.  This is a minor hardship for some powerpc and arm configs
that don't pull it in yet (although they should in a relase ot two),
and sparc64 which currently doesn't need it at all, but it will
reduce the ifdef mess we'd otherwise need significantly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
2018-12-13 21:06:11 +01:00
Sebastian Ott
98dfd32620 s390/pci: fix sleeping in atomic during hotplug
When triggered by pci hotplug (PEC 0x306) clp_get_state is called
with spinlocks held resulting in the following warning:

zpci: n/a: Event 0x306 reconfigured PCI function 0x0
BUG: sleeping function called from invalid context at mm/page_alloc.c:4324
in_atomic(): 1, irqs_disabled(): 0, pid: 98, name: kmcheck
2 locks held by kmcheck/98:

Change the allocation to use GFP_ATOMIC.

Cc: stable@vger.kernel.org # 4.13+
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-12-13 10:42:26 +01:00
Sebastian Ott
9594ca6b87 s390/pci: remove bit_lock usage in interrupt handler
The interrupt handler uses bit_spin_lock around a call to retrieve
per irq data (the irq number). However this per irq data is only
set during irq setup time and never changed until the irq is freed.

Thus it's safe to remove the lock usage.

Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-12-13 10:42:25 +01:00
David S. Miller
addb067983 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-12-11

The following pull-request contains BPF updates for your *net-next* tree.

It has three minor merge conflicts, resolutions:

1) tools/testing/selftests/bpf/test_verifier.c

 Take first chunk with alignment_prevented_execution.

2) net/core/filter.c

  [...]
  case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
  case bpf_ctx_range(struct __sk_buff, wire_len):
        return false;
  [...]

3) include/uapi/linux/bpf.h

  Take the second chunk for the two cases each.

The main changes are:

1) Add support for BPF line info via BTF and extend libbpf as well
   as bpftool's program dump to annotate output with BPF C code to
   facilitate debugging and introspection, from Martin.

2) Add support for BPF_ALU | BPF_ARSH | BPF_{K,X} in interpreter
   and all JIT backends, from Jiong.

3) Improve BPF test coverage on archs with no efficient unaligned
   access by adding an "any alignment" flag to the BPF program load
   to forcefully disable verifier alignment checks, from David.

4) Add a new bpf_prog_test_run_xattr() API to libbpf which allows for
   proper use of BPF_PROG_TEST_RUN with data_out, from Lorenz.

5) Extend tc BPF programs to use a new __sk_buff field called wire_len
   for more accurate accounting of packets going to wire, from Petar.

6) Improve bpftool to allow dumping the trace pipe from it and add
   several improvements in bash completion and map/prog dump,
   from Quentin.

7) Optimize arm64 BPF JIT to always emit movn/movk/movk sequence for
   kernel addresses and add a dedicated BPF JIT backend allocator,
   from Ard.

8) Add a BPF helper function for IR remotes to report mouse movements,
   from Sean.

9) Various cleanups in BPF prog dump e.g. to make UAPI bpf_prog_info
   member naming consistent with existing conventions, from Yonghong
   and Song.

10) Misc cleanups and improvements in allowing to pass interface name
    via cmdline for xdp1 BPF example, from Matteo.

11) Fix a potential segfault in BPF sample loader's kprobes handling,
    from Daniel T.

12) Fix SPDX license in libbpf's README.rst, from Andrey.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-10 18:00:43 -08:00
Will Deacon
d34664f63b Merge branch 'for-next/kexec' into aarch64/for-next/core
Merge in kexec_file_load() support from Akashi Takahiro.
2018-12-10 18:57:17 +00:00
Jiong Wang
f860203b01 s390: bpf: implement jitting of BPF_ALU | BPF_ARSH | BPF_*
This patch implements code-gen for BPF_ALU | BPF_ARSH | BPF_*.

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-07 13:30:48 -08:00
Will Deacon
08861d33d6 preempt: Move PREEMPT_NEED_RESCHED definition into arch code
PREEMPT_NEED_RESCHED is never used directly, so move it into the arch
code where it can potentially be implemented using either a different
bit in the preempt count or as an entirely separate entity.

Cc: Robert Love <rml@tech9.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-07 12:35:46 +00:00
Christoph Hellwig
7c703e54cc arch: switch the default on ARCH_HAS_SG_CHAIN
These days architectures are mostly out of the business of dealing with
struct scatterlist at all, unless they have architecture specific iommu
drivers.  Replace the ARCH_HAS_SG_CHAIN symbol with a ARCH_NO_SG_CHAIN
one only enabled for architectures with horrible legacy iommu drivers
like alpha and parisc, and conditionally for arm which wants to keep it
disable for legacy platforms.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
2018-12-06 07:04:56 -08:00
Christoph Hellwig
44899aa31f s390: remove the mapping_error dma_map_ops method
S390 already returns (~(dma_addr_t)0x0) on mapping failures, so we can
switch over to returning DMA_MAPPING_ERROR and let the core dma-mapping
code handle the rest.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-06 06:56:39 -08:00
AKASHI Takahiro
b6664ba42f s390, kexec_file: drop arch_kexec_mem_walk()
Since s390 already knows where to locate buffers, calling
arch_kexec_mem_walk() has no sense. So we can just drop it as kbuf->mem
indicates this while all other architectures sets it to 0 initially.

This change is a preparatory work for the next patch, where all the
variant memory walks, either on system resource or memblock, will be
put in one common place so that it will satisfy all the architectures'
need.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 14:38:49 +00:00
Ingo Molnar
4bbfd7467c Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU changes from Paul E. McKenney:

- Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar.

- Replace calls of RCU-bh and RCU-sched update-side functions
  to their vanilla RCU counterparts.  This series is a step
  towards complete removal of the RCU-bh and RCU-sched update-side
  functions.

  ( Note that some of these conversions are going upstream via their
    respective maintainers. )

- Documentation updates, including a number of flavor-consolidation
  updates from Joel Fernandes.

- Miscellaneous fixes.

- Automate generation of the initrd filesystem used for
  rcutorture testing.

- Convert spin_is_locked() assertions to instead use lockdep.

  ( Note that some of these conversions are going upstream via their
    respective maintainers. )

- SRCU updates, especially including a fix from Dennis Krein
  for a bag-on-head-class bug.

- RCU torture-test updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-04 07:52:30 +01:00
Linus Torvalds
0f1f692375 While rewriting the function graph tracer, I discovered a design flaw that
was introduced by a patch that tried to fix one bug, but by doing so created
 another bug. As both bugs corrupt the output (but they do not crash the
 kernel), I decided to fix the design such that it could have both bugs
 fixed. The original fix, fixed time reporting of the function graph tracer
 when doing a max_depth of one. This was code that can test how much the
 kernel interferes with userspace. But in doing so, it could corrupt the time
 keeping of the function profiler.
 
 The issue is that the curr_ret_stack variable was being used for two
 different meanings. One was to keep track of the stack pointer on the
 ret_stack (shadow stack used by the function graph tracer), and the other
 use case was the graph call depth.  Although, the two may be closely
 related, where they got updated was the issue that lead to the two different
 bugs that required the two use cases to be updated differently.
 
 The big issue with this fix is that it requires changing each architecture.
 The good news is, I was able to remove a lot of code that was duplicated
 within the architectures and place it into a single location. Then I could
 make the fix in one place.
 
 I pushed this code into linux-next to let it settle over a week, and before
 doing so, I cross compiled all the affected architectures to make sure that
 they built fine.
 
 In the mean time, I also pulled in a patch that fixes the sched_switch
 previous tasks state output, that was not actually correct.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCW/4NPhQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qnWAAQCyUIRLgYImr81eTl52lxNRsULk+aiI
 U29kRFWWU0c40AEA1X9sDF0MgOItbRGfZtnHTZEousXRDaDf4Fge2kF7Egg=
 =liQ0
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "While rewriting the function graph tracer, I discovered a design flaw
  that was introduced by a patch that tried to fix one bug, but by doing
  so created another bug.

  As both bugs corrupt the output (but they do not crash the kernel), I
  decided to fix the design such that it could have both bugs fixed. The
  original fix, fixed time reporting of the function graph tracer when
  doing a max_depth of one. This was code that can test how much the
  kernel interferes with userspace. But in doing so, it could corrupt
  the time keeping of the function profiler.

  The issue is that the curr_ret_stack variable was being used for two
  different meanings. One was to keep track of the stack pointer on the
  ret_stack (shadow stack used by the function graph tracer), and the
  other use case was the graph call depth. Although, the two may be
  closely related, where they got updated was the issue that lead to the
  two different bugs that required the two use cases to be updated
  differently.

  The big issue with this fix is that it requires changing each
  architecture. The good news is, I was able to remove a lot of code
  that was duplicated within the architectures and place it into a
  single location. Then I could make the fix in one place.

  I pushed this code into linux-next to let it settle over a week, and
  before doing so, I cross compiled all the affected architectures to
  make sure that they built fine.

  In the mean time, I also pulled in a patch that fixes the sched_switch
  previous tasks state output, that was not actually correct"

* tag 'trace-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  sched, trace: Fix prev_state output in sched_switch tracepoint
  function_graph: Have profiler use curr_ret_stack and not depth
  function_graph: Reverse the order of pushing the ret_stack and the callback
  function_graph: Move return callback before update of curr_ret_stack
  function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack
  function_graph: Make ftrace_push_return_trace() static
  sparc/function_graph: Simplify with function_graph_enter()
  sh/function_graph: Simplify with function_graph_enter()
  s390/function_graph: Simplify with function_graph_enter()
  riscv/function_graph: Simplify with function_graph_enter()
  powerpc/function_graph: Simplify with function_graph_enter()
  parisc: function_graph: Simplify with function_graph_enter()
  nds32: function_graph: Simplify with function_graph_enter()
  MIPS: function_graph: Simplify with function_graph_enter()
  microblaze: function_graph: Simplify with function_graph_enter()
  arm64: function_graph: Simplify with function_graph_enter()
  ARM: function_graph: Simplify with function_graph_enter()
  x86/function_graph: Simplify with function_graph_enter()
  function_graph: Create function_graph_enter() to consolidate architecture code
2018-11-30 09:32:34 -08:00
Sergey Senozhatsky
5b39fc049c s390: use common bust_spinlocks()
s390 is the only architecture that is using own bust_spinlocks()
variant, while other arch-s seem to be OK with the common
implementation.

Heiko Carstens [1] said he would prefer s390 to use the common
bust_spinlocks() as well:
  I did some code archaeology and this function is unchanged since ~17
  years. When it was introduced it was close to identical to the x86
  variant. All other architectures use the common code variant in the
  meantime. So if we change this I'd prefer that we switch s390 to the
  common code variant as well. Right now I can't see a reason for not
  doing that

This patch removes s390 bust_spinlocks() and drops the weak attribute
from the common bust_spinlocks() version.

[1] lkml.kernel.org/r/20181025062800.GB4037@osiris
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-30 07:22:05 +01:00
Harald Freudenberger
be53479101 s390/zcrypt: improve special ap message cmd handling
There exist very few ap messages which need to have the 'special' flag
enabled. This flag tells the firmware layer to do some pre- and maybe
postprocessing. However, it may happen that this special flag is
enabled but the firmware is unable to deal with this kind of message
and thus returns with reply code 0x41. For example older firmware may
not know the newest messages triggered by the zcrypt device driver and
thus react with reject and the named reply code. Unfortunately this
reply code is not known to the zcrypt error routines and thus default
behavior is to switch the ap queue offline.

This patch now makes the ap error routine aware of the reply code and
so userspace is informed about the bad processing result but the queue
is not switched to offline state any more.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-30 07:22:05 +01:00
Harald Freudenberger
159491f3b5 s390/ap: rework assembler functions to use unions for in/out register variables
The inline assembler functions ap_aqic() and ap_qact() used two
variables declared on the very same register. One variable was for
input only, the other for output. Looks like newer versions of the gcc
don't like this. Anyway it is a better coding to use one variable
(which may have a union data type) on one register for input and
output. So this patch introduces unions and uses only one variable now
for input and output for GR1 for the PQAP(QACT) and PQAP(QIC)
invocation.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-30 07:22:05 +01:00
Masahiro Yamada
5cfc879cae pcmcia: remove per-arch PCMCIA config entry
Now that all architectures include drivers/pcmcia/Kconfig where
the PCMCIA config is defined, the PCMCIA config entries in per-arch
Kconfig files are redundant.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-11-29 09:39:52 +09:00
Steven Rostedt (VMware)
18588e1487 s390/function_graph: Simplify with function_graph_enter()
The function_graph_enter() function does the work of calling the function
graph hook function and the management of the shadow stack, simplifying the
work done in the architecture dependent prepare_ftrace_return().

Have s390 use the new code, and remove the shadow stack management as well as
having to set up the trace structure.

This is needed to prepare for a fix of a design bug on how the curr_ret_stack
is used.

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Julian Wiedmann <jwi@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: stable@kernel.org
Fixes: 03274a3ffb ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-11-27 20:31:30 -05:00
Martin Schwidefsky
814cedbc0b s390/mm: correct pgtable_bytes on page table downgrade
The downgrade of a page table from 3 levels to 2 levels for a 31-bit compat
process removes a pmd table which has to be counted against pgtable_bytes.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-27 14:07:12 +01:00
Christoph Hellwig
2eac9c2dfb PCI: consolidate the PCI_DOMAINS and PCI_DOMAINS_GENERIC config options
Move the definitions to drivers/pci and let the architectures select
them.  Two small differences to before: PCI_DOMAINS_GENERIC now selects
PCI_DOMAINS, cutting down the churn for modern architectures.  As the
only architectured arm did previously also offer PCI_DOMAINS as a user
visible choice in addition to selecting it from the relevant configs,
this is gone now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-11-23 11:45:44 +09:00
Christoph Hellwig
eb01d42a77 PCI: consolidate PCI config entry in drivers/pci
There is no good reason to duplicate the PCI menu in every architecture.
Instead provide a selectable HAVE_PCI symbol that indicates availability
of PCI support, and a FORCE_PCI symbol to for PCI on and the handle the
rest in drivers/pci.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-11-23 11:45:34 +09:00
Eric Biggers
1ad0f1603a crypto: drop mask=CRYPTO_ALG_ASYNC from 'cipher' tfm allocations
'cipher' algorithms (single block ciphers) are always synchronous, so
passing CRYPTO_ALG_ASYNC in the mask to crypto_alloc_cipher() has no
effect.  Many users therefore already don't pass it, but some still do.
This inconsistency can cause confusion, especially since the way the
'mask' argument works is somewhat counterintuitive.

Thus, just remove the unneeded CRYPTO_ALG_ASYNC flags.

This patch shouldn't change any actual behavior.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-11-20 14:26:55 +08:00
Thomas Richter
613a41b0d1 s390/cpum_cf: Reject request for sampling in event initialization
On s390 command perf top fails
[root@s35lp76 perf] # ./perf top -F100000  --stdio
   Error:
   cycles: PMU Hardware doesn't support sampling/overflow-interrupts.
   	Try 'perf stat'
[root@s35lp76 perf] #

Using event -e rb0000 works as designed.  Event rb0000 is the event
number of the sampling facility for basic sampling.

During system start up the following PMUs are installed in the kernel's
PMU list (from head to tail):
   cpum_cf --> s390 PMU counter facility device driver
   cpum_sf --> s390 PMU sampling facility device driver
   uprobe
   kprobe
   tracepoint
   task_clock
   cpu_clock

Perf top executes following functions and calls perf_event_open(2) system
call with different parameters many times:

cmd_top
--> __cmd_top
    --> perf_evlist__add_default
        --> __perf_evlist__add_default
            --> perf_evlist__new_cycles (creates event type:0 (HW)
			    		config 0 (CPU_CYCLES)
	        --> perf_event_attr__set_max_precise_ip
		    Uses perf_event_open(2) to detect correct
		    precise_ip level. Fails 3 times on s390 which is ok.

Then functions cmd_top
--> __cmd_top
    --> perf_top__start_counters
        -->perf_evlist__config
	   --> perf_can_comm_exec
               --> perf_probe_api
	           This functions test support for the following events:
		   "cycles:u", "instructions:u", "cpu-clock:u" using
		   --> perf_do_probe_api
		       --> perf_event_open_cloexec
		           Test the close on exec flag support with
			   perf_event_open(2).
	               perf_do_probe_api returns true if the event is
		       supported.
		       The function returns true because event cpu-clock is
		       supported by the PMU cpu_clock.
	               This is achieved by many calls to perf_event_open(2).

Function perf_top__start_counters now calls perf_evsel__open() for every
event, which is the default event cpu_cycles (config:0) and type HARDWARE
(type:0) which a predfined frequence of 4000.

Given the above order of the PMU list, the PMU cpum_cf gets called first
and returns 0, which indicates support for this sampling. The event is
fully allocated in the function perf_event_open (file kernel/event/core.c
near line 10521 and the following check fails:

        event = perf_event_alloc(&attr, cpu, task, group_leader, NULL,
		                 NULL, NULL, cgroup_fd);
	if (IS_ERR(event)) {
		err = PTR_ERR(event);
		goto err_cred;
	}

        if (is_sampling_event(event)) {
		if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) {
			err = -EOPNOTSUPP;
			goto err_alloc;
		}
	}

The check for the interrupt capabilities fails and the system call
perf_event_open() returns -EOPNOTSUPP (-95).

Add a check to return -ENODEV when sampling is requested in PMU cpum_cf.
This allows common kernel code in the perf_event_open() system call to
test the next PMU in above list.

Fixes: 97b1198fec (" "s390, perf: Use common PMU interrupt disabled code")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-14 13:44:55 +01:00
Linus Torvalds
3541833fd1 s390 updates for 4.20-rc2
- A fix for the pgtable_bytes misaccounting on s390. The patch changes
    common code part in regard to page table folding and adds extra
    checks to mm_[inc|dec]_nr_[pmds|puds].
 
  - Add FORCE for all build targets using if_changed
 
  - Use non-loadable phdr for the .vmlinux.info section to avoid
    a segment overlap that confuses kexec
 
  - Cleanup the attribute definition for the diagnostic sampling
 
  - Increase stack size for CONFIG_KASAN=y builds
 
  - Export __node_distance to fix a build error
 
  - Correct return code of a PMU event init function
 
  - An update for the default configs
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJb5TIMAAoJEDjwexyKj9rgIH8H/0daZTyxcLwY9gbigaq1Qs4R
 /ScmAJJc2U/Qj8b9UskhsmHAUuAufF2oljU16SquP7CBGhtkLRrjPtdh1AMiiZGM
 reVF7X5LU8MH0QUoNnKPWAL4DD1q2E99IAEH5TeGIODUG6srqvIHBNtXDWNLPtBf
 fpOhJ/NssgxyuYUXi/WnoEjIyP8KABeG6SlpcLzYbmY1hUOIXcixuv39UrL0G5OO
 P8ciL+W5rTcPZCnpJ1Xk9hKploT8gWXhMT5QhNnakgMF/25v80+TZy5xRZMuLAmQ
 T5SFP6B71o05nLK7fLi3VAIKPv/QibjiyJOEf9uUHdo1XZcD5uRu0EQ/LklLUBU=
 =4H06
 -----END PGP SIGNATURE-----

Merge tag 's390-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Martin Schwidefsky:

 - A fix for the pgtable_bytes misaccounting on s390. The patch changes
   common code part in regard to page table folding and adds extra
   checks to mm_[inc|dec]_nr_[pmds|puds].

 - Add FORCE for all build targets using if_changed

 - Use non-loadable phdr for the .vmlinux.info section to avoid a
   segment overlap that confuses kexec

 - Cleanup the attribute definition for the diagnostic sampling

 - Increase stack size for CONFIG_KASAN=y builds

 - Export __node_distance to fix a build error

 - Correct return code of a PMU event init function

 - An update for the default configs

* tag 's390-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/perf: Change CPUM_CF return code in event init function
  s390: update defconfigs
  s390/mm: Fix ERROR: "__node_distance" undefined!
  s390/kasan: increase instrumented stack size to 64k
  s390/cpum_sf: Rework attribute definition for diagnostic sampling
  s390/mm: fix mis-accounting of pgtable_bytes
  mm: add mm_pxd_folded checks to pgtable_bytes accounting functions
  mm: introduce mm_[p4d|pud|pmd]_folded
  mm: make the __PAGETABLE_PxD_FOLDED defines non-empty
  s390: avoid vmlinux segments overlap
  s390/vdso: add missing FORCE to build targets
  s390/decompressor: add missing FORCE to build targets
2018-11-09 06:30:44 -06:00
Paul E. McKenney
0d4e68e2f3 s390/mm: Convert tlb_table_flush() to use call_rcu()
Now that call_rcu()'s callback is not invoked until after all
preempt-disable regions of code have completed (in addition to explicitly
marked RCU read-side critical sections), call_rcu() can be used in place
of call_rcu_sched().  This commit therefore makes that change.

Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <linux-s390@vger.kernel.org>
2018-11-08 21:43:20 -08:00
Thomas Richter
0bb2ae1b26 s390/perf: Change CPUM_CF return code in event init function
The function perf_init_event() creates a new event and
assignes it to a PMU. This a done in a loop over all existing
PMUs. For each listed PMU the event init function is called
and if this function does return any other error than -ENOENT,
the loop is terminated the creation of the event fails.

If the event is invalid, return -ENOENT to try other PMUs.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-08 07:58:16 +01:00
Martin Schwidefsky
163c8d54a9 compiler: remove __no_sanitize_address_or_inline again
The __no_sanitize_address_or_inline and __no_kasan_or_inline defines
are almost identical. The only difference is that __no_kasan_or_inline
does not have the 'notrace' attribute.

To be able to replace __no_sanitize_address_or_inline with the older
definition, add 'notrace' to __no_kasan_or_inline and change to two
users of __no_sanitize_address_or_inline in the s390 code.

The 'notrace' option is necessary for e.g. the __load_psw_mask function
in arch/s390/include/asm/processor.h. Without the option it is possible
to trace __load_psw_mask which leads to kernel stack overflow.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pointed-out-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-05 08:14:18 -08:00
Heiko Carstens
02522ad77f s390: update defconfigs
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-05 15:10:27 +01:00
Justin M. Forbes
a541f0ebcc s390/mm: Fix ERROR: "__node_distance" undefined!
Fixes:
ERROR: "__node_distance" [drivers/nvme/host/nvme-core.ko] undefined!
make[1]: *** [scripts/Makefile.modpost:92: __modpost] Error 1
make: *** [Makefile:1275: modules] Error 2
+ exit 1

Signed-off-by: Justin M. Forbes <jforbes@fedoraproject.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-02 08:31:57 +01:00
Vasily Gorbik
9fed920e68 s390/kasan: increase instrumented stack size to 64k
Increase kasan instrumented kernel stack size from 32k to 64k. Other
architectures seems to get away with just doubling kernel stack size under
kasan, but on s390 this appears to be not enough due to bigger frame size.
The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE
vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting
stack overflow is fs sync on xfs filesystem:

 #0 [9a0681e8]  704 bytes  check_usage at 34b1fc
 #1 [9a0684a8]  432 bytes  check_usage at 34c710
 #2 [9a068658]  1048 bytes  validate_chain at 35044a
 #3 [9a068a70]  312 bytes  __lock_acquire at 3559fe
 #4 [9a068ba8]  440 bytes  lock_acquire at 3576ee
 #5 [9a068d60]  104 bytes  _raw_spin_lock at 21b44e0
 #6 [9a068dc8]  1992 bytes  enqueue_entity at 2dbf72
 #7 [9a069590]  1496 bytes  enqueue_task_fair at 2df5f0
 #8 [9a069b68]  64 bytes  ttwu_do_activate at 28f438
 #9 [9a069ba8]  552 bytes  try_to_wake_up at 298c4c
 #10 [9a069dd0]  168 bytes  wake_up_worker at 23f97c
 #11 [9a069e78]  200 bytes  insert_work at 23fc2e
 #12 [9a069f40]  648 bytes  __queue_work at 2487c0
 #13 [9a06a1c8]  200 bytes  __queue_delayed_work at 24db28
 #14 [9a06a290]  248 bytes  mod_delayed_work_on at 24de84
 #15 [9a06a388]  24 bytes  kblockd_mod_delayed_work_on at 153e2a0
 #16 [9a06a3a0]  288 bytes  __blk_mq_delay_run_hw_queue at 158168c
 #17 [9a06a4c0]  192 bytes  blk_mq_run_hw_queue at 1581a3c
 #18 [9a06a580]  184 bytes  blk_mq_sched_insert_requests at 15a2192
 #19 [9a06a638]  1024 bytes  blk_mq_flush_plug_list at 1590f3a
 #20 [9a06aa38]  704 bytes  blk_flush_plug_list at 1555028
 #21 [9a06acf8]  320 bytes  schedule at 219e476
 #22 [9a06ae38]  760 bytes  schedule_timeout at 21b0aac
 #23 [9a06b130]  408 bytes  wait_for_common at 21a1706
 #24 [9a06b2c8]  360 bytes  xfs_buf_iowait at fa1540
 #25 [9a06b430]  256 bytes  __xfs_buf_submit at fadae6
 #26 [9a06b530]  264 bytes  xfs_buf_read_map at fae3f6
 #27 [9a06b638]  656 bytes  xfs_trans_read_buf_map at 10ac9a8
 #28 [9a06b8c8]  304 bytes  xfs_btree_kill_root at e72426
 #29 [9a06b9f8]  288 bytes  xfs_btree_lookup_get_block at e7bc5e
 #30 [9a06bb18]  624 bytes  xfs_btree_lookup at e7e1a6
 #31 [9a06bd88]  2664 bytes  xfs_alloc_ag_vextent_near at dfa070
 #32 [9a06c7f0]  144 bytes  xfs_alloc_ag_vextent at dff3ca
 #33 [9a06c880]  1128 bytes  xfs_alloc_vextent at e05fce
 #34 [9a06cce8]  584 bytes  xfs_bmap_btalloc at e58342
 #35 [9a06cf30]  1336 bytes  xfs_bmapi_write at e618de
 #36 [9a06d468]  776 bytes  xfs_iomap_write_allocate at ff678e
 #37 [9a06d770]  720 bytes  xfs_map_blocks at f82af8
 #38 [9a06da40]  928 bytes  xfs_writepage_map at f83cd6
 #39 [9a06dde0]  320 bytes  xfs_do_writepage at f85872
 #40 [9a06df20]  1320 bytes  write_cache_pages at 73dfe8
 #41 [9a06e448]  208 bytes  xfs_vm_writepages at f7f892
 #42 [9a06e518]  88 bytes  do_writepages at 73fe6a
 #43 [9a06e570]  872 bytes  __writeback_single_inode at a20cb6
 #44 [9a06e8d8]  664 bytes  writeback_sb_inodes at a23be2
 #45 [9a06eb70]  296 bytes  __writeback_inodes_wb at a242e0
 #46 [9a06ec98]  928 bytes  wb_writeback at a2500e
 #47 [9a06f038]  848 bytes  wb_do_writeback at a260ae
 #48 [9a06f388]  536 bytes  wb_workfn at a28228
 #49 [9a06f5a0]  1088 bytes  process_one_work at 24a234
 #50 [9a06f9e0]  1120 bytes  worker_thread at 24ba26
 #51 [9a06fe40]  104 bytes  kthread at 26545a
 #52 [9a06fea8]             kernel_thread_starter at 21b6b62

To be able to increase the stack size to 64k reuse LLILL instruction
in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE
(65192) value as unsigned.

Reported-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-02 08:31:57 +01:00
Thomas Richter
c43e1c5a80 s390/cpum_sf: Rework attribute definition for diagnostic sampling
Previously, the attribute entry for diagnostic sampling was added
if authorized.  Otherwise, the array of struct attribute contains
two NULL values.

Change this logic and reserve space for the attribute for diagnostic
sampling. If diagnostic sampling is authorized, add an entry in the
respective position in the array of struct attribute.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Suggested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-02 08:31:56 +01:00
Martin Schwidefsky
e12e4044ae s390/mm: fix mis-accounting of pgtable_bytes
In case a fork or a clone system fails in copy_process and the error
handling does the mmput() at the bad_fork_cleanup_mm label, the
following warning messages will appear on the console:

  BUG: non-zero pgtables_bytes on freeing mm: 16384

The reason for that is the tricks we play with mm_inc_nr_puds() and
mm_inc_nr_pmds() in init_new_context().

A normal 64-bit process has 3 levels of page table, the p4d level and
the pud level are folded. On process termination the free_pud_range()
function in mm/memory.c will subtract 16KB from pgtable_bytes with a
mm_dec_nr_puds() call, but there actually is not really a pud table.

One issue with this is the fact that pgtable_bytes is usually off
by a few kilobytes, but the more severe problem is that for a failed
fork or clone the free_pgtables() function is not called. In this case
there is no mm_dec_nr_puds() or mm_dec_nr_pmds() that go together with
the mm_inc_nr_puds() and mm_inc_nr_pmds in init_new_context().
The pgtable_bytes will be off by 16384 or 32768 bytes and we get the
BUG message. The message itself is purely cosmetic, but annoying.

To fix this override the mm_pmd_folded, mm_pud_folded and mm_p4d_folded
function to check for the true size of the address space.

Reported-by: Li Wang <liwang@redhat.com>
Tested-by: Li Wang <liwang@redhat.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-02 08:31:55 +01:00
Mike Rapoport
57c8a661d9 mm: remove include/linux/bootmem.h
Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>

@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
  Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:16 -07:00
Mike Rapoport
c6ffc5ca8f memblock: rename free_all_bootmem to memblock_free_all
The conversion is done using

sed -i 's@free_all_bootmem@memblock_free_all@' \
    $(git grep -l free_all_bootmem)

Link: http://lkml.kernel.org/r/1536927045-23536-26-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:16 -07:00
Mike Rapoport
eb31d559f1 memblock: remove _virt from APIs returning virtual address
The conversion is done using

sed -i 's@memblock_virt_alloc@memblock_alloc@g' \
	$(git grep -l memblock_virt_alloc)

Link: http://lkml.kernel.org/r/1536927045-23536-8-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:15 -07:00
Mike Rapoport
9a8dd708d5 memblock: rename memblock_alloc{_nid,_try_nid} to memblock_phys_alloc*
Make it explicit that the caller gets a physical address rather than a
virtual one.

This will also allow using meblock_alloc prefix for memblock allocations
returning virtual address, which is done in the following patches.

The conversion is done using the following semantic patch:

@@
expression e1, e2, e3;
@@
(
- memblock_alloc(e1, e2)
+ memblock_phys_alloc(e1, e2)
|
- memblock_alloc_nid(e1, e2, e3)
+ memblock_phys_alloc_nid(e1, e2, e3)
|
- memblock_alloc_try_nid(e1, e2, e3)
+ memblock_phys_alloc_try_nid(e1, e2, e3)
)

Link: http://lkml.kernel.org/r/1536927045-23536-7-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:15 -07:00
Mike Rapoport
aca52c3983 mm: remove CONFIG_HAVE_MEMBLOCK
All architecures use memblock for early memory management. There is no need
for the CONFIG_HAVE_MEMBLOCK configuration option.

[rppt@linux.vnet.ibm.com: of/fdt: fixup #ifdefs]
  Link: http://lkml.kernel.org/r/20180919103457.GA20545@rapoport-lnx
[rppt@linux.vnet.ibm.com: csky: fixups after bootmem removal]
  Link: http://lkml.kernel.org/r/20180926112744.GC4628@rapoport-lnx
[rppt@linux.vnet.ibm.com: remove stale #else and the code it protects]
  Link: http://lkml.kernel.org/r/1538067825-24835-1-git-send-email-rppt@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1536927045-23536-4-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:15 -07:00
Mike Rapoport
b4a991ec58 mm: remove CONFIG_NO_BOOTMEM
All achitectures select NO_BOOTMEM which essentially becomes 'Y' for any
kernel configuration and therefore it can be removed.

[alexander.h.duyck@linux.intel.com: remove now defunct NO_BOOTMEM from depends list for deferred init]
  Link: http://lkml.kernel.org/r/20180925201814.3576.15105.stgit@localhost.localdomain
Link: http://lkml.kernel.org/r/1536927045-23536-3-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:14 -07:00
Nick Desaulniers
de0d22e50c treewide: remove current_text_addr
Prefer _THIS_IP_ defined in linux/kernel.h.

Most definitions of current_text_addr were the same as _THIS_IP_, but
a few archs had inline assembly instead.

This patch removes the final call site of current_text_addr, making all
of the definitions dead code.

[akpm@linux-foundation.org: fix arch/csky/include/asm/processor.h]
Link: http://lkml.kernel.org/r/20180911182413.180715-1-ndesaulniers@google.com
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:12 -07:00
Johannes Weiner
8508cf3ffa sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD
There are several definitions of those functions/macros in places that
mess with fixed-point load averages.  Provide an official version.

[akpm@linux-foundation.org: fix missed conversion in block/blk-iolatency.c]
Link: http://lkml.kernel.org/r/20180828172258.3185-5-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Daniel Drake <drake@endlessm.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <jweiner@fb.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Enderborg <peter.enderborg@sony.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-26 16:26:32 -07:00
Vasily Gorbik
5a2e1853d6 s390: avoid vmlinux segments overlap
Currently .vmlinux.info section of uncompressed vmlinux elf image is
included into the data segment and load address specified as 0. That
extends data segment to address 0 and makes "text" and "data" segments
overlap.
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000001000 0x0000000000100000 0x0000000000100000
                 0x0000000000ead03c 0x0000000000ead03c  R E    0x1000
  LOAD           0x0000000000eaf000 0x0000000000000000 0x0000000000000000
                 0x0000000001a13400 0x000000000233b520  RWE    0x1000
  NOTE           0x0000000000eae000 0x0000000000fad000 0x0000000000fad000
                 0x000000000000003c 0x000000000000003c         0x4

 Section to Segment mapping:
  Segment Sections...
   00     .text .notes
   01     .rodata __ksymtab __ksymtab_gpl __ksymtab_strings __param
   __modver .data..ro_after_init __ex_table .data __bug_table .init.text
   .exit.text .exit.data .altinstructions .altinstr_replacement
   .nospec_call_table .nospec_return_table .boot.data .init.data
   .data..percpu .bss .vmlinux.info
   02     .notes

Later when vmlinux.bin is produced from vmlinux, .vmlinux.info section
is removed. But elf vmlinux file, even though it is not bootable anymore,
used for debugging and loadable segments overlap should be avoided.

Utilize special ":NONE" phdr specification to avoid adding .vmlinux.info
into loadable data segment. Also set .vmlinux.info section type to INFO,
which allows to get a not-loadable info CONTENTS section.

Since minimal supported version of binutils 2.20 does not have
--dump-section objcopy option, make .vmlinux.info section loadable during
info.bin creation to get actual section contents.

Reported-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-26 10:19:40 +02:00
Vasily Gorbik
b44b136a37 s390/vdso: add missing FORCE to build targets
According to Documentation/kbuild/makefiles.txt all build targets using
if_changed should use FORCE as well. Add missing FORCE to make sure
vdso targets are rebuild properly when not just immediate prerequisites
have changed but also when build command differs.

Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-26 10:19:40 +02:00
Vasily Gorbik
ef5febae15 s390/decompressor: add missing FORCE to build targets
According to Documentation/kbuild/makefiles.txt all build targets
using if_changed should use FORCE as well. Add missing FORCE to make
sure vmlinux decompressor targets are rebuild properly when not just
immediate prerequisites have changed but also when build command differs.

Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-26 10:19:40 +02:00
Linus Torvalds
0d1e8b8d2b KVM updates for v4.20
ARM:
  - Improved guest IPA space support (32 to 52 bits)
 
  - RAS event delivery for 32bit
 
  - PMU fixes
 
  - Guest entry hardening
 
  - Various cleanups
 
  - Port of dirty_log_test selftest
 
 PPC:
  - Nested HV KVM support for radix guests on POWER9.  The performance is
    much better than with PR KVM.  Migration and arbitrary level of
    nesting is supported.
 
  - Disable nested HV-KVM on early POWER9 chips that need a particular hardware
    bug workaround
 
  - One VM per core mode to prevent potential data leaks
 
  - PCI pass-through optimization
 
  - merge ppc-kvm topic branch and kvm-ppc-fixes to get a better base
 
 s390:
  - Initial version of AP crypto virtualization via vfio-mdev
 
  - Improvement for vfio-ap
 
  - Set the host program identifier
 
  - Optimize page table locking
 
 x86:
  - Enable nested virtualization by default
 
  - Implement Hyper-V IPI hypercalls
 
  - Improve #PF and #DB handling
 
  - Allow guests to use Enlightened VMCS
 
  - Add migration selftests for VMCS and Enlightened VMCS
 
  - Allow coalesced PIO accesses
 
  - Add an option to perform nested VMCS host state consistency check
    through hardware
 
  - Automatic tuning of lapic_timer_advance_ns
 
  - Many fixes, minor improvements, and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJb0FINAAoJEED/6hsPKofoI60IAJRS3vOAQ9Fav8cJsO1oBHcX
 3+NexfnBke1bzrjIR3SUcHKGZbdnVPNZc+Q4JjIbPpPmmOMU5jc9BC1dmd5f4Vzh
 BMnQ0yCvgFv3A3fy/Icx1Z8NJppxosdmqdQLrQrNo8aD3cjnqY2yQixdXrAfzLzw
 XEgKdIFCCz8oVN/C9TT4wwJn6l9OE7BM5bMKGFy5VNXzMu7t64UDOLbbjZxNgi1g
 teYvfVGdt5mH0N7b2GPPWRbJmgnz5ygVVpVNQUEFrdKZoCm6r5u9d19N+RRXAwan
 ZYFj10W2T8pJOUf3tryev4V33X7MRQitfJBo4tP5hZfi9uRX89np5zP1CFE7AtY=
 =yEPW
 -----END PGP SIGNATURE-----

Merge tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Radim Krčmář:
 "ARM:
   - Improved guest IPA space support (32 to 52 bits)

   - RAS event delivery for 32bit

   - PMU fixes

   - Guest entry hardening

   - Various cleanups

   - Port of dirty_log_test selftest

  PPC:
   - Nested HV KVM support for radix guests on POWER9. The performance
     is much better than with PR KVM. Migration and arbitrary level of
     nesting is supported.

   - Disable nested HV-KVM on early POWER9 chips that need a particular
     hardware bug workaround

   - One VM per core mode to prevent potential data leaks

   - PCI pass-through optimization

   - merge ppc-kvm topic branch and kvm-ppc-fixes to get a better base

  s390:
   - Initial version of AP crypto virtualization via vfio-mdev

   - Improvement for vfio-ap

   - Set the host program identifier

   - Optimize page table locking

  x86:
   - Enable nested virtualization by default

   - Implement Hyper-V IPI hypercalls

   - Improve #PF and #DB handling

   - Allow guests to use Enlightened VMCS

   - Add migration selftests for VMCS and Enlightened VMCS

   - Allow coalesced PIO accesses

   - Add an option to perform nested VMCS host state consistency check
     through hardware

   - Automatic tuning of lapic_timer_advance_ns

   - Many fixes, minor improvements, and cleanups"

* tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits)
  KVM/nVMX: Do not validate that posted_intr_desc_addr is page aligned
  Revert "kvm: x86: optimize dr6 restore"
  KVM: PPC: Optimize clearing TCEs for sparse tables
  x86/kvm/nVMX: tweak shadow fields
  selftests/kvm: add missing executables to .gitignore
  KVM: arm64: Safety check PSTATE when entering guest and handle IL
  KVM: PPC: Book3S HV: Don't use streamlined entry path on early POWER9 chips
  arm/arm64: KVM: Enable 32 bits kvm vcpu events support
  arm/arm64: KVM: Rename function kvm_arch_dev_ioctl_check_extension()
  KVM: arm64: Fix caching of host MDCR_EL2 value
  KVM: VMX: enable nested virtualization by default
  KVM/x86: Use 32bit xor to clear registers in svm.c
  kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD
  kvm: vmx: Defer setting of DR6 until #DB delivery
  kvm: x86: Defer setting of CR2 until #PF delivery
  kvm: x86: Add payload operands to kvm_multiple_exception
  kvm: x86: Add exception payload fields to kvm_vcpu_events
  kvm: x86: Add has_payload and payload to kvm_queued_exception
  KVM: Documentation: Fix omission in struct kvm_vcpu_events
  KVM: selftests: add Enlightened VMCS test
  ...
2018-10-25 17:57:35 -07:00
Linus Torvalds
62606c224d Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Remove VLA usage
   - Add cryptostat user-space interface
   - Add notifier for new crypto algorithms

  Algorithms:
   - Add OFB mode
   - Remove speck

  Drivers:
   - Remove x86/sha*-mb as they are buggy
   - Remove pcbc(aes) from x86/aesni
   - Improve performance of arm/ghash-ce by up to 85%
   - Implement CTS-CBC in arm64/aes-blk, faster by up to 50%
   - Remove PMULL based arm64/crc32 driver
   - Use PMULL in arm64/crct10dif
   - Add aes-ctr support in s5p-sss
   - Add caam/qi2 driver

  Others:
   - Pick better transform if one becomes available in crc-t10dif"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (124 commits)
  crypto: chelsio - Update ntx queue received from cxgb4
  crypto: ccree - avoid implicit enum conversion
  crypto: caam - add SPDX license identifier to all files
  crypto: caam/qi - simplify CGR allocation, freeing
  crypto: mxs-dcp - make symbols 'sha1_null_hash' and 'sha256_null_hash' static
  crypto: arm64/aes-blk - ensure XTS mask is always loaded
  crypto: testmgr - fix sizeof() on COMP_BUF_SIZE
  crypto: chtls - remove set but not used variable 'csk'
  crypto: axis - fix platform_no_drv_owner.cocci warnings
  crypto: x86/aes-ni - fix build error following fpu template removal
  crypto: arm64/aes - fix handling sub-block CTS-CBC inputs
  crypto: caam/qi2 - avoid double export
  crypto: mxs-dcp - Fix AES issues
  crypto: mxs-dcp - Fix SHA null hashes and output length
  crypto: mxs-dcp - Implement sha import/export
  crypto: aegis/generic - fix for big endian systems
  crypto: morus/generic - fix for big endian systems
  crypto: lrw - fix rebase error after out of bounds fix
  crypto: cavium/nitrox - use pci_alloc_irq_vectors() while enabling MSI-X.
  crypto: cavium/nitrox - NITROX command queue changes.
  ...
2018-10-25 16:43:35 -07:00
Linus Torvalds
4dcb9239da Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timekeeping updates from Thomas Gleixner:
 "The timers and timekeeping departement provides:

   - Another large y2038 update with further preparations for providing
     the y2038 safe timespecs closer to the syscalls.

   - An overhaul of the SHCMT clocksource driver

   - SPDX license identifier updates

   - Small cleanups and fixes all over the place"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  tick/sched : Remove redundant cpu_online() check
  clocksource/drivers/dw_apb: Add reset control
  clocksource: Remove obsolete CLOCKSOURCE_OF_DECLARE
  clocksource/drivers: Unify the names to timer-* format
  clocksource/drivers/sh_cmt: Add R-Car gen3 support
  dt-bindings: timer: renesas: cmt: document R-Car gen3 support
  clocksource/drivers/sh_cmt: Properly line-wrap sh_cmt_of_table[] initializer
  clocksource/drivers/sh_cmt: Fix clocksource width for 32-bit machines
  clocksource/drivers/sh_cmt: Fixup for 64-bit machines
  clocksource/drivers/sh_tmu: Convert to SPDX identifiers
  clocksource/drivers/sh_mtu2: Convert to SPDX identifiers
  clocksource/drivers/sh_cmt: Convert to SPDX identifiers
  clocksource/drivers/renesas-ostm: Convert to SPDX identifiers
  clocksource: Convert to using %pOFn instead of device_node.name
  tick/broadcast: Remove redundant check
  RISC-V: Request newstat syscalls
  y2038: signal: Change rt_sigtimedwait to use __kernel_timespec
  y2038: socket: Change recvmmsg to use __kernel_timespec
  y2038: sched: Change sched_rr_get_interval to use __kernel_timespec
  y2038: utimes: Rework #ifdef guards for compat syscalls
  ...
2018-10-25 11:14:36 -07:00
Linus Torvalds
ba9f6f8954 Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull siginfo updates from Eric Biederman:
 "I have been slowly sorting out siginfo and this is the culmination of
  that work.

  The primary result is in several ways the signal infrastructure has
  been made less error prone. The code has been updated so that manually
  specifying SEND_SIG_FORCED is never necessary. The conversion to the
  new siginfo sending functions is now complete, which makes it
  difficult to send a signal without filling in the proper siginfo
  fields.

  At the tail end of the patchset comes the optimization of decreasing
  the size of struct siginfo in the kernel from 128 bytes to about 48
  bytes on 64bit. The fundamental observation that enables this is by
  definition none of the known ways to use struct siginfo uses the extra
  bytes.

  This comes at the cost of a small user space observable difference.
  For the rare case of siginfo being injected into the kernel only what
  can be copied into kernel_siginfo is delivered to the destination, the
  rest of the bytes are set to 0. For cases where the signal and the
  si_code are known this is safe, because we know those bytes are not
  used. For cases where the signal and si_code combination is unknown
  the bits that won't fit into struct kernel_siginfo are tested to
  verify they are zero, and the send fails if they are not.

  I made an extensive search through userspace code and I could not find
  anything that would break because of the above change. If it turns out
  I did break something it will take just the revert of a single change
  to restore kernel_siginfo to the same size as userspace siginfo.

  Testing did reveal dependencies on preferring the signo passed to
  sigqueueinfo over si->signo, so bit the bullet and added the
  complexity necessary to handle that case.

  Testing also revealed bad things can happen if a negative signal
  number is passed into the system calls. Something no sane application
  will do but something a malicious program or a fuzzer might do. So I
  have fixed the code that performs the bounds checks to ensure negative
  signal numbers are handled"

* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (80 commits)
  signal: Guard against negative signal numbers in copy_siginfo_from_user32
  signal: Guard against negative signal numbers in copy_siginfo_from_user
  signal: In sigqueueinfo prefer sig not si_signo
  signal: Use a smaller struct siginfo in the kernel
  signal: Distinguish between kernel_siginfo and siginfo
  signal: Introduce copy_siginfo_from_user and use it's return value
  signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE
  signal: Fail sigqueueinfo if si_signo != sig
  signal/sparc: Move EMT_TAGOVF into the generic siginfo.h
  signal/unicore32: Use force_sig_fault where appropriate
  signal/unicore32: Generate siginfo in ucs32_notify_die
  signal/unicore32: Use send_sig_fault where appropriate
  signal/arc: Use force_sig_fault where appropriate
  signal/arc: Push siginfo generation into unhandled_exception
  signal/ia64: Use force_sig_fault where appropriate
  signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn
  signal/ia64: Use the generic force_sigsegv in setup_frame
  signal/arm/kvm: Use send_sig_mceerr
  signal/arm: Use send_sig_fault where appropriate
  signal/arm: Use force_sig_fault where appropriate
  ...
2018-10-24 11:22:39 +01:00
Linus Torvalds
0200fbdd43 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking and misc x86 updates from Ingo Molnar:
 "Lots of changes in this cycle - in part because locking/core attracted
  a number of related x86 low level work which was easier to handle in a
  single tree:

   - Linux Kernel Memory Consistency Model updates (Alan Stern, Paul E.
     McKenney, Andrea Parri)

   - lockdep scalability improvements and micro-optimizations (Waiman
     Long)

   - rwsem improvements (Waiman Long)

   - spinlock micro-optimization (Matthew Wilcox)

   - qspinlocks: Provide a liveness guarantee (more fairness) on x86.
     (Peter Zijlstra)

   - Add support for relative references in jump tables on arm64, x86
     and s390 to optimize jump labels (Ard Biesheuvel, Heiko Carstens)

   - Be a lot less permissive on weird (kernel address) uaccess faults
     on x86: BUG() when uaccess helpers fault on kernel addresses (Jann
     Horn)

   - macrofy x86 asm statements to un-confuse the GCC inliner. (Nadav
     Amit)

   - ... and a handful of other smaller changes as well"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
  locking/lockdep: Make global debug_locks* variables read-mostly
  locking/lockdep: Fix debug_locks off performance problem
  locking/pvqspinlock: Extend node size when pvqspinlock is configured
  locking/qspinlock_stat: Count instances of nested lock slowpaths
  locking/qspinlock, x86: Provide liveness guarantee
  x86/asm: 'Simplify' GEN_*_RMWcc() macros
  locking/qspinlock: Rework some comments
  locking/qspinlock: Re-order code
  locking/lockdep: Remove duplicated 'lock_class_ops' percpu array
  x86/defconfig: Enable CONFIG_USB_XHCI_HCD=y
  futex: Replace spin_is_locked() with lockdep
  locking/lockdep: Make class->ops a percpu counter and move it under CONFIG_DEBUG_LOCKDEP=y
  x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs
  x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs
  x86/extable: Macrofy inline assembly code to work around GCC inlining bugs
  x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops
  x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs
  x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs
  x86/refcount: Work around GCC inlining bug
  x86/objtool: Use asm macros to work around GCC inlining bugs
  ...
2018-10-23 13:08:53 +01:00
Linus Torvalds
e2b623fbe6 s390 updates for the 4.20 merge window
- Improved access control for the zcrypt driver, multiple device nodes
    can now be created with different access control lists
 
  - Extend the pkey API to provide random protected keys, this is useful
    for encrypted swap device with ephemeral protected keys
 
  - Add support for virtually mapped kernel stacks
 
  - Rework the early boot code, this moves the memory detection into the
    boot code that runs prior to decompression.
 
  - Add KASAN support
 
  - Bug fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJbzXKpAAoJEDjwexyKj9rg98YH/jZ5/kEYV44JsACroTNBC782
 6QLCvoCvSgXUAqRwnIfnxrcjrUVNW2aK6rOSsI/I8rQDsSA3boJ7FimoEI2BsUZG
 dcMy0hC47AYB7yKREQX3gdDEj8f0bn8v2ize5F6gwLkIx0A+aBUSivRQeYMaF8sn
 N/5OkSJwjCb+ZkNmDa3SHif+hC5+iL+q1hfuBdQkeCBok9pAqhyosRkgLe8CQgUV
 HGrvaWJ4FudIpg4tu2jL2OsNoZFX2pK5d+Up886+KGKQEUfiXKYtdmzX17Vd7PIk
 Vkf7EWUipzIA7UtrJ6pljoFsrNa+83jm4j5Dgy0ohadCVUBYLORte3yEl4P1EoM=
 =MMf0
 -----END PGP SIGNATURE-----

Merge tag 's390-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Martin Schwidefsky:

 - Improved access control for the zcrypt driver, multiple device nodes
   can now be created with different access control lists

 - Extend the pkey API to provide random protected keys, this is useful
   for encrypted swap device with ephemeral protected keys

 - Add support for virtually mapped kernel stacks

 - Rework the early boot code, this moves the memory detection into the
   boot code that runs prior to decompression.

 - Add KASAN support

 - Bug fixes and cleanups

* tag 's390-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (83 commits)
  s390/pkey: move pckmo subfunction available checks away from module init
  s390/kasan: support preemptible kernel build
  s390/pkey: Load pkey kernel module automatically
  s390/perf: Return error when debug_register fails
  s390/sthyi: Fix machine name validity indication
  s390/zcrypt: fix broken zcrypt_send_cprb in-kernel api function
  s390/vmalloc: fix VMALLOC_START calculation
  s390/mem_detect: add missing include
  s390/dumpstack: print psw mask and address again
  s390/crypto: Enhance paes cipher to accept variable length key material
  s390/pkey: Introduce new API for transforming key blobs
  s390/pkey: Introduce new API for random protected key verification
  s390/pkey: Add sysfs attributes to emit secure key blobs
  s390/pkey: Add sysfs attributes to emit protected key blobs
  s390/pkey: Define protected key blob format
  s390/pkey: Introduce new API for random protected key generation
  s390/zcrypt: add ap_adapter_mask sysfs attribute
  s390/zcrypt: provide apfs failure code on type 86 error reply
  s390/zcrypt: zcrypt device driver cleanup
  s390/kasan: add support for mem= kernel parameter
  ...
2018-10-23 11:14:47 +01:00
Vasily Gorbik
cf3dbe5dac s390/kasan: support preemptible kernel build
When the kernel is built with:
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
"stfle" function used by kasan initialization code makes additional
call to preempt_count_add/preempt_count_sub. To avoid removing kasan
instrumentation from sched code where those functions leave split stfle
function and provide __stfle variant without preemption handling to be
used by Kasan.

Reported-by: Benjamin Block <bblock@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-22 08:37:45 +02:00
Thomas Richter
ec0c0bb489 s390/perf: Return error when debug_register fails
Return an error when the function debug_register() fails allocating
the debug handle.
Also remove the registered debug handle when the initialization fails
later on.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-19 08:18:16 +02:00
Janosch Frank
b5130dc222 s390/sthyi: Fix machine name validity indication
When running as a level 3 guest with no host provided sthyi support
sclp_ocf_cpc_name_copy() will only return zeroes. Zeroes are not a
valid group name, so let's not indicate that the group name field is
valid.

Also the group name is not dependent on stsi, let's not return based
on stsi before setting it.

Fixes: 95ca2cb579 ("KVM: s390: Add sthyi emulation")
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-15 12:17:00 +02:00
Paolo Bonzini
3d0d0d9b1d KVM: s390/vfio-ap: Fixes and enhancements for vfio-ap
- add tracing
 - fix a locking bug
 - make local functions and data static
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJbwKzMAAoJEBF7vIC1phx8DR0QALWLdmVtMioQeeoas9LYurI0
 VuFjM5QsH9hVkjDZIP7Y0titQz1L4WWqIwZVffHmQGL8saRr+fd/7gBwAReKgZVU
 OrZDtUpqS1PBsrKQx36MjWrZ5n4R5tzvW38xiPevEIBLq+rtQ1GbiCs3rtRwKlur
 uABquv9uYr8GHrmYa9bUvUpbVvHQvWz/h8T4cjgwAuN0NER6PzqMUzcZBt2Q9s29
 26ZQ+r7CZ2qklJvoB8UOsrWdsZhM58BaY+CJzrAsxD3OAnPJGILpTFW2dXIoVfkh
 LMuuuzl8Tl0ntwJKjifhut7/f9VX9ipTvmA53e2moq52UEA2mJoOzu1Ku/KAivLe
 4efycTIvYRV1UKE1JLWlS/5z6fNg9eG2CykSqRlrznEPiGNTPMY5JemtvrPINkcZ
 QrdbI6ou+grFXlfaG+KcS2iFOgrMqL1UWABiq1jJVW2RAK1ZeUBFHKVeJwKVKSeW
 p9xbh7jl7yIvQ8bfsO8P3LVFWK0EmxJt6oA7ln4X7O1Pbx1QBeH5ZMBsdqLJZFsT
 AQIT/p51JjIF6H4V8/jBCRYuR91IcD9CQlRR96y8zfHiaEYlvS1YFHuZdLZCJ7Ef
 LFLVIHXGfQLHoSN2r/8us7OmmVDJctKwPGLQuh2RcMFJPySm/iBBX9m57S7VWjAb
 87wLYmntUMcFHIWFgixT
 =hZdt
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390/vfio-ap: Fixes and enhancements for vfio-ap

- add tracing
- fix a locking bug
- make local functions and data static
2018-10-13 12:00:26 +02:00
Mikhail Zaslonko
5eaf436e0e s390/vmalloc: fix VMALLOC_START calculation
With the introduction of the module area on top of the vmalloc area, the
calculation of VMALLOC_START in setup_memory_end() function hasn't been
adjusted. As a result we got vmalloc area 2 Gb (MODULES_LEN) smaller than
it should be and the preceding vmemmap area got extra memory instead.
The patch fixes this calculation error although there were no visible
negative effects.
Apart from that, change 'tmp' variable to 'vmemmap' in memory_end
calculation for better readability.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-11 17:02:38 +02:00
Greg Kroah-Hartman
3d647e6268 s390 fixes for 4.19-rc8
Four more patches for 4.19
 
  - Fix resume after suspend-to-disk if resume-CPU != suspend-CPU
 
  - Fix vfio-ccw check for pinned pages
 
  - Two patches to avoid a usercopy-whitelist warning in vfio-ccw
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJbvEC8AAoJEDjwexyKj9rg9vEIANHnOfx9qzM6grkvc3hH4kTR
 VSZ2kZE+bQOqmxUF2XiOuBZkjX6EWZKmWwL8p+5uIydxkqmUvDtG2hYqo7GKkddP
 e3Qze+pSNz4uU+LT4t3wOUJfL1R1Pgrdea7rSr1DwbQQ8K4FiEImMr9pDfVPaHjq
 XDnkpIwavG1s4Lxlkc6IbGHX0tYt2wp2lXO/QEp8sLjqAdUBnOiZhA3Ssy8veS+M
 Q0u0Vgara5enhq9UDevr6e+jJ5w1qT4SZ0T38eif4FJSJMSZpWFe9qI92gKCIv9U
 3oBZM/Qof0He9ow/Yl57rgxLmt8VGXiO5rL+VfT7j0H6kGkmao79SCvgyon0R+M=
 =/FjH
 -----END PGP SIGNATURE-----

Merge tag 's390-4.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Martin writes:
  "s390 fixes for 4.19-rc8

   Four more patches for 4.19:
    - Fix resume after suspend-to-disk if resume-CPU != suspend-CPU
    - Fix vfio-ccw check for pinned pages
    - Two patches to avoid a usercopy-whitelist warning in vfio-ccw"

* tag 's390-4.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/cio: Fix how vfio-ccw checks pinned pages
  s390/cio: Refactor alloc of ccw_io_region
  s390/cio: Convert ccw_io_region to pointer
  s390/hibernate: fix error handling when suspend cpu != resume cpu
2018-10-10 08:44:35 +02:00
Heiko Carstens
c72251ad87 s390/mem_detect: add missing include
Fix this allnoconfig build breakage:

arch/s390/boot/mem_detect.c: In function 'tprot':
arch/s390/boot/mem_detect.c:122:12: error: 'EFAULT' undeclared (first use in this function)

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-10 07:37:22 +02:00
Heiko Carstens
e494990e7b s390/dumpstack: print psw mask and address again
With pointer obfuscation the output of show_registers() became quite useless:

Krnl PSW : (____ptrval____) (____ptrval____) (__list_add_valid+0x98/0xa8)

In order to print the psw mask and address use %px instead of %p.
And the output looks again like this:

Krnl PSW : 0404d00180000000 00000000007c0dd0 (__list_add_valid+0x98/0xa8)

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-10 07:37:20 +02:00
Ingo Franzki
52a34b34d4 s390/crypto: Enhance paes cipher to accept variable length key material
Enhance the paes_s390 kernel module to allow the paes cipher to
accept variable length key material. The key material accepted by
the paes cipher is a key blob of various types. As of today, two
key blob types are supported: CCA secure key blobs and protected
key blobs.

Signed-off-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-10 07:37:20 +02:00
Ingo Franzki
fb1136d658 s390/pkey: Introduce new API for transforming key blobs
Introduce a new ioctl API and in-kernel API to transform
a variable length key blob of any supported type into a
protected key.

Transforming a secure key blob uses the already existing
function pkey_sec2protk().
Transforming a protected key blob also verifies if the
protected key is still valid. If not, -ENODEV is returned.

Both APIs are described in detail in the header files
arch/s390/include/asm/pkey.h and arch/s390/include/uapi/asm/pkey.h.

Signed-off-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-10 07:37:19 +02:00
Ingo Franzki
cb26b9ff71 s390/pkey: Introduce new API for random protected key verification
Introduce a new ioctl API and in-kernel API to verify if a
random protected key is still valid. A protected key is
invalid when its wrapping key verification pattern does not
match the verification pattern of the LPAR. Each time an LPAR
is activated, a new LPAR wrapping key is generated and the
wrapping key verification pattern is updated.
Both APIs are described in detail in the header files
arch/s390/include/asm/pkey.h and arch/s390/include/uapi/asm/pkey.h.

Signed-off-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-10 07:37:18 +02:00
Ingo Franzki
a45a5c7d36 s390/pkey: Introduce new API for random protected key generation
This patch introduces a new ioctl API and in-kernel API to
generate a random protected key. The protected key is generated
in a way that the effective clear key is never exposed in clear.
Both APIs are described in detail in the header files
arch/s390/include/asm/pkey.h and arch/s390/include/uapi/asm/pkey.h.

Signed-off-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:38 +02:00
Harald Freudenberger
ee410de890 s390/zcrypt: zcrypt device driver cleanup
Some cleanup in the s390 zcrypt device driver:
- Removed fragments of pcixx crypto card code. This code
  can't be reached anymore because the hardware detection
  function does not recognize crypto cards < CEX2 since
  commit f56545430736 ("s390/zcrypt: Introduce QACT support
  for AP bus devices.")
- Rename of some files and driver names which where still
  reflecting pcixx support to cex2a/cex2c.
- Removed all the zcrypt version strings in the file headers.
  There is only one place left - the zcrypt.h header file is
  now the only place for zcrypt device driver version info.
- Zcrypt version pump up from 2.2.0 to 2.2.1.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:35 +02:00
Vasily Gorbik
78333d1f90 s390/kasan: add support for mem= kernel parameter
Handle mem= kernel parameter in kasan to limit physical memory.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:35 +02:00
Vasily Gorbik
12e55fa194 s390/kasan: optimize kasan vmemmap allocation
Kasan implementation now supports memory hotplug operations. For that
reason regions of initially standby memory are now skipped from
shadow mapping and are mapped/unmapped dynamically upon bringing
memory online/offline.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:34 +02:00
Vasily Gorbik
296352397d s390/kasan: avoid kasan crash with standby memory defined
Kasan early memory allocator simply chops off memory blocks from the
end of the physical memory. Reuse mem_detect info to identify actual
online memory end rather than using max_physmem_end. This allows to run
the kernel with kasan enabled and standby memory defined.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:33 +02:00
Vasily Gorbik
19733fe872 s390/head: avoid doubling early boot stack size under KASAN
Early boot stack uses predefined 4 pages of memory 0x8000-0xC000. This
stack is used to run not instumented decompressor/facilities
verification C code. It doesn't make sense to double its size when
the kernel is built with KASAN support. BOOT_STACK_ORDER is introduced
to avoid that.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:32 +02:00
Vasily Gorbik
6cad0eb561 s390/mm: improve debugfs ptdump markers walking
This allows to print multiple markers when they happened to have the
same value.

...
0x001bfffff0100000-0x001c000000000000       255M PMD I
---[ Kasan Shadow End ]---
---[ vmemmap Area ]---
0x001c000000000000-0x001c000002000000        32M PMD RW X
...

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:31 +02:00
Vasily Gorbik
e006222b57 s390/mm: optimize debugfs ptdump kasan zero page walking
Kasan zero p4d/pud/pmd/pte are always filled in with corresponding
kasan zero entries. Walking kasan zero page backed area is time
consuming and unnecessary. When kasan zero p4d/pud/pmd is encountered,
it eventually points to the kasan zero page always with the same
attributes and nothing but it, therefore zero p4d/pud/pmd could
be jumped over.

Also adds a space between address range and pages number to separate
them from each other when pages number is huge.

0x0018000000000000-0x0018000010000000       256M PMD RW X
0x0018000010000000-0x001bfffff0000000 1073741312M PTE RO X
0x001bfffff0000000-0x001bfffff0001000         4K PTE RW X

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:30 +02:00
Vasily Gorbik
5dff03813f s390/kasan: add option for 4-level paging support
By default 3-level paging is used when the kernel is compiled with
kasan support. Add 4-level paging option to support systems with more
then 3TB of physical memory and to cover 4-level paging specific code
with kasan as well.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:29 +02:00
Vasily Gorbik
135ff16393 s390/kasan: free early identity mapping structures
Kasan initialization code is changed to populate persistent shadow
first, save allocator position into pgalloc_freeable and proceed with
early identity mapping creation. This way early identity mapping paging
structures could be freed at once after switching to swapper_pg_dir
when early identity mapping is not needed anymore.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:29 +02:00
Vasily Gorbik
5e78596329 s390/kasan: enable stack and global variables access checks
By defining KASAN_SHADOW_OFFSET in Kconfig stack and global variables
memory access check instrumentation is enabled. gcc version 4.9.2 or
newer is also required.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:28 +02:00
Vasily Gorbik
f4f0d32bfb s390/dumpstack: disable __dump_trace kasan instrumentation
Walking async_stack produces false positives. Disable __dump_trace
function instrumentation for now.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:27 +02:00
Vasily Gorbik
ac1256f826 s390/kasan: reipl and kexec support
Some functions from both arch/s390/kernel/ipl.c and
arch/s390/kernel/machine_kexec.c are called without DAT enabled
(or with and without DAT enabled code paths). There is no easy way
to partially disable kasan for those files without a substantial
rework. Disable kasan for both files for now.

To avoid disabling kasan for arch/s390/kernel/diag.c DAT flag is
enabled in diag308 call. pcpu_delegate which disables DAT is marked
with __no_sanitize_address to disable instrumentation for that one
function.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:27 +02:00
Vasily Gorbik
9e8df6daed s390/smp: kasan stack instrumentation support
smp_start_secondary function is called without DAT enabled. To avoid
disabling kasan instrumentation for entire arch/s390/kernel/smp.c
smp_start_secondary has been split in 2 parts. smp_start_secondary has
instrumentation disabled, it does minimal setup and enables DAT. Then
instrumentated __smp_start_secondary is called to do the rest.

__load_psw_mask function instrumentation has been disabled as well
to be able to call it from smp_start_secondary.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:26 +02:00
Vasily Gorbik
d58106c3ec s390/kasan: use noexec and large pages
To lower memory footprint and speed up kasan initialisation detect
EDAT availability and use large pages if possible. As we know how
much memory is needed for initialisation, another simplistic large
page allocator is introduced to avoid memory fragmentation.

Since facilities list is retrieved anyhow, detect noexec support and
adjust pages attributes. Handle noexec kernel option to avoid inconsistent
kasan shadow memory pages flags.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:24 +02:00
Vasily Gorbik
793213a82d s390/kasan: dynamic shadow mem allocation for modules
Move from modules area entire shadow memory preallocation to dynamic
allocation per module load.

This behaivior has been introduced for x86 with bebf56a1b: "This patch
also forces module_alloc() to return 8*PAGE_SIZE aligned address making
shadow memory handling ( kasan_module_alloc()/kasan_module_free() )
more simple. Such alignment guarantees that each shadow page backing
modules address space correspond to only one module_alloc() allocation"

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:23 +02:00
Vasily Gorbik
0dac8f6bc3 s390/mm: add kasan shadow to the debugfs pgtable dump
This change adds address space markers for kasan shadow memory.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:22 +02:00
Vasily Gorbik
b6cbe3e8bd s390/kasan: avoid user access code instrumentation
Kasan instrumentation adds "store" check for variables marked as
modified by inline assembly. With user pointers containing addresses
from another address space this produces false positives.

static inline unsigned long clear_user_xc(void __user *to, ...)
{
	asm volatile(
	...
	: "+a" (to) ...

User space access functions are wrapped by manually instrumented
functions in kasan common code, which should be sufficient to catch
errors. So, we just disable uaccess.o instrumentation altogether.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:21 +02:00
Vasily Gorbik
7fef92ccad s390/kasan: double the stack size
Kasan stack instrumentation pads stack variables with redzones, which
increases stack frames size significantly. Stack sizes are increased
from 16k to 32k in the code, as well as for the kernel stack overflow
detection option (CHECK_STACK).

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:21 +02:00
Vasily Gorbik
42db5ed860 s390/kasan: add initialization code and enable it
Kasan needs 1/8 of kernel virtual address space to be reserved as the
shadow area. And eventually it requires the shadow memory offset to be
known at compile time (passed to the compiler when full instrumentation
is enabled).  Any value picked as the shadow area offset for 3-level
paging would eat up identity mapping on 4-level paging (with 1PB
shadow area size). So, the kernel sticks to 3-level paging when kasan
is enabled. 3TB border is picked as the shadow offset.  The memory
layout is adjusted so, that physical memory border does not exceed
KASAN_SHADOW_START and vmemmap does not go below KASAN_SHADOW_END.

Due to the fact that on s390 paging is set up very late and to cover
more code with kasan instrumentation, temporary identity mapping and
final shadow memory are set up early. The shadow memory mapping is
later carried over to init_mm.pgd during paging_init.

For the needs of paging structures allocation and shadow memory
population a primitive allocator is used, which simply chops off
memory blocks from the end of the physical memory.

Kasan currenty doesn't track vmemmap and vmalloc areas.

Current memory layout (for 3-level paging, 2GB physical memory).

---[ Identity Mapping ]---
0x0000000000000000-0x0000000000100000
---[ Kernel Image Start ]---
0x0000000000100000-0x0000000002b00000
---[ Kernel Image End ]---
0x0000000002b00000-0x0000000080000000        2G <- physical memory border
0x0000000080000000-0x0000030000000000     3070G PUD I
---[ Kasan Shadow Start ]---
0x0000030000000000-0x0000030010000000      256M PMD RW X  <- shadow for 2G memory
0x0000030010000000-0x0000037ff0000000   523776M PTE RO NX <- kasan zero ro page
0x0000037ff0000000-0x0000038000000000      256M PMD RW X  <- shadow for 2G modules
---[ Kasan Shadow End ]---
0x0000038000000000-0x000003d100000000      324G PUD I
---[ vmemmap Area ]---
0x000003d100000000-0x000003e080000000
---[ vmalloc Area ]---
0x000003e080000000-0x000003ff80000000
---[ Modules Area ]---
0x000003ff80000000-0x0000040000000000        2G

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:20 +02:00
Vasily Gorbik
d0e2eb0a36 s390: add pgd_page primitive
Add pgd_page primitive which is required by kasan common code.
Also fixes typo in p4d_page definition.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:19 +02:00
Vasily Gorbik
34377d3cfb s390: introduce MAX_PTRS_PER_P4D
Kasan common code requires MAX_PTRS_PER_P4D definition, which in case
of s390 is always PTRS_PER_P4D.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:18 +02:00
Vasily Gorbik
fb594ec13e s390/kasan: replace some memory functions
Follow the common kasan approach:

    "KASan replaces memory functions with manually instrumented
    variants.  Original functions declared as weak symbols so strong
    definitions in mm/kasan/kasan.c could replace them. Original
    functions have aliases with '__' prefix in name, so we could call
    non-instrumented variant if needed."

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:18 +02:00
Vasily Gorbik
0a9b40911b s390/kasan: avoid instrumentation of early C code
Instrumented C code cannot run without the kasan shadow area. Exempt
source code files from kasan which are running before / used during
kasan initialization.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:17 +02:00
Vasily Gorbik
3484984585 s390/kasan: avoid vdso instrumentation
vdso is mapped into user space processes, which won't have kasan
shodow mapped.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:16 +02:00
Vasily Gorbik
75f195420a s390/mm: add missing pfn_to_kaddr helper
kasan common code uses pfn_to_kaddr, which is defined by many other
architectures. Adding it as well to avoid a build error.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:15 +02:00
Vasily Gorbik
49698745e5 s390: move ipl block and cmd line handling to early boot phase
To distinguish zfcpdump case and to be able to parse some of the kernel
command line arguments early (e.g. mem=) ipl block retrieval and command
line construction code is moved to the early boot phase.

"memory_end" is set up correctly respecting "mem=" and hsa_size in case
of the zfcpdump.

arch/s390/boot/string.c is introduced to provide string handling and
command line parsing functions to early boot phase code for the compressed
kernel image case.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:14 +02:00
Vasily Gorbik
b09decfd99 s390/sclp: introduce sclp_early_get_hsa_size
Introduce sclp_early_get_hsa_size function to be used during early
memory detection. This function allows to find a memory limit imposed
during zfcpdump.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:13 +02:00
Vasily Gorbik
f01b8bca08 s390/mem_detect: add info source debug print
Print mem_detect info source when memblock=debug is specified.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:13 +02:00
Vasily Gorbik
54c57795e8 s390/mem_detect: replace tprot loop with binary search
In a situation when other memory detection methods are not available
(no SCLP and no z/VM diag260), continuous online memory is assumed.
Replacing tprot loop with faster binary search, as only online memory
end has to be found.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:12 +02:00
Vasily Gorbik
cd45c99561 s390/mem_detect: use SCLP info for continuous memory detection
When neither SCLP storage info, nor z/VM diag260 "storage configuration"
are available assume a continuous online memory of size specified by
SCLP info.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:11 +02:00
Vasily Gorbik
6e98e64329 s390/mem_detect: introduce z/VM specific diag260 call
In the case when z/VM memory is defined with "define storage config"
command, SCLP storage info is not available. Utilize diag260 "storage
configuration" call, to get information about z/VM specific guest memory
definitions with potential memory holes.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:10 +02:00
Vasily Gorbik
fddbaa5c42 s390/mem_detect: introduce SCLP storage info
SCLP storage info allows to detect continuous and non-continuous online
memory under LPAR, z/VM and KVM, when standby memory is defined.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:09 +02:00
Vasily Gorbik
251b72a440 s390: introduce .boot.data section compile time validation
Make sure that .boot.data sections of vmlinux and
arch/s390/compressed/vmlinux match before producing the compressed kernel
image. Symbols presence, order and sizes are cross-checked.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:08 +02:00
Vasily Gorbik
6966d604e2 s390/mem_detect: move tprot loop to early boot phase
Move memory detection to early boot phase. To store online memory
regions "struct mem_detect_info" has been introduced together with
for_each_mem_detect_block iterator. mem_detect_info is later converted
to memblock.

Also introduces sclp_early_get_meminfo function to get maximum physical
memory and maximum increment number.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:08 +02:00
Vasily Gorbik
17aacfbfa1 s390/sclp: move sclp_early_read_info to sclp_early_core.c
To enable early online memory detection sclp_early_read_info has
been moved to sclp_early_core.c. sclp_info_sccb has been made a part
of .boot.data, which allows to reuse it later during early kernel
startup and make sclp_early_read_info call just once.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:07 +02:00
Vasily Gorbik
d1b52a4388 s390: introduce .boot.data section
Introduce .boot.data section which is "shared" between the decompressor
code and the decompressed kernel. The decompressor will store values in
it, and copy over to the decompressed image before starting it. This
method allows to avoid using pre-defined addresses and other hacks to
pass values between those boot phases.

.boot.data section is a part of init data, and will be freed after kernel
initialization is complete.

For uncompressed kernel image, .boot.data section is basically the same
as .init.data

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:06 +02:00
Vasily Gorbik
7516fc11e4 s390/decompressor: clean up and rename compressed/misc.c
Since compressed/misc.c is conditionally compiled move error reporting
code to boot/main.c. With that being done compressed/misc.c has no
"miscellaneous" functions left and is all about plain decompression
now. Rename it accordingly.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:05 +02:00
Vasily Gorbik
15426ca43d s390: rescue initrd as early as possible
To avoid multi-stage initrd rescue operation and to simplify
assumptions during early memory allocations move initrd at some final
safe destination as early as possible. This would also allow us to
drop .bss usage restrictions for some files.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:05 +02:00
Vasily Gorbik
a2ac1bb1f3 s390/decompressor: get rid of .bss usage
Using .bss in early code should be avoided. It might overlay initrd
image or not yet be initialized. Clean up the last couple of places in
the decompressor's code where .bss is used and enfore no .bss usage
check on boot/compressed/misc.c. In particular:
- initializing free_mem_ptr and free_mem_end_ptr with values guarantee
that these variables won't end up in the .bss section.
- define STATIC_RW_DATA to go into .data section.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:03 +02:00
Vasily Gorbik
369f91c374 s390/decompressor: rework uncompressed image info collection
The kernel decompressor has to know several bits of information about
uncompressed image. Currently this info is collected by running "nm" on
uncompressed vmlinux + "sed" and producing sizes.h file. This method
worked well, but it has several disadvantages. Obscure symbols name
pattern matching is fragile. Adding new values makes pattern even
longer. Logic is spread across code and make file. Limited ability to
adjust symbols values (currently magic lma value of 0x100000 is always
subtracted). Apart from that same pieces of information (and more)
would be needed for early memory detection and features like KASLR
outside of boot/compressed/ folder where sizes.h is generated.

To overcome limitations new "struct vmlinux_info" has been introduced
to include values needed for the decompressor and the rest of the
boot code. The only static instance of vmlinux_info is produced during
vmlinux link step by filling in struct fields by the linker (like it is
done with input_data in boot/compressed/vmlinux.scr.lds.S). This way
individual values could be adjusted with all the knowledge linker has
and arithmetic it supports. Later .vmlinux.info section (which contains
struct vmlinux_info) is transplanted into the decompressor image and
dropped from uncompressed image altogether.

While doing that replace "compressed/vmlinux.scr.lds.S" linker
script (whose purpose is to rename .data section in piggy.o to
.rodata.compressed) with plain objcopy command. And simplify
decompressor's linker script.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:02 +02:00
Vasily Gorbik
8f75582a2f s390: remove decompressor's head.S
Decompressor's head.S provided "data mover" sole purpose of which has
been to safely move uncompressed kernel at 0x100000 and jump to it.

With current bzImage layout entire decompressor's code guaranteed to be
in a safe location under 0x100000, and hence could not be overwritten
during kernel move. For that reason head.S could be replaced with simple
memmove function. To do so introduce early boot code phase which is
executed from arch/s390/boot/head.S after "verify_facilities" and takes
care of optional kernel image decompression and transition to it.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:02 +02:00
Vasily Gorbik
32ce55a659 s390: unify stack size definitions
Remove STACK_ORDER and STACK_SIZE in favour of identical THREAD_SIZE_ORDER
and THREAD_SIZE definitions. THREAD_SIZE and THREAD_SIZE_ORDER naming is
misleading since it is used as general kernel stack size information. But
both those definitions are used in the common code and throughout
architectures specific code, so changing the naming is problematic.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:20:58 +02:00
Martin Schwidefsky
ce3dc44749 s390: add support for virtually mapped kernel stacks
With virtually mapped kernel stacks the kernel stack overflow detection
is now fault based, every stack has a guard page in the vmalloc space.
The panic_stack is renamed to nodat_stack and is used for all function
that need to run without DAT, e.g. memcpy_real or do_start_kdump.

The main effect is a reduction in the kernel image size as with vmap
stacks the old style overflow checking that adds two instructions per
function is not needed anymore. Result from bloat-o-meter:

add/remove: 20/1 grow/shrink: 13/26854 up/down: 2198/-216240 (-214042)

In regard to performance the micro-benchmark for fork has a hit of a
few microseconds, allocating 4 pages in vmalloc space is more expensive
compare to an order-2 page allocation. But with real workload I could
not find a noticeable difference.

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:20:57 +02:00
Martin Schwidefsky
ff340d2472 s390: add stack switch helper
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:20:56 +02:00
Martin Schwidefsky
00e9e6645a s390/pfault: do not use stack buffers for hardware data
With CONFIG_VMAP_STACK=y the stack is allocated from the vmalloc space.
Data structures passed to a hardware or a hypervisor interface that
requires V=R can not be allocated on the stack anymore.

Make the init and fini pfault parameter blocks static variables.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:20:54 +02:00
Martin Schwidefsky
8ef9eda018 s390/hypfs: do not use stack buffers for hardware data
With CONFIG_VMAP_STACK=y the stack is allocated from the vmalloc space.
Data structures passed to a hardware or a hypervisor interface that
requires V=R can not be allocated on the stack anymore.

Use kmalloc to get memory for the hypsfs_diag304 structure.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:20:53 +02:00
Martin Schwidefsky
d36a928139 s390/appldata: do not use stack buffers for hardware data
With CONFIG_VMAP_STACK=y the stack is allocated from the vmalloc space.
Data structures passed to a hardware or a hypervisor interface that
requires V=R can not be allocated on the stack anymore.

Use kmalloc to get memory for the appldata_product_id and the
appldata_parameter_list structures.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:20:52 +02:00
Martin Schwidefsky
f689789a28 s390/appldata: pass parameter list pointer to appldata_asm
In preparation for CONFIG_VMAP_STACK=y move the allocation of the
struct appldata_parameter_list to the caller of appldata_asm().

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:20:50 +02:00
Christian Borntraeger
ed3054a302 Merge branch 'apv11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kernelorgnext 2018-10-08 12:14:54 +02:00
Julian Wiedmann
346e485d42 s390/ccwgroup: add get_ccwgroupdev_by_busid()
Provide function to find a ccwgroup device by its busid.

Acked-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-08 09:09:59 +02:00
Harald Freudenberger
00fab2350e s390/zcrypt: multiple zcrypt device nodes support
This patch is an extension to the zcrypt device driver to provide,
support and maintain multiple zcrypt device nodes. The individual
zcrypt device nodes can be restricted in terms of crypto cards,
domains and available ioctls. Such a device node can be used as a
base for container solutions like docker to control and restrict
the access to crypto resources.

The handling is done with a new sysfs subdir /sys/class/zcrypt.
Echoing a name (or an empty sting) into the attribute "create" creates
a new zcrypt device node. In /sys/class/zcrypt a new link will appear
which points to the sysfs device tree of this new device. The
attribute files "ioctlmask", "apmask" and "aqmask" in this directory
are used to customize this new zcrypt device node instance. Finally
the zcrypt device node can be destroyed by echoing the name into
/sys/class/zcrypt/destroy. The internal structs holding the device
info are reference counted - so a destroy will not hard remove a
device but only marks it as removable when the reference counter drops
to zero.

The mask values are bitmaps in big endian order starting with bit 0.
So adapter number 0 is the leftmost bit, mask is 0x8000...  The sysfs
attributes accept 2 different formats:
* Absolute hex string starting with 0x like "0x12345678" does set
  the mask starting from left to right. If the given string is shorter
  than the mask it is padded with 0s on the right. If the string is
  longer than the mask an error comes back (EINVAL).
* Relative format - a concatenation (done with ',') of the
  terms +<bitnr>[-<bitnr>] or -<bitnr>[-<bitnr>]. <bitnr> may be any
  valid number (hex, decimal or octal) in the range 0...255. Here are
  some examples:
    "+0-15,+32,-128,-0xFF"
    "-0-255,+1-16,+0x128"
    "+1,+2,+3,+4,-5,-7-10"

A simple usage examples:

  # create new zcrypt device 'my_zcrypt':
  echo "my_zcrypt" >/sys/class/zcrypt/create
  # go into the device dir of this new device
  echo "my_zcrypt" >create
  cd my_zcrypt/
  ls -l
  total 0
  -rw-r--r-- 1 root root 4096 Jul 20 15:23 apmask
  -rw-r--r-- 1 root root 4096 Jul 20 15:23 aqmask
  -r--r--r-- 1 root root 4096 Jul 20 15:23 dev
  -rw-r--r-- 1 root root 4096 Jul 20 15:23 ioctlmask
  lrwxrwxrwx 1 root root    0 Jul 20 15:23 subsystem -> ../../../../class/zcrypt
  ...
  # customize this zcrypt node clone
  # enable only adapter 0 and 2
  echo "0xa0" >apmask
  # enable only domain 6
  echo "+6" >aqmask
  # enable all 256 ioctls
  echo "+0-255" >ioctls
  # now the /dev/my_zcrypt may be used
  # finally destroy it
  echo "my_zcrypt" >/sys/class/zcrypt/destroy

Please note that a very similar 'filtering behavior' also applies to
the parent z90crypt device. The two mask attributes apmask and aqmask
in /sys/bus/ap act the very same for the z90crypt device node. However
the implementation here is totally different as the ap bus acts on
bind/unbind of queue devices and associated drivers but the effect is
still the same. So there are two filters active for each additional
zcrypt device node: The adapter/domain needs to be enabled on the ap
bus level and it needs to be active on the zcrypt device node level.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-08 09:09:58 +02:00
Pierre Morel
0e237e4469 KVM: s390: Tracing APCB changes
kvm_arch_crypto_set_masks is a new function to centralize
the setup the APCB masks inside the CRYCB SIE satellite.

To trace APCB mask changes, we add KVM_EVENT() tracing to
both kvm_arch_crypto_set_masks and kvm_arch_crypto_clear_masks.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <1538728270-10340-2-git-send-email-pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-10-05 13:10:18 +02:00
Christian Borntraeger
8e41bd5431 KVM: s390: fix locking for crypto setting error path
We need to unlock the kvm->lock mutex in the error case.

Reported-by: smatch
Fixes: 37940fb0b6 ("KVM: s390: device attrs to enable/disable AP interpretation")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-10-05 10:04:03 +02:00
Paolo Bonzini
dd5bd0a65f KVM: s390: Features for 4.20
- Initial version of AP crypto virtualization via vfio-mdev
 - Set the host program identifier
 - Optimize page table locking
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJbsxPQAAoJEBF7vIC1phx8TDoP/2zJTTf6s4Kc+jltNsFaaZyO
 rg5N6ZhL+YRpdtPB/H5Y07zt8MSAOfMMqFwzSJo2B+C/xs4BjVtTx6H7M/5AS4Rl
 /JC2xcjoVi11FzJ1EflfLlqOtPrenJmB+c7RrLy61xIYCY8VhM55u4epIjY/FWwA
 VlLVHIP7+9MBgDG6TNEuvAiFwwpM2axITzXw6vkjC/8CbRQz3cY+zvBqhVDq3KOO
 MLHSmBKLbrA940XhUlPQ1wDplGlZ5lobG6+pXnynCs8YBj12zEivNe4y9Z1v0XsM
 nKQZxkDK+q9LG7WyRU5uIA00+msFopGrUCsQd/S/HQA8wyJ6xYeLALQpNHgMR7ts
 Qiv4oj/2nd7qW8X0Fs25no0G5MtOSvHqNGKQ5pY09q8JAxmU1vnSNFR+KZuS+fX7
 YyUf+SeBAZqkSzXgI11nD4hyxyFX1SQiO5FPjPyE93fPdJ9fKaQv4A/wdsrt6+ca
 5GaE2RJIxhKfkr9dHWJXQBGkAuYS8PnJiNYUdati5aemTht71KCYuafRzYL/T0YG
 omuDHbsS0L0EniMIWaWqmwu7M1BLsnMLA8nLsMrCANBG1PWaebobP7HXeK1jK90b
 ODhzldX5r3wQcj0nVLfdA6UOiY0wyvHYyRNiq+EBO9FXHtrNpxjz2X2MmK2fhkE6
 EaDLlgLSpB8ZT6MZHsWA
 =XI83
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Features for 4.20
- Initial version of AP crypto virtualization via vfio-mdev
- Set the host program identifier
- Optimize page table locking
2018-10-04 17:12:45 +02:00
Eric W. Biederman
f283801851 signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE
Rework the defintion of struct siginfo so that the array padding
struct siginfo to SI_MAX_SIZE can be placed in a union along side of
the rest of the struct siginfo members.  The result is that we no
longer need the __ARCH_SI_PREAMBLE_SIZE or SI_PAD_SIZE definitions.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-10-03 16:46:43 +02:00
Ard Biesheuvel
57d1587703 s390/vmlinux.lds: Move JUMP_TABLE_DATA into output section
Commit e872267b8b ("jump_table: move entries into ro_after_init
region") moved the __jump_table input section into the __ro_after_init
output section, but inadvertently put the macro in the wrong place in
the s390 linker script. Let's fix that.

Fixes: e872267b8b ("jump_table: move entries into ro_after_init region")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: linux-s390@vger.kernel.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20180930164950.3841-1-ard.biesheuvel@linaro.org
2018-10-02 08:08:08 +02:00
Christian Borntraeger
55d09dd4c8 Merge branch 'apv11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kernelorgnext 2018-10-01 08:53:23 +02:00
David Hildenbrand
af4bf6c3d9 s390/mm: optimize locking without huge pages in gmap_pmd_op_walk()
Right now we temporarily take the page table lock in
gmap_pmd_op_walk() even though we know we won't need it (if we can
never have 1mb pages mapped into the gmap).

Let's make this a special case, so gmap_protect_range() and
gmap_sync_dirty_log_pmd() will not take the lock when huge pages are
not allowed.

gmap_protect_range() is called quite frequently for managing shadow
page tables in vSIE environments.

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Message-Id: <20180806155407.15252-1-david@redhat.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2018-10-01 08:52:24 +02:00
Collin Walling
67d49d52ae KVM: s390: set host program identifier
A host program identifier (HPID) provides information regarding the
underlying host environment. A level-2 (VM) guest will have an HPID
denoting Linux/KVM, which is set during VCPU setup. A level-3 (VM on a
VM) and beyond guest will have an HPID denoting KVM vSIE, which is set
for all shadow control blocks, overriding the original value of the
HPID.

Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Message-Id: <1535734279-10204-4-git-send-email-walling@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-10-01 08:51:42 +02:00
Tony Krowiak
112c24d4dc KVM: s390: CPU model support for AP virtualization
Introduces two new CPU model facilities to support
AP virtualization for KVM guests:

1. AP Query Configuration Information (QCI) facility is installed.

   This is indicated by setting facilities bit 12 for
   the guest. The kernel will not enable this facility
   for the guest if it is not set on the host.

   If this facility is not set for the KVM guest, then only
   APQNs with an APQI less than 16 will be used by a Linux
   guest regardless of the matrix configuration for the virtual
   machine. This is a limitation of the Linux AP bus.

2. AP Facilities Test facility (APFT) is installed.

   This is indicated by setting facilities bit 15 for
   the guest. The kernel will not enable this facility for
   the guest if it is not set on the host.

   If this facility is not set for the KVM guest, then no
   AP devices will be available to the guest regardless of
   the guest's matrix configuration for the virtual
   machine. This is a limitation of the Linux AP bus.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Farhan Ali <alifm@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20180925231641.4954-26-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Tony Krowiak
37940fb0b6 KVM: s390: device attrs to enable/disable AP interpretation
Introduces two new VM crypto device attributes (KVM_S390_VM_CRYPTO)
to enable or disable AP instruction interpretation from userspace
via the KVM_SET_DEVICE_ATTR ioctl:

* The KVM_S390_VM_CRYPTO_ENABLE_APIE attribute enables hardware
  interpretation of AP instructions executed on the guest.

* The KVM_S390_VM_CRYPTO_DISABLE_APIE attribute disables hardware
  interpretation of AP instructions executed on the guest. In this
  case the instructions will be intercepted and pass through to
  the guest.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20180925231641.4954-25-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Pierre Morel
9ee71f20cb KVM: s390: vsie: allow guest FORMAT-0 CRYCB on host FORMAT-2
When the guest schedules a SIE with a FORMAT-0 CRYCB,
we are able to schedule it in the host with a FORMAT-2
CRYCB if the host uses FORMAT-2

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Message-Id: <20180925231641.4954-24-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Pierre Morel
6b79de4b05 KVM: s390: vsie: allow guest FORMAT-1 CRYCB on host FORMAT-2
When the guest schedules a SIE with a CRYCB FORMAT-1 CRYCB,
we are able to schedule it in the host with a FORMAT-2 CRYCB
if the host uses FORMAT-2.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Message-Id: <20180925231641.4954-23-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Pierre Morel
c9ba8c2cd2 KVM: s390: vsie: allow guest FORMAT-0 CRYCB on host FORMAT-1
When the guest schedules a SIE with a FORMAT-0 CRYCB,
we are able to schedule it in the host with a FORMAT-1
CRYCB if the host uses FORMAT-1 or FORMAT-0.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Message-Id: <20180925231641.4954-22-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Pierre Morel
6ee7409820 KVM: s390: vsie: allow CRYCB FORMAT-0
When the host and the guest both use a FORMAT-0 CRYCB,
we copy the guest's FORMAT-0 APCB to a shadow CRYCB
for use by vSIE.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Message-Id: <20180925231641.4954-21-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Pierre Morel
19fd83a647 KVM: s390: vsie: allow CRYCB FORMAT-1
When the host and guest both use a FORMAT-1 CRYCB, we copy
the guest's FORMAT-0 APCB to a shadow CRYCB for use by
vSIE.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Message-Id: <20180925231641.4954-20-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Pierre Morel
56019f9aca KVM: s390: vsie: Allow CRYCB FORMAT-2
When the guest and the host both use CRYCB FORMAT-2,
we copy the guest's FORMAT-1 APCB to a FORMAT-1
shadow APCB.

This patch also cleans up the shadow_crycb() function.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Message-Id: <20180925231641.4954-19-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Pierre Morel
3af84def9c KVM: s390: vsie: Make use of CRYCB FORMAT2 clear
The comment preceding the shadow_crycb function is
misleading, we effectively accept FORMAT2 CRYCB in the
guest.

When using FORMAT2 in the host we do not need to or with
FORMAT1.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180925231641.4954-18-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Pierre Morel
d6f6959ac5 KVM: s390: vsie: Do the CRYCB validation first
We need to handle the validity checks for the crycb, no matter what the
settings for the keywrappings are. So lets move the keywrapping checks
after we have done the validy checks.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180925231641.4954-17-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Pierre Morel
6cc571b1b1 KVM: s390: Clear Crypto Control Block when using vSIE
When we clear the Crypto Control Block (CRYCB) used by a guest
level 2, the vSIE shadow CRYCB for guest level 3 must be updated
before the guest uses it.

We achieve this by using the KVM_REQ_VSIE_RESTART synchronous
request for each vCPU belonging to the guest to force the reload
of the shadow CRYCB before rerunning the guest level 3.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Message-Id: <20180925231641.4954-16-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Tony Krowiak
258287c994 s390: vfio-ap: implement mediated device open callback
Implements the open callback on the mediated matrix device.
The function registers a group notifier to receive notification
of the VFIO_GROUP_NOTIFY_SET_KVM event. When notified,
the vfio_ap device driver will get access to the guest's
kvm structure. The open callback must ensure that only one
mediated device shall be opened per guest.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Farhan Ali <alifm@linux.ibm.com>
Tested-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20180925231641.4954-12-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:09 +02:00
Kees Cook
531fa5d620 s390/crypto: Remove VLA usage of skcipher
In the quest to remove all stack VLA usage from the kernel[1], this
replaces struct crypto_skcipher and SKCIPHER_REQUEST_ON_STACK() usage
with struct crypto_sync_skcipher and SYNC_SKCIPHER_REQUEST_ON_STACK(),
which uses a fixed stack size.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-09-28 12:46:07 +08:00
Heiko Carstens
13ddb52c16 s390/jump_label: Switch to relative references
Enable support for relative references in jump_label entries.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jessica Yu <jeyu@kernel.org>
Link: https://lkml.kernel.org/r/20180919065144.25010-10-ard.biesheuvel@linaro.org
2018-09-27 17:56:49 +02:00
Ard Biesheuvel
e872267b8b jump_table: Move entries into ro_after_init region
The __jump_table sections emitted into the core kernel and into
each module consist of statically initialized references into
other parts of the code, and with the exception of entries that
point into init code, which are defused at post-init time, these
data structures are never modified.

So let's move them into the ro_after_init section, to prevent them
from being corrupted inadvertently by buggy code, or deliberately
by an attacker.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: https://lkml.kernel.org/r/20180919065144.25010-9-ard.biesheuvel@linaro.org
2018-09-27 17:56:49 +02:00
Tony Krowiak
42104598ef KVM: s390: interface to clear CRYCB masks
Introduces a new KVM function to clear the APCB0 and APCB1 in the guest's
CRYCB. This effectively clears all bits of the APM, AQM and ADM masks
configured for the guest. The VCPUs are taken out of SIE to ensure the
VCPUs do not get out of sync.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Farhan Ali <alifm@linux.ibm.com>
Tested-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20180925231641.4954-11-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-26 21:02:59 +02:00
Tony Krowiak
1fde573413 s390: vfio-ap: base implementation of VFIO AP device driver
Introduces a new AP device driver. This device driver
is built on the VFIO mediated device framework. The framework
provides sysfs interfaces that facilitate passthrough
access by guests to devices installed on the linux host.

The VFIO AP device driver will serve two purposes:

1. Provide the interfaces to reserve AP devices for exclusive
   use by KVM guests. This is accomplished by unbinding the
   devices to be reserved for guest usage from the zcrypt
   device driver and binding them to the VFIO AP device driver.

2. Implements the functions, callbacks and sysfs attribute
   interfaces required to create one or more VFIO mediated
   devices each of which will be used to configure the AP
   matrix for a guest and serve as a file descriptor
   for facilitating communication between QEMU and the
   VFIO AP device driver.

When the VFIO AP device driver is initialized:

* It registers with the AP bus for control of type 10 (CEX4
  and newer) AP queue devices. This limitation was imposed
  due to:

  1. A desire to keep the code as simple as possible;

  2. Some older models are no longer supported by the kernel
     and others are getting close to end of service.

  3. A lack of older systems on which to test older devices.

  The probe and remove callbacks will be provided to support
  the binding/unbinding of AP queue devices to/from the VFIO
  AP device driver.

* Creates a matrix device, /sys/devices/vfio_ap/matrix,
  to serve as the parent of the mediated devices created, one
  for each guest, and to hold the APQNs of the AP devices bound to
  the VFIO AP device driver.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Farhan Ali <alifm@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20180925231641.4954-5-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-26 20:45:51 +02:00
Tony Krowiak
e585b24aeb KVM: s390: refactor crypto initialization
This patch refactors the code that initializes and sets up the
crypto configuration for a guest. The following changes are
implemented via this patch:

1. Introduces a flag indicating AP instructions executed on
   the guest shall be interpreted by the firmware. This flag
   is used to set a bit in the guest's state description
   indicating AP instructions are to be interpreted.

2. Replace code implementing AP interfaces with code supplied
   by the AP bus to query the AP configuration.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <20180925231641.4954-4-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-26 20:45:20 +02:00
David Hildenbrand
3194cdb711 KVM: s390: introduce and use KVM_REQ_VSIE_RESTART
When we change the crycb (or execution controls), we also have to make sure
that the vSIE shadow datastructures properly consider the changed
values before rerunning the vSIE. We can achieve that by simply using a
VCPU request now.

This has to be a synchronous request (== handled before entering the
(v)SIE again).

The request will make sure that the vSIE handler is left, and that the
request will be processed (NOP), therefore forcing a reload of all
vSIE data (including rebuilding the crycb) when re-entering the vSIE
interception handler the next time.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20180925231641.4954-3-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-26 09:13:20 +02:00
David Hildenbrand
9ea5972865 KVM: s390: vsie: simulate VCPU SIE entry/exit
VCPU requests and VCPU blocking right now don't take care of the vSIE
(as it was not necessary until now). But we want to have synchronous VCPU
requests that will also be handled before running the vSIE again.

So let's simulate a SIE entry of the VCPU when calling the sie during
vSIE handling and check for PROG_ flags. The existing infrastructure
(e.g. exit_sie()) will then detect that the SIE (in form of the vSIE) is
running and properly kick the vSIE CPU, resulting in it leaving the vSIE
loop and therefore the vSIE interception handler, allowing it to handle
VCPU requests.

E.g. if we want to modify the crycb of the VCPU and make sure that any
masks also get applied to the VSIE crycb shadow (which uses masks from the
VCPU crycb), we will need a way to hinder the vSIE from running and make
sure to process the updated crycb before reentering the vSIE again.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20180925231641.4954-2-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-26 09:13:20 +02:00
Julian Wiedmann
ccc413f621 s390/qdio: clean up AOB handling
I've stumbled over this too many times now... AOBs are only ever used on
Output Queues. So in qdio_kick_handler(), move the call to their handler
into the Output-only path, and get rid of the convoluted contains_aobs()
helper. No functional change.

While at it, also remove
1. the unused sbal_state->aob field. For processing an async completion,
   upper-layer drivers get their AOB pointer from the CQ buffer.
2. an unused EXPORT for qdio_allocate_aob(). External users would have
   no way of passing an allocated AOB back into qdio.ko anyways...

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-09-20 13:20:29 +02:00
Vasily Gorbik
4e62d45885 s390: clean up stacks setup
Replace hard coded stack frame overhead values with STACK_FRAME_OVERHEAD
definition. Avoid unnecessary arithmetic instructions.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-09-20 13:20:29 +02:00
Vasily Gorbik
26f4414a45 s390/vdso: correct CFI annotations of vDSO functions
Correct stack frame overhead for 31-bit vdso, which should be 96 rather
then 160. This is done by reusing STACK_FRAME_OVERHEAD definition which
contains correct value based on build flags. This fixes stack unwinding
within vdso code for 31-bit processes. While at it replace all hard coded
stack frame overhead values with the same definition in vdso64 as well.

Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-09-20 13:20:29 +02:00
Vasily Gorbik
d1befa6582 s390/vdso: avoid 64-bit vdso mapping for compat tasks
vdso_fault used is_compat_task function (on s390 it tests "current"
thread_info flags) to distinguish compat tasks and map 31-bit vdso
pages. But "current" task might not correspond to mm context.

When 31-bit compat inferior is executed under gdb, gdb does
PTRACE_PEEKTEXT on vdso page, causing vdso_fault with "current" being
64-bit gdb process. So, 31-bit inferior ends up with 64-bit vdso mapped.

To avoid this problem a new compat_mm flag has been introduced into
mm context. This flag is used in vdso_fault and vdso_mremap instead
of is_compat_task.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-09-20 13:20:29 +02:00
Martin Schwidefsky
8e5a7627b5 s390: add initial 64-bit restart PSW
To be able to start a kernel image loaded into memory with a
PSW restart, place a 64-bit restart PSW at 0x1a0 in absolute
lowcore.

Suggested-by: Dominik Klein <dominik.klein@linux.ibm.com>
Tested-by: Dominik Klein <dominik.klein@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-09-20 13:20:28 +02:00
Jan Höppner
6779df406b s390/sclp: Allow to request adapter reset
The SCLP event 24 "Adapter Error Notification" supports three different
action qualifier of which 'adapter reset' is currently not enabled in
the sysfs interface. However, userspace tools might want to be able
to use the reset functionality as well. Enable the 'adapter reset'
qualifier.

Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-09-20 13:20:27 +02:00
Gerald Schaefer
55a5542a54 s390/hibernate: fix error handling when suspend cpu != resume cpu
The resume code checks if the resume cpu is the same as the suspend cpu.
If not, and if it is also not possible to switch to the suspend cpu, an
error message should be printed and the resume process should be stopped
by loading a disabled wait psw.

The current logic is broken in multiple ways, the message is never printed,
and the disabled wait psw never loaded because the kernel panics before that:
- sam31 and SIGP_SET_ARCHITECTURE to ESA mode is wrong, this will break
  on the first 64bit instruction in sclp_early_printk().
- The init stack should be used, but the stack pointer is not set up correctly
  (missing aghi %r15,-STACK_FRAME_OVERHEAD).
- __sclp_early_printk() checks the sclp_init_state. If it is not
  sclp_init_state_uninitialized, it simply returns w/o printing anything.
  In the resumed kernel however, sclp_init_state will never be uninitialized.

This patch fixes those issues by removing the sam31/ESA logic, adding a
correct init stack pointer, and also introducing sclp_early_printk_force()
to allow using sclp_early_printk() even when sclp_init_state is not
uninitialized.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-09-20 13:20:23 +02:00
Paolo Bonzini
cb5fb87a2f KVM: s390: Fixes for 4.19
- more fallout from the hugetlbfs enablement
 - bugfix for vma handling
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJbm68DAAoJEBF7vIC1phx8i3cP/R3UwXGrTq3u5+78PfUu+Nvm
 zTSgHID25/pCsytvRcK7SSJQt+h53RjuiQlOcrtSJZvq3Btt23gl+pFltqvWGZDM
 PYgy+LaaaoIXNeOJqwkwcnNsq4usQ0N1jQ8EgNUxEc2EBkffA6mKcTS0lWDm47AE
 Yxsyz0RsvwcbYCOj10s1b4D9h97+ZkqAC2RDnafzQOb54j/aOH7blsStNDXfdkte
 xHJ7mELOwAWre2QhD7OeTX1aCjBfKUJzWK2sPxm7qt5nEtH18o9F6j2Y/d8dnWg8
 kqthyyb+nH0aHgv3G+7sKBOp1Ra7ftHAVrrbh3m9hceaW5+VuEhWSN6zVZKVdWta
 4bjiWa9I1gKS/Wz479ztdbpt+koRAYSg5CLHAD7YsBR3hHtnAZwuRO38S4Xg+Tfb
 Hp4Lhf8gnq9yQqrcyfKWQKtU9mb84oNtEw/wQPUvr2tZyeoB5NbnVy5UDa9V1DRS
 FnF1/MIZ/MszkwbNEnhk+WWFy41h+uduizfXwQ/YO/xciEEdM2TaDHuLIpBegiSO
 4kFfHRSMxqbwNhQSsJI8dIFpSHqXrtn2p6JTsWdi94B0QBYaNm4XMNpgEZIs5CsE
 aNGhQa+IZ5jk6YsGMx4mqAmV2jN9bMEoKR/SaOvrSkPvAO9QxZP4V6W72YGoi6le
 RZK8Z+HGxg6qpeaAiG0t
 =TT85
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fixes for 4.19

- more fallout from the hugetlbfs enablement
- bugfix for vma handling
2018-09-18 15:12:51 +02:00
Linus Torvalds
1d176582c7 s390 fixes for 4.19-rc4
One fix for the zcrypt driver to correctly handle incomplete
 encryption/decryption operations.
 
 A cleanup for the aqmask/apmask parsing to avoid variable
 length arrays on the stack.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJbmkYUAAoJEDjwexyKj9rgrugH/21Uf1S0mkkRfDeYxD6lJva6
 zNmEZ3V+GXc8L/0CisFWQOcIU1fO+jozp9HPGkQxeTvAuqIfVhBRVoMMxiTRaZb3
 xdSBel8EvAGxlZq/6eq6fU59HPWGm+N53rC9J5MMQQqgpSmq8F2QeO5CoidflRh8
 bdio9cliLsjPu+3P2JU3noolhb/f577J3dgP4gKARRpOfh8vUI7NLU3Mham+7886
 ASjG8s/zr9spPnrErJusloOLDJt4M94J8KrIbB/WAT1wZv7GxClaGsCoyCuk4cDt
 TV8zIgMK9TChedwMOO0T8WMaxq+XJV+iI0dC1eZ7xNUlaIBw+UbifxCG335L3E0=
 =dHq1
 -----END PGP SIGNATURE-----

Merge tag 's390-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Martin Schwidefsky:

 - One fix for the zcrypt driver to correctly handle incomplete
   encryption/decryption operations.

 - A cleanup for the aqmask/apmask parsing to avoid variable length
   arrays on the stack.

* tag 's390-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/zcrypt: remove VLA usage from the AP bus
  s390/crypto: Fix return code checking in cbc_paes_crypt()
2018-09-13 16:22:24 -10:00
Janosch Frank
40ebdb8e59 KVM: s390: Make huge pages unavailable in ucontrol VMs
We currently do not notify all gmaps when using gmap_pmdp_xchg(), due
to locking constraints. This makes ucontrol VMs, which is the only VM
type that creates multiple gmaps, incompatible with huge pages. Also
we would need to hold the guest_table_lock of all gmaps that have this
vmaddr maped to synchronize access to the pmd.

ucontrol VMs are rather exotic and creating a new locking concept is
no easy task. Hence we return EINVAL when trying to active
KVM_CAP_S390_HPAGE_1M and report it as being not available when
checking for it.

Fixes: a4499382 ("KVM: s390: Add huge page enablement control")
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20180801112508.138159-1-frankja@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2018-09-12 14:46:37 +02:00
Janosch Frank
1843abd032 s390/mm: Check for valid vma before zapping in gmap_discard
Userspace could have munmapped the area before doing unmapping from
the gmap. This would leave us with a valid vmaddr, but an invalid vma
from which we would try to zap memory.

Let's check before using the vma.

Fixes: 1e133ab296 ("s390/mm: split arch/s390/mm/pgtable.c")
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <20180816082432.78828-1-frankja@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2018-09-12 14:46:37 +02:00
Janosch Frank
df88f3181f KVM: s390: Properly lock mm context allow_gmap_hpage_1m setting
We have to do down_write on the mm semaphore to set a bitfield in the
mm context.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Fixes: a4499382 ("KVM: s390: Add huge page enablement control")
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-04 11:40:26 +02:00
Pierre Morel
204c972456 KVM: s390: vsie: copy wrapping keys to right place
Copy the key mask to the right offset inside the shadow CRYCB

Fixes: bbeaa58b3 ("KVM: s390: vsie: support aes dea wrapping keys")
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Cc: stable@vger.kernel.org # v4.8+
Message-Id: <1535019956-23539-2-git-send-email-pmorel@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-04 11:26:11 +02:00
Janosch Frank
a11bdb1a6b KVM: s390: Fix pfmf and conditional skey emulation
We should not return with a lock.
We also have to increase the address when we do page clearing.

Fixes: bd096f6443 ("KVM: s390: Add skey emulation fault handling")
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Message-Id: <20180830081355.59234-1-frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-04 11:24:43 +02:00
Ingo Franzki
b81126e01a s390/crypto: Fix return code checking in cbc_paes_crypt()
The return code of cpacf_kmc() is less than the number of
bytes to process in case of an error, not greater.
The crypt routines for the other cipher modes already have
this correctly.

Cc: stable@vger.kernel.org # v4.11+
Fixes: 2793784307 ("s390/crypt: Add protected key AES module")
Signed-off-by: Ingo Franzki <ifranzki@linux.ibm.com>
Acked-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-09-04 10:58:17 +02:00