Commit Graph

2446 Commits

Author SHA1 Message Date
Martin Schwidefsky
152e9b8676 s390/vtime: steal time exponential moving average
To be able to judge the current overcommitment ratio for a CPU add
a lowcore field with the exponential moving average of the steal time.
The average is updated every tick.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-06 14:59:50 +01:00
Linus Torvalds
b1b988a6a0 Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull year 2038 updates from Thomas Gleixner:
 "Another round of changes to make the kernel ready for 2038. After lots
  of preparatory work this is the first set of syscalls which are 2038
  safe:

    403 clock_gettime64
    404 clock_settime64
    405 clock_adjtime64
    406 clock_getres_time64
    407 clock_nanosleep_time64
    408 timer_gettime64
    409 timer_settime64
    410 timerfd_gettime64
    411 timerfd_settime64
    412 utimensat_time64
    413 pselect6_time64
    414 ppoll_time64
    416 io_pgetevents_time64
    417 recvmmsg_time64
    418 mq_timedsend_time64
    419 mq_timedreceiv_time64
    420 semtimedop_time64
    421 rt_sigtimedwait_time64
    422 futex_time64
    423 sched_rr_get_interval_time64

  The syscall numbers are identical all over the architectures"

* 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  riscv: Use latest system call ABI
  checksyscalls: fix up mq_timedreceive and stat exceptions
  unicore32: Fix __ARCH_WANT_STAT64 definition
  asm-generic: Make time32 syscall numbers optional
  asm-generic: Drop getrlimit and setrlimit syscalls from default list
  32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option
  compat ABI: use non-compat openat and open_by_handle_at variants
  y2038: add 64-bit time_t syscalls to all 32-bit architectures
  y2038: rename old time and utime syscalls
  y2038: remove struct definition redirects
  y2038: use time32 syscall names on 32-bit
  syscalls: remove obsolete __IGNORE_ macros
  y2038: syscalls: rename y2038 compat syscalls
  x86/x32: use time64 versions of sigtimedwait and recvmmsg
  timex: change syscalls to use struct __kernel_timex
  timex: use __kernel_timex internally
  sparc64: add custom adjtimex/clock_adjtime functions
  time: fix sys_timer_settime prototype
  time: Add struct __kernel_timex
  time: make adjtime compat handling available for 32 bit
  ...
2019-03-05 14:08:26 -08:00
Linus Torvalds
3591b19511 s390 updates for the 5.1 merge window
- A copy of Arnds compat wrapper generation series
 
  - Pass information about the KVM guest to the host in form the control
    program code and the control program version code
 
  - Map IOV resources to support PCI physical functions on s390
 
  - Add vector load and store alignment hints to improve performance
 
  - Use the "jdd" constraint with gcc 9 to make jump labels working again
 
  - Remove amode workaround for old z/VM releases from the DCSS code
 
  - Add support for in-kernel performance measurements using the
    CPU measurement counter facility
 
  - Introduce a new PMU device cpum_cf_diag to capture counters and
    store thenn as event raw data.
 
  - Bug fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJcfh4QAAoJEDjwexyKj9rgXVAH/RzVbi3vznldujSNfCFTZKPu
 EmFFAZIfbhifW3szfylyOJL52pFhxjcWzY0hkFEkbs2t90sn8l1BNkDscYZtfNHC
 XvN3N9LsHyxOeyxvQuWLSio58qm+Lr1L0UrIhbMvqyAVkOLmIHvybFwi83OkMptm
 djoL8NbuNsAA2s26y2bZLNtU7FmOW5smJIlnt7H4dmK4SFylqZKS/EnUZxGDgn+7
 UrrTTOQUir0QZ8vraANsP1M0/LqPcd2YusLmj4jOdZ5Muc2Ch2AA991FofqdKShO
 /8cGlsIzwHWGgdnP/YDea5gbetvonayYduixKy3EnYpWQ9iogiBjH4G7QNxcncs=
 =v26J
 -----END PGP SIGNATURE-----

Merge tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Martin Schwidefsky:

 - A copy of Arnds compat wrapper generation series

 - Pass information about the KVM guest to the host in form the control
   program code and the control program version code

 - Map IOV resources to support PCI physical functions on s390

 - Add vector load and store alignment hints to improve performance

 - Use the "jdd" constraint with gcc 9 to make jump labels working again

 - Remove amode workaround for old z/VM releases from the DCSS code

 - Add support for in-kernel performance measurements using the CPU
   measurement counter facility

 - Introduce a new PMU device cpum_cf_diag to capture counters and store
   thenn as event raw data.

 - Bug fixes and cleanups

* tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits)
  Revert "s390/cpum_cf: Add kernel message exaplanations"
  s390/dasd: fix read device characteristic with CONFIG_VMAP_STACK=y
  s390/suspend: fix prefix register reset in swsusp_arch_resume
  s390: warn about clearing als implied facilities
  s390: allow overriding facilities via command line
  s390: clean up redundant facilities list setup
  s390/als: remove duplicated in-place implementation of stfle
  s390/cio: Use cpa range elsewhere within vfio-ccw
  s390/cio: Fix vfio-ccw handling of recursive TICs
  s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem
  s390/cpum_cf: Handle EBUSY return code from CPU counter facility reservation
  s390/cpum_cf: Add kernel message exaplanations
  s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace
  s390/cpum_cf: add ctr_stcctm() function
  s390/cpum_cf: move common functions into a separate file
  s390/cpum_cf: introduce kernel_cpumcf_avail() function
  s390/cpu_mf: replace stcctm5() with the stcctm() function
  s390/cpu_mf: add store cpu counter multiple instruction support
  s390/cpum_cf: Add minimal in-kernel interface for counter measurements
  s390/cpum_cf: introduce kernel_cpumcf_alert() to obtain measurement alerts
  ...
2019-03-05 11:13:10 -08:00
Martin Schwidefsky
c8e8ed386a s390/suspend: fix prefix register reset in swsusp_arch_resume
The reset of the prefix to zero in swsusp_arch_resume uses a 4 byte stack
slot. With CONFIG_VMAP_STACK=y this is now in the vmalloc area, this works
only with DAT enabled. Move the DAT disable in swsusp_arch_resume after
the prefix reset.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-01 16:23:27 +01:00
Vasily Gorbik
d8901f2b2d s390: clean up redundant facilities list setup
Facilities list in the lowcore is initially set up by verify_facilities
from als.c and later initializations are redundant, so cleaning them up.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-01 08:00:39 +01:00
Thomas Richter
47b7478583 s390/cpum_cf: Handle EBUSY return code from CPU counter facility reservation
Rservation of the CPU Measurement Counter facility may fail if
it is already in use by the cf_diag device driver.
This is indicated by a non zero return code (-EBUSY).
However this return code is ignored and the counter facility
may be used in parallel by different device drivers.

Handle the failing reservation and return an error to the
caller.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-22 09:19:58 +01:00
Thomas Richter
fe5908bccc s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace
Introduce a PMU device named cpum_cf_diag. It extracts the
values of all counters in all authorized counter sets and stores
them as event raw data. This is done with the STORE CPU COUNTER
MULTIPLE instruction to speed up access. All counter sets
fit into one buffer. The values of each counter are taken
when the event is started on the performance sub-system and when
the event is stopped.
This results in counter values available at the start and
at the end of the measurement time frame. The difference is
calculated for each counter. The differences of all
counters are then saved as event raw data in the perf.data
file.

The counter values are accompanied by the time stamps
when the counter set was started and when the counter set
was stopped. This data is part of a trailer entry which
describes the time frame, counter set version numbers,
CPU speed, and machine type for later analysis.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-22 09:19:56 +01:00
Hendrik Brueckner
7f5ac1a022 s390/cpum_cf: move common functions into a separate file
Move common functions of the couter facility support into a separate
file.

Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-22 09:19:55 +01:00
Hendrik Brueckner
869f4f98fa s390/cpum_cf: introduce kernel_cpumcf_avail() function
A preparation to move out common CPU-MF counter facility support
functions, first introduce a function that indicates whether the
support is ready to use.

Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-22 09:19:54 +01:00
Hendrik Brueckner
346d034d7f s390/cpu_mf: replace stcctm5() with the stcctm() function
Remove the stcctm5() function to extract counters from the MT-diagnostic
counter set with the stcctm() function.  For readability, introduce an
enum to map the counter sets names to respective numbers for the stcctm
instruction.

Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-22 09:19:53 +01:00
Hendrik Brueckner
26b8317f51 s390/cpum_cf: introduce kernel_cpumcf_alert() to obtain measurement alerts
During a __kernel_cpumcf_begin()/end() session, save measurement alerts
for the counter facility in the per-CPU cpu_cf_events variable.
Users can obtain and, optionally, clear the alerts by calling
kernel_cpumcf_alert() to specifically handle alerts.

Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-22 09:19:50 +01:00
Hendrik Brueckner
f944bcdf5b s390/cpu_mf: move struct cpu_cf_events and per-CPU variable to header file
Make the struct cpu_cf_events and the respective per-CPU variable available
to in-kernel users.  Access to this per-CPU variable shall be done between
the calls to __kernel_cpumcf_begin() and __kernel_cpumcf_end().

Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-22 09:19:49 +01:00
Hendrik Brueckner
f1c0b83173 s390/cpum_cf: rename per-CPU counter facility structure and variables
Rename the struct cpu_hw_events to cpu_cf_events and also the respective
per-CPU variable to make its name more clear.  No functional changes.

Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-22 09:19:48 +01:00
Hendrik Brueckner
3d33345aa3 s390/cpum_cf: prepare for in-kernel counter measurements
Prepare the counter facility support to be used by other in-kernel
users.  The first step introduces the __kernel_cpumcf_begin() and
__kernel_cpumcf_end() functions to reserve the counter facility
for doing measurements and to release after the measurements are
done.

Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-22 09:19:47 +01:00
Hendrik Brueckner
30e145f811 s390/cpum_cf: move counter set controls to a new header file
Move counter set specific controls and functions to the asm/cpu_mcf.h
header file containg all counter facility support definitions.  Also
adapt few variable names and header file includes.  No functional changes.

Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-22 09:19:46 +01:00
Martin Schwidefsky
86a86804e4 s390/setup: fix boot crash for machine without EDAT-1
The fix to make WARN work in the early boot code created a problem
on older machines without EDAT-1. The setup_lowcore_dat_on function
uses the pointer from lowcore_ptr[0] to set the DAT bit in the new
PSWs. That does not work if the kernel page table is set up with
4K pages as the prefix address maps to absolute zero.

To make this work the PSWs need to be changed with via address 0 in
form of the S390_lowcore definition.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Cornelia Huck <cohuck@redhat.com>
Fixes: 94f85ed3e2f8 ("s390/setup: fix early warning messages")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-20 09:48:31 +01:00
Martin Schwidefsky
8727638426 s390/setup: fix early warning messages
The setup_lowcore() function creates a new prefix page for the boot CPU.
The PSW mask for the system_call, external interrupt, i/o interrupt and
the program check handler have the DAT bit set in this new prefix page.

At the time setup_lowcore is called the system still runs without virtual
address translation, the paging_init() function creates the kernel page
table and loads the CR13 with the kernel ASCE.

Any code between setup_lowcore() and the end of paging_init() that has
a BUG or WARN statement will create a program check that can not be
handled correctly as there is no kernel page table yet.

To allow early WARN statements initially setup the lowcore with DAT off
and set the DAT bit only after paging_init() has completed.

Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-20 09:48:29 +01:00
Thomas Gleixner
41ea39101d y2038: Add time64 system calls
This series finally gets us to the point of having system calls with
 64-bit time_t on all architectures, after a long time of incremental
 preparation patches.
 
 There was actually one conversion that I missed during the summer,
 i.e. Deepa's timex series, which I now updated based the 5.0-rc1 changes
 and review comments.
 
 The following system calls are now added on all 32-bit architectures
 using the same system call numbers:
 
 403 clock_gettime64
 404 clock_settime64
 405 clock_adjtime64
 406 clock_getres_time64
 407 clock_nanosleep_time64
 408 timer_gettime64
 409 timer_settime64
 410 timerfd_gettime64
 411 timerfd_settime64
 412 utimensat_time64
 413 pselect6_time64
 414 ppoll_time64
 416 io_pgetevents_time64
 417 recvmmsg_time64
 418 mq_timedsend_time64
 419 mq_timedreceiv_time64
 420 semtimedop_time64
 421 rt_sigtimedwait_time64
 422 futex_time64
 423 sched_rr_get_interval_time64
 
 Each one of these corresponds directly to an existing system call
 that includes a 'struct timespec' argument, or a structure containing
 a timespec or (in case of clock_adjtime) timeval. Not included here
 are new versions of getitimer/setitimer and getrusage/waitid, which
 are planned for the future but only needed to make a consistent API
 rather than for correct operation beyond y2038. These four system
 calls are based on 'timeval', and it has not been finally decided
 what the replacement kernel interface will use instead.
 
 So far, I have done a lot of build testing across most architectures,
 which has found a number of bugs. Runtime testing so far included
 testing LTP on 32-bit ARM with the existing system calls, to ensure
 we do not regress for existing binaries, and a test with a 32-bit
 x86 build of LTP against a modified version of the musl C library
 that has been adapted to the new system call interface [3].
 This library can be used for testing on all architectures supported
 by musl-1.1.21, but it is not how the support is getting integrated
 into the official musl release. Official musl support is planned
 but will require more invasive changes to the library.
 
 Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/
 Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/
 Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [2]
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcXf7/AAoJEGCrR//JCVInPSUP/RhsQSCKMGtONB/vVICQhwep
 PybhzBSpHWFxszzTi6BEPN1zS9B069G9mDollRBYZCckyPqL/Bv6sI/vzQZdNk01
 Q6Nw92OnNE1QP8owZ5TjrZhpbtopWdqIXjsbGZlloUemvuJP2JwvKovQUcn5CPTQ
 jbnqU04CVyFFJYVxAnGJ+VSeWNrjW/cm/m+rhLFjUcwW7Y3aodxsPqPP6+K9hY9P
 yIWfcH42WBeEWGm1RSBOZOScQl4SGCPUAhFydl/TqyEQagyegJMIyMOv9wZ5AuTT
 xK644bDVmNsrtJDZDpx+J8hytXCk1LrnKzkHR/uK80iUIraF/8D7PlaPgTmEEjko
 XcrywEkvkXTVU3owCm2/sbV+8fyFKzSPipnNfN1JNxEX71A98kvMRtPjDueQq/GA
 Yh81rr2YLF2sUiArkc2fNpENT7EGhrh1q6gviK3FB8YDgj1kSgPK5wC/X0uolC35
 E7iC2kg4NaNEIjhKP/WKluCaTvjRbvV+0IrlJLlhLTnsqbA57ZKCCteiBrlm7wQN
 4csUtCyxchR9Ac2o/lj+Mf53z68Zv74haIROp18K2dL7ZpVcOPnA3XHeauSAdoyp
 wy2Ek6ilNvlNB+4x+mRntPoOsyuOUGv7JXzB9JvweLWUd9G7tvYeDJQp/0YpDppb
 K4UWcKnhtEom0DgK08vY
 =IZVb
 -----END PGP SIGNATURE-----

Merge tag 'y2038-new-syscalls' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038

Pull y2038 - time64 system calls from Arnd Bergmann:

This series finally gets us to the point of having system calls with 64-bit
time_t on all architectures, after a long time of incremental preparation
patches.

There was actually one conversion that I missed during the summer,
i.e. Deepa's timex series, which I now updated based the 5.0-rc1 changes
and review comments.

The following system calls are now added on all 32-bit architectures using
the same system call numbers:

403 clock_gettime64
404 clock_settime64
405 clock_adjtime64
406 clock_getres_time64
407 clock_nanosleep_time64
408 timer_gettime64
409 timer_settime64
410 timerfd_gettime64
411 timerfd_settime64
412 utimensat_time64
413 pselect6_time64
414 ppoll_time64
416 io_pgetevents_time64
417 recvmmsg_time64
418 mq_timedsend_time64
419 mq_timedreceiv_time64
420 semtimedop_time64
421 rt_sigtimedwait_time64
422 futex_time64
423 sched_rr_get_interval_time64

Each one of these corresponds directly to an existing system call that
includes a 'struct timespec' argument, or a structure containing a timespec
or (in case of clock_adjtime) timeval. Not included here are new versions
of getitimer/setitimer and getrusage/waitid, which are planned for the
future but only needed to make a consistent API rather than for correct
operation beyond y2038. These four system calls are based on 'timeval', and
it has not been finally decided what the replacement kernel interface will
use instead.

So far, I have done a lot of build testing across most architectures, which
has found a number of bugs. Runtime testing so far included testing LTP on
32-bit ARM with the existing system calls, to ensure we do not regress for
existing binaries, and a test with a 32-bit x86 build of LTP against a
modified version of the musl C library that has been adapted to the new
system call interface [3].  This library can be used for testing on all
architectures supported by musl-1.1.21, but it is not how the support is
getting integrated into the official musl release. Official musl support is
planned but will require more invasive changes to the library.

Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/
Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/
Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [2]
2019-02-10 21:24:43 +01:00
Thomas Gleixner
fd659cc095 arch: System call unification and cleanup
The system call tables have diverged a bit over the years, and a number
 of the recent additions never made it into all architectures, for one
 reason or another.
 
 This is an attempt to clean it up as far as we can without breaking
 compatibility, doing a number of steps:
 
 - Add system calls that have not yet been integrated into all
   architectures but that we definitely want there. This includes
   {,f}statfs64() and get{eg,eu,g,p,u,pp}id() on alpha, which have
   been missing traditionally.
 
 - The s390 compat syscall handling is cleaned up to be more like
   what we do on other architectures, while keeping the 31-bit
   pointer extension. This was merged as a shared branch by the
   s390 maintainers and is included here in order to base the other
   patches on top.
 
 - Add the separate ipc syscalls on all architectures that
   traditionally only had sys_ipc(). This version is done without
   support for IPC_OLD that is we have in sys_ipc. The
   new semtimedop_time64 syscall will only be added here, not
   in sys_ipc
 
 - Add syscall numbers for a couple of syscalls that we probably
   don't need everywhere, in particular pkey_* and rseq,
   for the purpose of symmetry: if it's in asm-generic/unistd.h,
   it makes sense to have it everywhere. I expect that any future
   system calls will get assigned on all platforms together, even
   when they appear to be specific to a single architecture.
 
 - Prepare for having the same system call numbers for any future
   calls. In combination with the generated tables, this hopefully
   makes it easier to add new calls across all architectures
   together.
 
 All of the above are technically separate from the y2038 work,
 but are done as preparation before we add the new 64-bit time_t
 system calls everywhere, providing a common baseline set of system
 calls.
 
 I expect that glibc and other libraries that want to use 64-bit
 time_t will require linux-5.1 kernel headers for building in
 the future, and at a much later point may also require linux-5.1
 or a later version as the minimum kernel at runtime. Having a
 common baseline then allows the removal of many architecture or
 kernel version specific workarounds.
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcXf6XAAoJEGCrR//JCVInIm4P/AlkMmQRa/B2ziWMW6PifPoI
 v18r44017rA1BPENyZvumJUdM5mDvNofOW8F2DYQ7Uiys2YtXenwe/Cf8LHn2n6c
 TMXGQryQpvNmfDCyU+0UjF8m2+poFMrL4aRTXtjODh1YTsPNgeDC+KFMCAAtZmZd
 cVbXFudtbdYKD/pgCX4SI1CWAMBiXe2e+ukPdJVr+iqusCMTApf+GOuyvDBZY9s/
 vURb+tIS87HZ/jehWfZFSuZt+Gu7b3ijUXNC8v9qSIxNYekw62vBNl6F09HE79uB
 Bv4OujAODqKvI9gGyydBzLJNzaMo0ryQdusyqcJHT7MY/8s+FwcYAXyTlQ3DbbB4
 2u/c+58OwJ9Zk12p4LXZRA47U+vRhQt2rO4+zZWs2txNNJY89ZvCm/Z04KOiu5Xz
 1Nnj607KGzthYRs2gs68AwzGGyf0uykIQ3RcaJLIBlX1Nd8BWO0ZgAguCvkXbQMX
 XNXJTd92HmeuKKpiO0n/M4/mCeP0cafBRPCZbKlHyTl0Jeqd/HBQEO9Z8Ifwyju3
 mXz9JCR9VlPCkX605keATbjtPGZf3XQtaXlQnezitDudXk8RJ33EpPcbhx76wX7M
 Rux37ByqEOzk4wMGX9YQyNU7z7xuVg4sJAa2LlJqYeKXHtym+u3gG7SGP5AsYjmk
 6mg2+9O2yZuLhQtOtrwm
 =s4wf
 -----END PGP SIGNATURE-----

Merge tag 'y2038-syscall-cleanup' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground into timers/2038

Pull preparatory work for y2038 changes from Arnd Bergmann:

System call unification and cleanup

The system call tables have diverged a bit over the years, and a number of
the recent additions never made it into all architectures, for one reason
or another.

This is an attempt to clean it up as far as we can without breaking
compatibility, doing a number of steps:

 - Add system calls that have not yet been integrated into all architectures
   but that we definitely want there. This includes {,f}statfs64() and
   get{eg,eu,g,p,u,pp}id() on alpha, which have been missing traditionally.

 - The s390 compat syscall handling is cleaned up to be more like what we
   do on other architectures, while keeping the 31-bit pointer
   extension. This was merged as a shared branch by the s390 maintainers
   and is included here in order to base the other patches on top.

 - Add the separate ipc syscalls on all architectures that traditionally
   only had sys_ipc(). This version is done without support for IPC_OLD
   that is we have in sys_ipc. The new semtimedop_time64 syscall will only
   be added here, not in sys_ipc

 - Add syscall numbers for a couple of syscalls that we probably don't need
   everywhere, in particular pkey_* and rseq, for the purpose of symmetry:
   if it's in asm-generic/unistd.h, it makes sense to have it everywhere. I
   expect that any future system calls will get assigned on all platforms
   together, even when they appear to be specific to a single architecture.

 - Prepare for having the same system call numbers for any future calls. In
   combination with the generated tables, this hopefully makes it easier to
   add new calls across all architectures together.

All of the above are technically separate from the y2038 work, but are done
as preparation before we add the new 64-bit time_t system calls everywhere,
providing a common baseline set of system calls.

I expect that glibc and other libraries that want to use 64-bit time_t will
require linux-5.1 kernel headers for building in the future, and at a much
later point may also require linux-5.1 or a later version as the minimum
kernel at runtime. Having a common baseline then allows the removal of many
architecture or kernel version specific workarounds.
2019-02-10 20:44:19 +01:00
YueHaibing
f8b11e089a s390: remove unused including <linux/version.h>
Remove including <linux/version.h> that don't need it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 11:57:09 +01:00
Gerald Schaefer
d4192437d7 s390: remove dead code
Remove some dead code from head64.S, which was left over since commit
da292bbe1f ("[S390] eliminate ipl_device from lowcore") removed
ipl_device from lowcore.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 11:56:57 +01:00
Gerald Schaefer
ea0ca93d6a s390/setup: remove obsolete #ifdef
The #ifdef CONFIG_DMA_API_DEBUG check in reserve_kernel() is no longer
needed, since commit ea535e418c ("dma-debug: switch check from _text
to _stext") changed the logic in lib/dma-debug.c, so remove it.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 11:56:55 +01:00
Arnd Bergmann
48166e6ea4 y2038: add 64-bit time_t syscalls to all 32-bit architectures
This adds 21 new system calls on each ABI that has 32-bit time_t
today. All of these have the exact same semantics as their existing
counterparts, and the new ones all have macro names that end in 'time64'
for clarification.

This gets us to the point of being able to safely use a C library
that has 64-bit time_t in user space. There are still a couple of
loose ends to tie up in various areas of the code, but this is the
big one, and should be entirely uncontroversial at this point.

In particular, there are four system calls (getitimer, setitimer,
waitid, and getrusage) that don't have a 64-bit counterpart yet,
but these can all be safely implemented in the C library by wrapping
around the existing system calls because the 32-bit time_t they
pass only counts elapsed time, not time since the epoch. They
will be dealt with later.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-07 00:13:28 +01:00
Arnd Bergmann
8dabe7245b y2038: syscalls: rename y2038 compat syscalls
A lot of system calls that pass a time_t somewhere have an implementation
using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have
been reworked so that this implementation can now be used on 32-bit
architectures as well.

The missing step is to redefine them using the regular SYSCALL_DEFINEx()
to get them out of the compat namespace and make it possible to build them
on 32-bit architectures.

Any system call that ends in 'time' gets a '32' suffix on its name for
that version, while the others get a '_time32' suffix, to distinguish
them from the normal version, which takes a 64-bit time argument in the
future.

In this step, only 64-bit architectures are changed, doing this rename
first lets us avoid touching the 32-bit architectures twice.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-02-07 00:13:27 +01:00
Greg Kroah-Hartman
7dd541a3fb s390: no need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-28 15:58:55 +01:00
Collin Walling
4ad78b8651 s390/setup: set control program code via diag 318
The s390x diagnose 318 instruction sets the control program name code (CPNC)
and control program version code (CPVC) to provide useful information
regarding the OS during debugging. The CPNC is explicitly set to 4 to
indicate a Linux/KVM environment.

The CPVC is a 7-byte value containing:

 - 3-byte Linux version code, currently set to 0
 - 3-byte unique value, currently set to 0
 - 1-byte trailing null

Signed-off-by: Collin Walling <walling@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <1544135405-22385-2-git-send-email-walling@linux.ibm.com>
[set version code to 0 until the structure is fully defined]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-28 15:58:53 +01:00
Martin Schwidefsky
634692ab70 s390/suspend: fix stack setup in swsusp_arch_suspend
The patch that added support for the virtually mapped kernel stacks changed
swsusp_arch_suspend to switch to the nodat-stack as the vmap stack is not
available while going in and out of suspend.

Unfortunately the switch to the nodat-stack is incorrect which breaks
suspend to disk.

Cc: stable@vger.kernel.org # v4.20
Fixes: ce3dc44749 ("s390: add support for virtually mapped kernel stacks")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-28 15:43:17 +01:00
Arnd Bergmann
b41c51c8e1 arch: add pkey and rseq syscall numbers everywhere
Most architectures define system call numbers for the rseq and pkey system
calls, even when they don't support the features, and perhaps never will.

Only a few architectures are missing these, so just define them anyway
for consistency. If we decide to add them later to one of these, the
system call numbers won't get out of sync then.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2019-01-25 17:22:50 +01:00
Arnd Bergmann
0d6040d468 arch: add split IPC system calls where needed
The IPC system call handling is highly inconsistent across architectures,
some use sys_ipc, some use separate calls, and some use both.  We also
have some architectures that require passing IPC_64 in the flags, and
others that set it implicitly.

For the addition of a y2038 safe semtimedop() system call, I chose to only
support the separate entry points, but that requires first supporting
the regular ones with their own syscall numbers.

The IPC_64 is now implied by the new semctl/shmctl/msgctl system
calls even on the architectures that require passing it with the ipc()
multiplexer.

I'm not adding the new semtimedop() or semop() on 32-bit architectures,
those will get implemented using the new semtimedop_time64() version
that gets added along with the other time64 calls.
Three 64-bit architectures (powerpc, s390 and sparc) get semtimedop().

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-01-25 17:22:50 +01:00
Arnd Bergmann
90856087da s390: remove compat_wrapper.c
Now that all these wrappers are automatically generated, we can
remove the entire file, and instead point to the regualar syscalls
like all other architectures do.

The 31-bit pointer extension is now handled in the __s390_sys_*()
wrappers.

Link: https://lore.kernel.org/lkml/20190116131527.2071570-6-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:33:20 +01:00
Arnd Bergmann
aa0d6e70d3 s390: autogenerate compat syscall wrappers
Any system call that takes a pointer argument on s390 requires
a wrapper function to do a 31-to-64 zero-extension, these are
currently generated in arch/s390/kernel/compat_wrapper.c.

On arm64 and x86, we already generate similar wrappers for all
system calls in the place of their definition, just for a different
purpose (they load the arguments from pt_regs).

We can do the same thing here, by adding an asm/syscall_wrapper.h
file with a copy of all the relevant macros to override the generic
version. Besides the addition of the compat entry point, these also
rename the entry points with a __s390_ or __s390x_ prefix, similar
to what we do on arm64 and x86. This in turn requires renaming
a few things, and adding a proper ni_syscall() entry point.

In order to still compile system call definitions that pass an
loff_t argument, the __SC_COMPAT_CAST() macro checks for that
and forces an -ENOSYS error, which was the best I could come up
with. Those functions must obviously not get called from user
space, but instead require hand-written compat_sys_*() handlers,
which fortunately already exist.

Link: https://lore.kernel.org/lkml/20190116131527.2071570-5-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[heiko.carstens@de.ibm.com: compile fix for !CONFIG_COMPAT]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:33:19 +01:00
Arnd Bergmann
fef747bab3 s390: use generic UID16 implementation
s390 has an almost identical copy of the code in kernel/uid16.c.

The problem here is that it requires calling the regular system calls,
which the generic implementation handles correctly, but the internal
interfaces are not declared in a global header for this.

The best way forward here seems to be to just use the generic code and
delete the s390 specific implementation.

I keep the changes to uapi/asm/posix_types.h inside of an #ifdef check
so user space does not observe any changes. As some of the system calls
pass pointers, we also need wrappers in compat_wrapper.c, which I add
for all calls with at least one argument. All those wrappers can be
removed in a later step.

Link: https://lore.kernel.org/lkml/20190116131527.2071570-4-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:33:18 +01:00
Arnd Bergmann
58fa4a410f ipc: introduce ksys_ipc()/compat_ksys_ipc() for s390
The sys_ipc() and compat_ksys_ipc() functions are meant to only
be used from the system call table, not called by another function.

Introduce ksys_*() interfaces for this purpose, as we have done
for many other system calls.

Link: https://lore.kernel.org/lkml/20190116131527.2071570-3-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[heiko.carstens@de.ibm.com: compile fix for !CONFIG_COMPAT]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:33:18 +01:00
Arnd Bergmann
1ecff5ef0a s390: open-code s390_personality syscall
Patch series "s390: rework compat wrapper generation".

As promised, I gave this a go and changed the SYSCALL_DEFINEx()
infrastructure to always include the wrappers for doing the
31-bit argument conversion on s390 compat mode.

This does three main things:

- The UID16 rework saved a lot of duplicated code, and would
  probably make sense by itself, but is also required as
  we can no longer call sys_*() functions directly after the
  last step.

- Removing the compat_wrapper.c file is of course the main
  goal here, in order to remove the need to maintain the
  compat_wrapper.c file when new system calls get added.
  Unfortunately, this requires adding some complexity in
  syscall_wrapper.h, and trades a small reduction in source
  code lines for a small increase in binary size for
  unused wrappers.

- As an added benefit, the use of syscall_wrapper.h now makes
  it easy to change the syscall wrappers so they no longer
  see all user space register contents, similar to changes
  done in commits fa697140f9 ("syscalls/x86: Use 'struct pt_regs'
  based syscall calling convention for 64-bit syscalls") and
  4378a7d4be ("arm64: implement syscall wrappers").
  I leave the actual implementation of this for you, if you
  want to do it later.

I did not test the changes at runtime, but I looked at the
generated object code, which seems fine here and includes
the same conversions as before.

This patch(of 5):

The sys_personality function is not meant to be called from other system
calls. We could introduce an intermediate ksys_personality function,
but it does almost nothing, so this just moves the implementation into
the caller.

Link: https://lore.kernel.org/lkml/20190116131527.2071570-1-arnd@arndb.de
Link: https://lore.kernel.org/lkml/20190116131527.2071570-2-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-18 09:33:17 +01:00
David Hildenbrand
60f1bf29c0 s390/smp: Fix calling smp_call_ipl_cpu() from ipl CPU
When calling smp_call_ipl_cpu() from the IPL CPU, we will try to read
from pcpu_devices->lowcore. However, due to prefixing, that will result
in reading from absolute address 0 on that CPU. We have to go via the
actual lowcore instead.

This means that right now, we will read lc->nodat_stack == 0 and
therfore work on a very wrong stack.

This BUG essentially broke rebooting under QEMU TCG (which will report
a low address protection exception). And checking under KVM, it is
also broken under KVM. With 1 VCPU it can be easily triggered.

:/# echo 1 > /proc/sys/kernel/sysrq
:/# echo b > /proc/sysrq-trigger
[   28.476745] sysrq: SysRq : Resetting
[   28.476793] Kernel stack overflow.
[   28.476817] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13
[   28.476820] Hardware name: IBM 2964 NE1 716 (KVM/Linux)
[   28.476826] Krnl PSW : 0400c00180000000 0000000000115c0c (pcpu_delegate+0x12c/0x140)
[   28.476861]            R:0 T:1 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[   28.476863] Krnl GPRS: ffffffffffffffff 0000000000000000 000000000010dff8 0000000000000000
[   28.476864]            0000000000000000 0000000000000000 0000000000ab7090 000003e0006efbf0
[   28.476864]            000000000010dff8 0000000000000000 0000000000000000 0000000000000000
[   28.476865]            000000007fffc000 0000000000730408 000003e0006efc58 0000000000000000
[   28.476887] Krnl Code: 0000000000115bfe: 4170f000            la      %r7,0(%r15)
[   28.476887]            0000000000115c02: 41f0a000            la      %r15,0(%r10)
[   28.476887]           #0000000000115c06: e370f0980024        stg     %r7,152(%r15)
[   28.476887]           >0000000000115c0c: c0e5fffff86e        brasl   %r14,114ce8
[   28.476887]            0000000000115c12: 41f07000            la      %r15,0(%r7)
[   28.476887]            0000000000115c16: a7f4ffa8            brc     15,115b66
[   28.476887]            0000000000115c1a: 0707                bcr     0,%r7
[   28.476887]            0000000000115c1c: 0707                bcr     0,%r7
[   28.476901] Call Trace:
[   28.476902] Last Breaking-Event-Address:
[   28.476920]  [<0000000000a01c4a>] arch_call_rest_init+0x22/0x80
[   28.476927] Kernel panic - not syncing: Corrupt kernel stack, can't continue.
[   28.476930] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13
[   28.476932] Hardware name: IBM 2964 NE1 716 (KVM/Linux)
[   28.476932] Call Trace:

Fixes: 2f859d0dad ("s390/smp: reduce size of struct pcpu")
Cc: stable@vger.kernel.org # 4.0+
Reported-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-11 17:12:03 +01:00
Vasily Gorbik
190f056fba s390/vdso: correct vdso mapping for compat tasks
While "s390/vdso: avoid 64-bit vdso mapping for compat tasks" fixed
64-bit vdso mapping for compat tasks under gdb it introduced another
problem. "compat_mm" flag is not inherited during fork and when
31-bit process forks a child (but does not perform exec) it ends up
with 64-bit vdso. To address that, init_new_context (which is called
during fork and exec) now initialize compat_mm based on thread TIF_31BIT
flag. Later compat_mm is adjusted in arch_setup_additional_pages, which
is called during exec.

Fixes: d1befa6582 ("s390/vdso: avoid 64-bit vdso mapping for compat tasks")
Reported-by: Stefan Liebler <stli@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@vger.kernel.org> # v4.20+
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-11 17:12:02 +01:00
Gerald Schaefer
b7cb707c37 s390/smp: fix CPU hotplug deadlock with CPU rescan
smp_rescan_cpus() is called without the device_hotplug_lock, which can lead
to a dedlock when a new CPU is found and immediately set online by a udev
rule.

This was observed on an older kernel version, where the cpu_hotplug_begin()
loop was still present, and it resulted in hanging chcpu and systemd-udev
processes. This specific deadlock will not show on current kernels. However,
there may be other possible deadlocks, and since smp_rescan_cpus() can still
trigger a CPU hotplug operation, the device_hotplug_lock should be held.

For reference, this was the deadlock with the old cpu_hotplug_begin() loop:

        chcpu (rescan)                       systemd-udevd

 echo 1 > /sys/../rescan
 -> smp_rescan_cpus()
 -> (*) get_online_cpus()
    (increases refcount)
 -> smp_add_present_cpu()
    (new CPU found)
 -> register_cpu()
 -> device_add()
 -> udev "add" event triggered -----------> udev rule sets CPU online
                                         -> echo 1 > /sys/.../online
                                         -> lock_device_hotplug_sysfs()
                                            (this is missing in rescan path)
                                         -> device_online()
                                         -> (**) device_lock(new CPU dev)
                                         -> cpu_up()
                                         -> cpu_hotplug_begin()
                                            (loops until refcount == 0)
                                            -> deadlock with (*)
 -> bus_probe_device()
 -> device_attach()
 -> device_lock(new CPU dev)
    -> deadlock with (**)

Fix this by taking the device_hotplug_lock in the CPU rescan path.

Cc: <stable@vger.kernel.org>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-11 17:12:02 +01:00
Christian Borntraeger
03aa047ef2 s390/early: improve machine detection
Right now the early machine detection code check stsi 3.2.2 for "KVM"
and set MACHINE_IS_VM if this is different. As the console detection
uses diagnose 8 if MACHINE_IS_VM returns true this will crash Linux
early for any non z/VM system that sets a different value than KVM.
So instead of assuming z/VM, do not set any of MACHINE_IS_LPAR,
MACHINE_IS_VM, or MACHINE_IS_KVM.

CC: stable@vger.kernel.org
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-01-11 17:12:02 +01:00
Masahiro Yamada
ba97df4558 kbuild: use assignment instead of define ... endef for filechk_* rules
You do not have to use define ... endef for filechk_* rules.

For simple cases, the use of assignment looks cleaner, IMHO.

I updated the usage for scripts/Kbuild.include in case somebody
misunderstands the 'define ... endif' is the requirement.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-01-06 10:22:35 +09:00
Masahiro Yamada
e9666d10a5 jump_label: move 'asm goto' support test to Kconfig
Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label".

The jump label is controlled by HAVE_JUMP_LABEL, which is defined
like this:

  #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
  # define HAVE_JUMP_LABEL
  #endif

We can improve this by testing 'asm goto' support in Kconfig, then
make JUMP_LABEL depend on CC_HAS_ASM_GOTO.

Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will
match to the real kernel capability.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
2019-01-06 09:46:51 +09:00
Linus Torvalds
d9a7fa67b4 Merge branch 'next-seccomp' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull seccomp updates from James Morris:

 - Add SECCOMP_RET_USER_NOTIF

 - seccomp fixes for sparse warnings and s390 build (Tycho)

* 'next-seccomp' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  seccomp, s390: fix build for syscall type change
  seccomp: fix poor type promotion
  samples: add an example of seccomp user trap
  seccomp: add a return code to trap to userspace
  seccomp: switch system call argument type to void *
  seccomp: hoist struct seccomp_data recalculation higher
2019-01-02 09:48:13 -08:00
Linus Torvalds
5694cecdb0 arm64 festive updates for 4.21
In the end, we ended up with quite a lot more than I expected:
 
 - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and
   kernel-side support to come later)
 
 - Support for per-thread stack canaries, pending an update to GCC that
   is currently undergoing review
 
 - Support for kexec_file_load(), which permits secure boot of a kexec
   payload but also happens to improve the performance of kexec
   dramatically because we can avoid the sucky purgatory code from
   userspace. Kdump will come later (requires updates to libfdt).
 
 - Optimisation of our dynamic CPU feature framework, so that all
   detected features are enabled via a single stop_machine() invocation
 
 - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that
   they can benefit from global TLB entries when KASLR is not in use
 
 - 52-bit virtual addressing for userspace (kernel remains 48-bit)
 
 - Patch in LSE atomics for per-cpu atomic operations
 
 - Custom preempt.h implementation to avoid unconditional calls to
   preempt_schedule() from preempt_enable()
 
 - Support for the new 'SB' Speculation Barrier instruction
 
 - Vectorised implementation of XOR checksumming and CRC32 optimisations
 
 - Workaround for Cortex-A76 erratum #1165522
 
 - Improved compatibility with Clang/LLD
 
 - Support for TX2 system PMUS for profiling the L3 cache and DMC
 
 - Reflect read-only permissions in the linear map by default
 
 - Ensure MMIO reads are ordered with subsequent calls to Xdelay()
 
 - Initial support for memory hotplug
 
 - Tweak the threshold when we invalidate the TLB by-ASID, so that
   mremap() performance is improved for ranges spanning multiple PMDs.
 
 - Minor refactoring and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJcE4TmAAoJELescNyEwWM0Nr0H/iaU7/wQSzHyNXtZoImyKTul
 Blu2ga4/EqUrTU7AVVfmkl/3NBILWlgQVpY6tH6EfXQuvnxqD7CizbHyLdyO+z0S
 B5PsFUH2GLMNAi48AUNqGqkgb2knFbg+T+9IimijDBkKg1G/KhQnRg6bXX32mLJv
 Une8oshUPBVJMsHN1AcQknzKariuoE3u0SgJ+eOZ9yA2ZwKxP4yy1SkDt3xQrtI0
 lojeRjxcyjTP1oGRNZC+BWUtGOT35p7y6cGTnBd/4TlqBGz5wVAJUcdoxnZ6JYVR
 O8+ob9zU+4I0+SKt80s7pTLqQiL9rxkKZ5joWK1pr1g9e0s5N5yoETXKFHgJYP8=
 =sYdt
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 festive updates from Will Deacon:
 "In the end, we ended up with quite a lot more than I expected:

   - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and
     kernel-side support to come later)

   - Support for per-thread stack canaries, pending an update to GCC
     that is currently undergoing review

   - Support for kexec_file_load(), which permits secure boot of a kexec
     payload but also happens to improve the performance of kexec
     dramatically because we can avoid the sucky purgatory code from
     userspace. Kdump will come later (requires updates to libfdt).

   - Optimisation of our dynamic CPU feature framework, so that all
     detected features are enabled via a single stop_machine()
     invocation

   - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that
     they can benefit from global TLB entries when KASLR is not in use

   - 52-bit virtual addressing for userspace (kernel remains 48-bit)

   - Patch in LSE atomics for per-cpu atomic operations

   - Custom preempt.h implementation to avoid unconditional calls to
     preempt_schedule() from preempt_enable()

   - Support for the new 'SB' Speculation Barrier instruction

   - Vectorised implementation of XOR checksumming and CRC32
     optimisations

   - Workaround for Cortex-A76 erratum #1165522

   - Improved compatibility with Clang/LLD

   - Support for TX2 system PMUS for profiling the L3 cache and DMC

   - Reflect read-only permissions in the linear map by default

   - Ensure MMIO reads are ordered with subsequent calls to Xdelay()

   - Initial support for memory hotplug

   - Tweak the threshold when we invalidate the TLB by-ASID, so that
     mremap() performance is improved for ranges spanning multiple PMDs.

   - Minor refactoring and cleanups"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (125 commits)
  arm64: kaslr: print PHYS_OFFSET in dump_kernel_offset()
  arm64: sysreg: Use _BITUL() when defining register bits
  arm64: cpufeature: Rework ptr auth hwcaps using multi_entry_cap_matches
  arm64: cpufeature: Reduce number of pointer auth CPU caps from 6 to 4
  arm64: docs: document pointer authentication
  arm64: ptr auth: Move per-thread keys from thread_info to thread_struct
  arm64: enable pointer authentication
  arm64: add prctl control for resetting ptrauth keys
  arm64: perf: strip PAC when unwinding userspace
  arm64: expose user PAC bit positions via ptrace
  arm64: add basic pointer authentication support
  arm64/cpufeature: detect pointer authentication
  arm64: Don't trap host pointer auth use to EL2
  arm64/kvm: hide ptrauth from guests
  arm64/kvm: consistently handle host HCR_EL2 flags
  arm64: add pointer authentication register bits
  arm64: add comments about EC exception levels
  arm64: perf: Treat EXCLUDE_EL* bit definitions as unsigned
  arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field
  arm64: enable per-task stack canaries
  ...
2018-12-25 17:41:56 -08:00
Tycho Andersen
4fc96ee908 seccomp, s390: fix build for syscall type change
A recent patch landed in the security tree [1] that changed the type of the
seccomp syscall. Unfortunately, I didn't quite get every instance of the
forward declarations, and thus there is a build failure. Here's the last
one that I could find, for s390. It should go through the security tree,
although hopefully some s390 people can check and make sure it looks
reasonable?

The only oddity is the trailing semicolon; some lines around this patch
have it, and some lines don't. I've left this one as-is.

[1]: https://lore.kernel.org/lkml/20181212231630.GA31584@beast/T/#u

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Fixes: 6a21cc50f0 ("seccomp: add a return code to trap to userspace")
Signed-off-by: Kees Cook <keescook@chromium.org>
2018-12-13 16:51:01 -08:00
AKASHI Takahiro
b6664ba42f s390, kexec_file: drop arch_kexec_mem_walk()
Since s390 already knows where to locate buffers, calling
arch_kexec_mem_walk() has no sense. So we can just drop it as kbuf->mem
indicates this while all other architectures sets it to 0 initially.

This change is a preparatory work for the next patch, where all the
variant memory walks, either on system resource or memblock, will be
put in one common place so that it will satisfy all the architectures'
need.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 14:38:49 +00:00
Linus Torvalds
0f1f692375 While rewriting the function graph tracer, I discovered a design flaw that
was introduced by a patch that tried to fix one bug, but by doing so created
 another bug. As both bugs corrupt the output (but they do not crash the
 kernel), I decided to fix the design such that it could have both bugs
 fixed. The original fix, fixed time reporting of the function graph tracer
 when doing a max_depth of one. This was code that can test how much the
 kernel interferes with userspace. But in doing so, it could corrupt the time
 keeping of the function profiler.
 
 The issue is that the curr_ret_stack variable was being used for two
 different meanings. One was to keep track of the stack pointer on the
 ret_stack (shadow stack used by the function graph tracer), and the other
 use case was the graph call depth.  Although, the two may be closely
 related, where they got updated was the issue that lead to the two different
 bugs that required the two use cases to be updated differently.
 
 The big issue with this fix is that it requires changing each architecture.
 The good news is, I was able to remove a lot of code that was duplicated
 within the architectures and place it into a single location. Then I could
 make the fix in one place.
 
 I pushed this code into linux-next to let it settle over a week, and before
 doing so, I cross compiled all the affected architectures to make sure that
 they built fine.
 
 In the mean time, I also pulled in a patch that fixes the sched_switch
 previous tasks state output, that was not actually correct.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCW/4NPhQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qnWAAQCyUIRLgYImr81eTl52lxNRsULk+aiI
 U29kRFWWU0c40AEA1X9sDF0MgOItbRGfZtnHTZEousXRDaDf4Fge2kF7Egg=
 =liQ0
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "While rewriting the function graph tracer, I discovered a design flaw
  that was introduced by a patch that tried to fix one bug, but by doing
  so created another bug.

  As both bugs corrupt the output (but they do not crash the kernel), I
  decided to fix the design such that it could have both bugs fixed. The
  original fix, fixed time reporting of the function graph tracer when
  doing a max_depth of one. This was code that can test how much the
  kernel interferes with userspace. But in doing so, it could corrupt
  the time keeping of the function profiler.

  The issue is that the curr_ret_stack variable was being used for two
  different meanings. One was to keep track of the stack pointer on the
  ret_stack (shadow stack used by the function graph tracer), and the
  other use case was the graph call depth. Although, the two may be
  closely related, where they got updated was the issue that lead to the
  two different bugs that required the two use cases to be updated
  differently.

  The big issue with this fix is that it requires changing each
  architecture. The good news is, I was able to remove a lot of code
  that was duplicated within the architectures and place it into a
  single location. Then I could make the fix in one place.

  I pushed this code into linux-next to let it settle over a week, and
  before doing so, I cross compiled all the affected architectures to
  make sure that they built fine.

  In the mean time, I also pulled in a patch that fixes the sched_switch
  previous tasks state output, that was not actually correct"

* tag 'trace-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  sched, trace: Fix prev_state output in sched_switch tracepoint
  function_graph: Have profiler use curr_ret_stack and not depth
  function_graph: Reverse the order of pushing the ret_stack and the callback
  function_graph: Move return callback before update of curr_ret_stack
  function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack
  function_graph: Make ftrace_push_return_trace() static
  sparc/function_graph: Simplify with function_graph_enter()
  sh/function_graph: Simplify with function_graph_enter()
  s390/function_graph: Simplify with function_graph_enter()
  riscv/function_graph: Simplify with function_graph_enter()
  powerpc/function_graph: Simplify with function_graph_enter()
  parisc: function_graph: Simplify with function_graph_enter()
  nds32: function_graph: Simplify with function_graph_enter()
  MIPS: function_graph: Simplify with function_graph_enter()
  microblaze: function_graph: Simplify with function_graph_enter()
  arm64: function_graph: Simplify with function_graph_enter()
  ARM: function_graph: Simplify with function_graph_enter()
  x86/function_graph: Simplify with function_graph_enter()
  function_graph: Create function_graph_enter() to consolidate architecture code
2018-11-30 09:32:34 -08:00
Steven Rostedt (VMware)
18588e1487 s390/function_graph: Simplify with function_graph_enter()
The function_graph_enter() function does the work of calling the function
graph hook function and the management of the shadow stack, simplifying the
work done in the architecture dependent prepare_ftrace_return().

Have s390 use the new code, and remove the shadow stack management as well as
having to set up the trace structure.

This is needed to prepare for a fix of a design bug on how the curr_ret_stack
is used.

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Julian Wiedmann <jwi@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: stable@kernel.org
Fixes: 03274a3ffb ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-11-27 20:31:30 -05:00
Thomas Richter
613a41b0d1 s390/cpum_cf: Reject request for sampling in event initialization
On s390 command perf top fails
[root@s35lp76 perf] # ./perf top -F100000  --stdio
   Error:
   cycles: PMU Hardware doesn't support sampling/overflow-interrupts.
   	Try 'perf stat'
[root@s35lp76 perf] #

Using event -e rb0000 works as designed.  Event rb0000 is the event
number of the sampling facility for basic sampling.

During system start up the following PMUs are installed in the kernel's
PMU list (from head to tail):
   cpum_cf --> s390 PMU counter facility device driver
   cpum_sf --> s390 PMU sampling facility device driver
   uprobe
   kprobe
   tracepoint
   task_clock
   cpu_clock

Perf top executes following functions and calls perf_event_open(2) system
call with different parameters many times:

cmd_top
--> __cmd_top
    --> perf_evlist__add_default
        --> __perf_evlist__add_default
            --> perf_evlist__new_cycles (creates event type:0 (HW)
			    		config 0 (CPU_CYCLES)
	        --> perf_event_attr__set_max_precise_ip
		    Uses perf_event_open(2) to detect correct
		    precise_ip level. Fails 3 times on s390 which is ok.

Then functions cmd_top
--> __cmd_top
    --> perf_top__start_counters
        -->perf_evlist__config
	   --> perf_can_comm_exec
               --> perf_probe_api
	           This functions test support for the following events:
		   "cycles:u", "instructions:u", "cpu-clock:u" using
		   --> perf_do_probe_api
		       --> perf_event_open_cloexec
		           Test the close on exec flag support with
			   perf_event_open(2).
	               perf_do_probe_api returns true if the event is
		       supported.
		       The function returns true because event cpu-clock is
		       supported by the PMU cpu_clock.
	               This is achieved by many calls to perf_event_open(2).

Function perf_top__start_counters now calls perf_evsel__open() for every
event, which is the default event cpu_cycles (config:0) and type HARDWARE
(type:0) which a predfined frequence of 4000.

Given the above order of the PMU list, the PMU cpum_cf gets called first
and returns 0, which indicates support for this sampling. The event is
fully allocated in the function perf_event_open (file kernel/event/core.c
near line 10521 and the following check fails:

        event = perf_event_alloc(&attr, cpu, task, group_leader, NULL,
		                 NULL, NULL, cgroup_fd);
	if (IS_ERR(event)) {
		err = PTR_ERR(event);
		goto err_cred;
	}

        if (is_sampling_event(event)) {
		if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) {
			err = -EOPNOTSUPP;
			goto err_alloc;
		}
	}

The check for the interrupt capabilities fails and the system call
perf_event_open() returns -EOPNOTSUPP (-95).

Add a check to return -ENODEV when sampling is requested in PMU cpum_cf.
This allows common kernel code in the perf_event_open() system call to
test the next PMU in above list.

Fixes: 97b1198fec (" "s390, perf: Use common PMU interrupt disabled code")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-14 13:44:55 +01:00
Linus Torvalds
3541833fd1 s390 updates for 4.20-rc2
- A fix for the pgtable_bytes misaccounting on s390. The patch changes
    common code part in regard to page table folding and adds extra
    checks to mm_[inc|dec]_nr_[pmds|puds].
 
  - Add FORCE for all build targets using if_changed
 
  - Use non-loadable phdr for the .vmlinux.info section to avoid
    a segment overlap that confuses kexec
 
  - Cleanup the attribute definition for the diagnostic sampling
 
  - Increase stack size for CONFIG_KASAN=y builds
 
  - Export __node_distance to fix a build error
 
  - Correct return code of a PMU event init function
 
  - An update for the default configs
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJb5TIMAAoJEDjwexyKj9rgIH8H/0daZTyxcLwY9gbigaq1Qs4R
 /ScmAJJc2U/Qj8b9UskhsmHAUuAufF2oljU16SquP7CBGhtkLRrjPtdh1AMiiZGM
 reVF7X5LU8MH0QUoNnKPWAL4DD1q2E99IAEH5TeGIODUG6srqvIHBNtXDWNLPtBf
 fpOhJ/NssgxyuYUXi/WnoEjIyP8KABeG6SlpcLzYbmY1hUOIXcixuv39UrL0G5OO
 P8ciL+W5rTcPZCnpJ1Xk9hKploT8gWXhMT5QhNnakgMF/25v80+TZy5xRZMuLAmQ
 T5SFP6B71o05nLK7fLi3VAIKPv/QibjiyJOEf9uUHdo1XZcD5uRu0EQ/LklLUBU=
 =4H06
 -----END PGP SIGNATURE-----

Merge tag 's390-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Martin Schwidefsky:

 - A fix for the pgtable_bytes misaccounting on s390. The patch changes
   common code part in regard to page table folding and adds extra
   checks to mm_[inc|dec]_nr_[pmds|puds].

 - Add FORCE for all build targets using if_changed

 - Use non-loadable phdr for the .vmlinux.info section to avoid a
   segment overlap that confuses kexec

 - Cleanup the attribute definition for the diagnostic sampling

 - Increase stack size for CONFIG_KASAN=y builds

 - Export __node_distance to fix a build error

 - Correct return code of a PMU event init function

 - An update for the default configs

* tag 's390-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/perf: Change CPUM_CF return code in event init function
  s390: update defconfigs
  s390/mm: Fix ERROR: "__node_distance" undefined!
  s390/kasan: increase instrumented stack size to 64k
  s390/cpum_sf: Rework attribute definition for diagnostic sampling
  s390/mm: fix mis-accounting of pgtable_bytes
  mm: add mm_pxd_folded checks to pgtable_bytes accounting functions
  mm: introduce mm_[p4d|pud|pmd]_folded
  mm: make the __PAGETABLE_PxD_FOLDED defines non-empty
  s390: avoid vmlinux segments overlap
  s390/vdso: add missing FORCE to build targets
  s390/decompressor: add missing FORCE to build targets
2018-11-09 06:30:44 -06:00
Thomas Richter
0bb2ae1b26 s390/perf: Change CPUM_CF return code in event init function
The function perf_init_event() creates a new event and
assignes it to a PMU. This a done in a loop over all existing
PMUs. For each listed PMU the event init function is called
and if this function does return any other error than -ENOENT,
the loop is terminated the creation of the event fails.

If the event is invalid, return -ENOENT to try other PMUs.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-08 07:58:16 +01:00
Vasily Gorbik
9fed920e68 s390/kasan: increase instrumented stack size to 64k
Increase kasan instrumented kernel stack size from 32k to 64k. Other
architectures seems to get away with just doubling kernel stack size under
kasan, but on s390 this appears to be not enough due to bigger frame size.
The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE
vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting
stack overflow is fs sync on xfs filesystem:

 #0 [9a0681e8]  704 bytes  check_usage at 34b1fc
 #1 [9a0684a8]  432 bytes  check_usage at 34c710
 #2 [9a068658]  1048 bytes  validate_chain at 35044a
 #3 [9a068a70]  312 bytes  __lock_acquire at 3559fe
 #4 [9a068ba8]  440 bytes  lock_acquire at 3576ee
 #5 [9a068d60]  104 bytes  _raw_spin_lock at 21b44e0
 #6 [9a068dc8]  1992 bytes  enqueue_entity at 2dbf72
 #7 [9a069590]  1496 bytes  enqueue_task_fair at 2df5f0
 #8 [9a069b68]  64 bytes  ttwu_do_activate at 28f438
 #9 [9a069ba8]  552 bytes  try_to_wake_up at 298c4c
 #10 [9a069dd0]  168 bytes  wake_up_worker at 23f97c
 #11 [9a069e78]  200 bytes  insert_work at 23fc2e
 #12 [9a069f40]  648 bytes  __queue_work at 2487c0
 #13 [9a06a1c8]  200 bytes  __queue_delayed_work at 24db28
 #14 [9a06a290]  248 bytes  mod_delayed_work_on at 24de84
 #15 [9a06a388]  24 bytes  kblockd_mod_delayed_work_on at 153e2a0
 #16 [9a06a3a0]  288 bytes  __blk_mq_delay_run_hw_queue at 158168c
 #17 [9a06a4c0]  192 bytes  blk_mq_run_hw_queue at 1581a3c
 #18 [9a06a580]  184 bytes  blk_mq_sched_insert_requests at 15a2192
 #19 [9a06a638]  1024 bytes  blk_mq_flush_plug_list at 1590f3a
 #20 [9a06aa38]  704 bytes  blk_flush_plug_list at 1555028
 #21 [9a06acf8]  320 bytes  schedule at 219e476
 #22 [9a06ae38]  760 bytes  schedule_timeout at 21b0aac
 #23 [9a06b130]  408 bytes  wait_for_common at 21a1706
 #24 [9a06b2c8]  360 bytes  xfs_buf_iowait at fa1540
 #25 [9a06b430]  256 bytes  __xfs_buf_submit at fadae6
 #26 [9a06b530]  264 bytes  xfs_buf_read_map at fae3f6
 #27 [9a06b638]  656 bytes  xfs_trans_read_buf_map at 10ac9a8
 #28 [9a06b8c8]  304 bytes  xfs_btree_kill_root at e72426
 #29 [9a06b9f8]  288 bytes  xfs_btree_lookup_get_block at e7bc5e
 #30 [9a06bb18]  624 bytes  xfs_btree_lookup at e7e1a6
 #31 [9a06bd88]  2664 bytes  xfs_alloc_ag_vextent_near at dfa070
 #32 [9a06c7f0]  144 bytes  xfs_alloc_ag_vextent at dff3ca
 #33 [9a06c880]  1128 bytes  xfs_alloc_vextent at e05fce
 #34 [9a06cce8]  584 bytes  xfs_bmap_btalloc at e58342
 #35 [9a06cf30]  1336 bytes  xfs_bmapi_write at e618de
 #36 [9a06d468]  776 bytes  xfs_iomap_write_allocate at ff678e
 #37 [9a06d770]  720 bytes  xfs_map_blocks at f82af8
 #38 [9a06da40]  928 bytes  xfs_writepage_map at f83cd6
 #39 [9a06dde0]  320 bytes  xfs_do_writepage at f85872
 #40 [9a06df20]  1320 bytes  write_cache_pages at 73dfe8
 #41 [9a06e448]  208 bytes  xfs_vm_writepages at f7f892
 #42 [9a06e518]  88 bytes  do_writepages at 73fe6a
 #43 [9a06e570]  872 bytes  __writeback_single_inode at a20cb6
 #44 [9a06e8d8]  664 bytes  writeback_sb_inodes at a23be2
 #45 [9a06eb70]  296 bytes  __writeback_inodes_wb at a242e0
 #46 [9a06ec98]  928 bytes  wb_writeback at a2500e
 #47 [9a06f038]  848 bytes  wb_do_writeback at a260ae
 #48 [9a06f388]  536 bytes  wb_workfn at a28228
 #49 [9a06f5a0]  1088 bytes  process_one_work at 24a234
 #50 [9a06f9e0]  1120 bytes  worker_thread at 24ba26
 #51 [9a06fe40]  104 bytes  kthread at 26545a
 #52 [9a06fea8]             kernel_thread_starter at 21b6b62

To be able to increase the stack size to 64k reuse LLILL instruction
in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE
(65192) value as unsigned.

Reported-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-11-02 08:31:57 +01:00