linux_dsm_epyc7002/arch/s390/include/asm/uaccess.h
Martin Schwidefsky 0aaba41b58 s390: remove all code using the access register mode
The vdso code for the getcpu() and the clock_gettime() call use the access
register mode to access the per-CPU vdso data page with the current code.

An alternative to the complicated AR mode is to use the secondary space
mode. This makes the vdso faster and quite a bit simpler. The downside is
that the uaccess code has to be changed quite a bit.

Which instructions are used depends on the machine and what kind of uaccess
operation is requested. The instruction dictates which ASCE value needs
to be loaded into %cr1 and %cr7.

The different cases:

* User copy with MVCOS for z10 and newer machines
  The MVCOS instruction can copy between the primary space (aka user) and
  the home space (aka kernel) directly. For set_fs(KERNEL_DS) the kernel
  ASCE is loaded into %cr1. For set_fs(USER_DS) the user space is already
  loaded in %cr1.

* User copy with MVCP/MVCS for older machines
  To be able to execute the MVCP/MVCS instructions the kernel needs to
  switch to primary mode. The control register %cr1 has to be set to the
  kernel ASCE and %cr7 to either the kernel ASCE or the user ASCE dependent
  on set_fs(KERNEL_DS) vs set_fs(USER_DS).

* Data access in the user address space for strnlen / futex
  To use "normal" instruction with data from the user address space the
  secondary space mode is used. The kernel needs to switch to primary mode,
  %cr1 has to contain the kernel ASCE and %cr7 either the user ASCE or the
  kernel ASCE, dependent on set_fs.

To load a new value into %cr1 or %cr7 is an expensive operation, the kernel
tries to be lazy about it. E.g. for multiple user copies in a row with
MVCP/MVCS the replacement of the vdso ASCE in %cr7 with the user ASCE is
done only once. On return to user space a CPU bit is checked that loads the
vdso ASCE again.

To enable and disable the data access via the secondary space two new
functions are added, enable_sacf_uaccess and disable_sacf_uaccess. The fact
that a context is in secondary space uaccess mode is stored in the
mm_segment_t value for the task. The code of an interrupt may use set_fs
as long as it returns to the previous state it got with get_fs with another
call to set_fs. The code in finish_arch_post_lock_switch simply has to do a
set_fs with the current mm_segment_t value for the task.

For CPUs with MVCOS:

CPU running in                        | %cr1 ASCE | %cr7 ASCE |
--------------------------------------|-----------|-----------|
user space                            |  user     |  vdso     |
kernel, USER_DS, normal-mode          |  user     |  vdso     |
kernel, USER_DS, normal-mode, lazy    |  user     |  user     |
kernel, USER_DS, sacf-mode            |  kernel   |  user     |
kernel, KERNEL_DS, normal-mode        |  kernel   |  vdso     |
kernel, KERNEL_DS, normal-mode, lazy  |  kernel   |  kernel   |
kernel, KERNEL_DS, sacf-mode          |  kernel   |  kernel   |

For CPUs without MVCOS:

CPU running in                        | %cr1 ASCE | %cr7 ASCE |
--------------------------------------|-----------|-----------|
user space                            |  user     |  vdso     |
kernel, USER_DS, normal-mode          |  user     |  vdso     |
kernel, USER_DS, normal-mode lazy     |  kernel   |  user     |
kernel, USER_DS, sacf-mode            |  kernel   |  user     |
kernel, KERNEL_DS, normal-mode        |  kernel   |  vdso     |
kernel, KERNEL_DS, normal-mode, lazy  |  kernel   |  kernel   |
kernel, KERNEL_DS, sacf-mode          |  kernel   |  kernel   |

The lines with "lazy" refer to the state after a copy via the secondary
space with a delayed reload of %cr1 and %cr7.

There are three hardware address spaces that can cause a DAT exception,
primary, secondary and home space. The exception can be related to
four different fault types: user space fault, vdso fault, kernel fault,
and the gmap faults.

Dependent on the set_fs state and normal vs. sacf mode there are a number
of fault combinations:

1) user address space fault via the primary ASCE
2) gmap address space fault via the primary ASCE
3) kernel address space fault via the primary ASCE for machines with
   MVCOS and set_fs(KERNEL_DS)
4) vdso address space faults via the secondary ASCE with an invalid
   address while running in secondary space in problem state
5) user address space fault via the secondary ASCE for user-copy
   based on the secondary space mode, e.g. futex_ops or strnlen_user
6) kernel address space fault via the secondary ASCE for user-copy
   with secondary space mode with set_fs(KERNEL_DS)
7) kernel address space fault via the primary ASCE for user-copy
   with secondary space mode with set_fs(USER_DS) on machines without
   MVCOS.
8) kernel address space fault via the home space ASCE

Replace user_space_fault() with a new function get_fault_type() that
can distinguish all four different fault types.

With these changes the futex atomic ops from the kernel and the
strnlen_user will get a little bit slower, as well as the old style
uaccess with MVCP/MVCS. All user accesses based on MVCOS will be as
fast as before. On the positive side, the user space vdso code is a
lot faster and Linux ceases to use the complicated AR mode.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2017-11-14 11:01:47 +01:00

281 lines
6.5 KiB
C

/* SPDX-License-Identifier: GPL-2.0 */
/*
* S390 version
* Copyright IBM Corp. 1999, 2000
* Author(s): Hartmut Penner (hp@de.ibm.com),
* Martin Schwidefsky (schwidefsky@de.ibm.com)
*
* Derived from "include/asm-i386/uaccess.h"
*/
#ifndef __S390_UACCESS_H
#define __S390_UACCESS_H
/*
* User space memory access functions
*/
#include <asm/processor.h>
#include <asm/ctl_reg.h>
#include <asm/extable.h>
#include <asm/facility.h>
/*
* The fs value determines whether argument validity checking should be
* performed or not. If get_fs() == USER_DS, checking is performed, with
* get_fs() == KERNEL_DS, checking is bypassed.
*
* For historical reasons, these macros are grossly misnamed.
*/
#define KERNEL_DS (0)
#define KERNEL_DS_SACF (1)
#define USER_DS (2)
#define USER_DS_SACF (3)
#define get_ds() (KERNEL_DS)
#define get_fs() (current->thread.mm_segment)
#define segment_eq(a,b) (((a) & 2) == ((b) & 2))
void set_fs(mm_segment_t fs);
static inline int __range_ok(unsigned long addr, unsigned long size)
{
return 1;
}
#define __access_ok(addr, size) \
({ \
__chk_user_ptr(addr); \
__range_ok((unsigned long)(addr), (size)); \
})
#define access_ok(type, addr, size) __access_ok(addr, size)
unsigned long __must_check
raw_copy_from_user(void *to, const void __user *from, unsigned long n);
unsigned long __must_check
raw_copy_to_user(void __user *to, const void *from, unsigned long n);
#define INLINE_COPY_FROM_USER
#define INLINE_COPY_TO_USER
#ifdef CONFIG_HAVE_MARCH_Z10_FEATURES
#define __put_get_user_asm(to, from, size, spec) \
({ \
register unsigned long __reg0 asm("0") = spec; \
int __rc; \
\
asm volatile( \
"0: mvcos %1,%3,%2\n" \
"1: xr %0,%0\n" \
"2:\n" \
".pushsection .fixup, \"ax\"\n" \
"3: lhi %0,%5\n" \
" jg 2b\n" \
".popsection\n" \
EX_TABLE(0b,3b) EX_TABLE(1b,3b) \
: "=d" (__rc), "+Q" (*(to)) \
: "d" (size), "Q" (*(from)), \
"d" (__reg0), "K" (-EFAULT) \
: "cc"); \
__rc; \
})
static inline int __put_user_fn(void *x, void __user *ptr, unsigned long size)
{
unsigned long spec = 0x010000UL;
int rc;
switch (size) {
case 1:
rc = __put_get_user_asm((unsigned char __user *)ptr,
(unsigned char *)x,
size, spec);
break;
case 2:
rc = __put_get_user_asm((unsigned short __user *)ptr,
(unsigned short *)x,
size, spec);
break;
case 4:
rc = __put_get_user_asm((unsigned int __user *)ptr,
(unsigned int *)x,
size, spec);
break;
case 8:
rc = __put_get_user_asm((unsigned long __user *)ptr,
(unsigned long *)x,
size, spec);
break;
}
return rc;
}
static inline int __get_user_fn(void *x, const void __user *ptr, unsigned long size)
{
unsigned long spec = 0x01UL;
int rc;
switch (size) {
case 1:
rc = __put_get_user_asm((unsigned char *)x,
(unsigned char __user *)ptr,
size, spec);
break;
case 2:
rc = __put_get_user_asm((unsigned short *)x,
(unsigned short __user *)ptr,
size, spec);
break;
case 4:
rc = __put_get_user_asm((unsigned int *)x,
(unsigned int __user *)ptr,
size, spec);
break;
case 8:
rc = __put_get_user_asm((unsigned long *)x,
(unsigned long __user *)ptr,
size, spec);
break;
}
return rc;
}
#else /* CONFIG_HAVE_MARCH_Z10_FEATURES */
static inline int __put_user_fn(void *x, void __user *ptr, unsigned long size)
{
size = raw_copy_to_user(ptr, x, size);
return size ? -EFAULT : 0;
}
static inline int __get_user_fn(void *x, const void __user *ptr, unsigned long size)
{
size = raw_copy_from_user(x, ptr, size);
return size ? -EFAULT : 0;
}
#endif /* CONFIG_HAVE_MARCH_Z10_FEATURES */
/*
* These are the main single-value transfer routines. They automatically
* use the right size if we just have the right pointer type.
*/
#define __put_user(x, ptr) \
({ \
__typeof__(*(ptr)) __x = (x); \
int __pu_err = -EFAULT; \
__chk_user_ptr(ptr); \
switch (sizeof (*(ptr))) { \
case 1: \
case 2: \
case 4: \
case 8: \
__pu_err = __put_user_fn(&__x, ptr, \
sizeof(*(ptr))); \
break; \
default: \
__put_user_bad(); \
break; \
} \
__builtin_expect(__pu_err, 0); \
})
#define put_user(x, ptr) \
({ \
might_fault(); \
__put_user(x, ptr); \
})
int __put_user_bad(void) __attribute__((noreturn));
#define __get_user(x, ptr) \
({ \
int __gu_err = -EFAULT; \
__chk_user_ptr(ptr); \
switch (sizeof(*(ptr))) { \
case 1: { \
unsigned char __x = 0; \
__gu_err = __get_user_fn(&__x, ptr, \
sizeof(*(ptr))); \
(x) = *(__force __typeof__(*(ptr)) *) &__x; \
break; \
}; \
case 2: { \
unsigned short __x = 0; \
__gu_err = __get_user_fn(&__x, ptr, \
sizeof(*(ptr))); \
(x) = *(__force __typeof__(*(ptr)) *) &__x; \
break; \
}; \
case 4: { \
unsigned int __x = 0; \
__gu_err = __get_user_fn(&__x, ptr, \
sizeof(*(ptr))); \
(x) = *(__force __typeof__(*(ptr)) *) &__x; \
break; \
}; \
case 8: { \
unsigned long long __x = 0; \
__gu_err = __get_user_fn(&__x, ptr, \
sizeof(*(ptr))); \
(x) = *(__force __typeof__(*(ptr)) *) &__x; \
break; \
}; \
default: \
__get_user_bad(); \
break; \
} \
__builtin_expect(__gu_err, 0); \
})
#define get_user(x, ptr) \
({ \
might_fault(); \
__get_user(x, ptr); \
})
int __get_user_bad(void) __attribute__((noreturn));
unsigned long __must_check
raw_copy_in_user(void __user *to, const void __user *from, unsigned long n);
/*
* Copy a null terminated string from userspace.
*/
long __strncpy_from_user(char *dst, const char __user *src, long count);
static inline long __must_check
strncpy_from_user(char *dst, const char __user *src, long count)
{
might_fault();
return __strncpy_from_user(dst, src, count);
}
unsigned long __must_check __strnlen_user(const char __user *src, unsigned long count);
static inline unsigned long strnlen_user(const char __user *src, unsigned long n)
{
might_fault();
return __strnlen_user(src, n);
}
/*
* Zero Userspace
*/
unsigned long __must_check __clear_user(void __user *to, unsigned long size);
static inline unsigned long __must_check clear_user(void __user *to, unsigned long n)
{
might_fault();
return __clear_user(to, n);
}
int copy_to_user_real(void __user *dest, void *src, unsigned long count);
void s390_kernel_write(void *dst, const void *src, size_t size);
#endif /* __S390_UACCESS_H */